Towards an Annotation Standard for STEM Documents: Datasets, Benchmarks, and Spotters

Schaefer JF, Kohlhase M (2023)


Publication Type: Conference contribution

Publication year: 2023

Publisher: Springer Science and Business Media Deutschland GmbH

Book Volume: 14101 LNAI

Page Range: 190-205

Conference Proceedings Title: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Event location: Cambridge, GBR

ISBN: 9783031427527

DOI: 10.1007/978-3-031-42753-4_13

Abstract

When publishing papers, researchers in mathematics and related disciplines typically focus on the presentation, i.e. type-setting, of their ideas and provide little semantic information. This impedes the development of services that benefit from semantic information, such as semantic search and screen readers for vision-impaired researchers. As a remedy, there have been attempts to infer semantic data from already published papers using small programs that we call spotters. Unfortunately, there is no standardized format for semantic annotations and spotter authors typically invent their own format. This leads to two problems: i) there is no ecosystem of tools for common tasks like the visualization of results or the manual annotation of a gold standard, and ii) re-using, evaluating and combining results becomes very difficult. In this paper, we address these issues by describing a standardized, flexible way to represent semantic annotations, using semantic web technologies and, in particular, the Web Annotation standard. Furthermore, we describe SpotterBase, a set of tools to help with processing the annotations and creating new ones.

How to cite

APA:

Schaefer, J. F., & Kohlhase, M. (2023). Towards an Annotation Standard for STEM Documents: Datasets, Benchmarks, and Spotters. In C. Dubois & M. Kerber (Eds.), Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (pp. 190-205). Cambridge, GBR: Springer Science and Business Media Deutschland GmbH.

MLA:

Schaefer, Jan Frederik, and Michael Kohlhase. "Towards an Annotation Standard for STEM Documents: Datasets, Benchmarks, and Spotters." Proceedings of the 16th Conference on Intelligent Computer Mathematics, CICM 2023, Cambridge, GBR. Ed. Catherine Dubois and Manfred Kerber. Springer Science and Business Media Deutschland GmbH, 2023. 190-205.
