Efficient Dependency Graph Matching with the IMS Open Corpus Workbench

Conference contribution
(Conference paper)


Publication details

Authors: Proisl T, Uhrig P
Editors: Calzolari Nicoletta, Choukri Khalid, Declerck Thierry, Doğan Mehmet Uğur, Maegaard Bente, Mariani Joseph, Moreno Asuncion, Odijk Jan, Piperidis Stelios
Publisher: European Language Resources Association (ELRA)
Place of publication: Istanbul
Year of publication: 2012
Conference proceedings: Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC'12)
Page range: 2750–2756
ISBN: 978-2-9517408-7-7
Language: English


Abstract


State-of-the-art dependency representations such as the Stanford Typed Dependencies may represent the grammatical relations in a sentence as directed, possibly cyclic graphs. Querying a syntactically annotated corpus for grammatical structures that are represented as graphs requires graph matching, which is a non-trivial task. In this paper, we present an algorithm for graph matching that is tailored to the properties of large, syntactically annotated corpora. The implementation of the algorithm is built on top of the popular IMS Open Corpus Workbench, allowing corpus linguists to re-use existing infrastructure. An evaluation of the resulting software, CWB-treebank, shows that its performance in real-world applications, such as a web query interface, compares favourably to implementations that rely on a relational database or a dedicated graph database, while at the same time offering greater expressive power for queries. An intuitive graphical interface for building the query graphs is available via the Treebank.info project.


FAU authors / FAU editors

Proisl, Thomas
Lehrstuhl für Anglistik, insbesondere Linguistik
Uhrig, Peter Dr.
Lehrstuhl für Anglistik, insbesondere Linguistik


Research areas

Corpus tools and language technology applications
Lehrstuhl für Korpus- und Computerlinguistik


How to cite

APA:
Proisl, T., & Uhrig, P. (2012). Efficient Dependency Graph Matching with the IMS Open Corpus Workbench. In Calzolari Nicoletta, Choukri Khalid, Declerck Thierry, Doğan Mehmet Uğur, Maegaard Bente, Mariani Joseph, Moreno Asuncion, Odijk Jan, Piperidis Stelios (Eds.), Proceedings (pp. 2750–2756). Istanbul, TR: European Language Resources Association (ELRA).

MLA:
Proisl, Thomas, and Peter Uhrig. "Efficient Dependency Graph Matching with the IMS Open Corpus Workbench." Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC'12), Istanbul. Ed. Calzolari Nicoletta, Choukri Khalid, Declerck Thierry, Doğan Mehmet Uğur, Maegaard Bente, Mariani Joseph, Moreno Asuncion, Odijk Jan, Piperidis Stelios. Istanbul: European Language Resources Association (ELRA), 2012. 2750–2756.

Last updated 2018-11-13 at 09:51