A Proposal for a Part-of-Speech Tagset for the Albanian Language

Conference contribution


Publication Details

Author(s): Kabashi B, Proisl T
Editor(s): Calzolari Nicoletta, Choukri Khalid, Declerck Thierry, Grobelnik Marko, Maegaard Bente, Mariani Joseph, Moreno Asuncion, Odijk Jan, Piperidis Stelios
Publisher: European Language Resources Association (ELRA)
Publishing place: Paris
Publication year: 2016
Conference Proceedings Title: Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)
Pages range: 4305–4310
ISBN: 978-2-9517408-9-1
Language: English


Abstract


Part-of-speech tagging is a basic step in Natural Language Processing that is often essential. Labeling the word forms of a text with fine-grained word-class information adds new value to it and can be a prerequisite for downstream processes like a dependency parser. Corpus linguists and lexicographers also benefit greatly from the improved search options that are available with tagged data.



The Albanian language has some properties that pose difficulties for the creation of a part-of-speech tagset. In this paper, we discuss those difficulties and present a proposal for a part-of-speech tagset that can adequately represent the underlying linguistic phenomena.


FAU Authors / FAU Editors

Kabashi, Besim Dr.
Lehrstuhl für Korpus- und Computerlinguistik
Proisl, Thomas
Lehrstuhl für Korpus- und Computerlinguistik


Research Fields

Corpus tools and language technology
Lehrstuhl für Korpus- und Computerlinguistik
Further research
Lehrstuhl für Korpus- und Computerlinguistik


How to cite

APA:
Kabashi, B., & Proisl, T. (2016). A Proposal for a Part-of-Speech Tagset for the Albanian Language. In Calzolari Nicoletta, Choukri Khalid, Declerck Thierry, Grobelnik Marko, Maegaard Bente, Mariani Joseph, Moreno Asuncion, Odijk Jan, Piperidis Stelios (Eds.), Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016) (pp. 4305–4310). Portorož, SI: Paris: European Language Resources Association (ELRA).

MLA:
Kabashi, Besim, and Thomas Proisl. "A Proposal for a Part-of-Speech Tagset for the Albanian Language." Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), Portorož. Ed. Calzolari Nicoletta, Choukri Khalid, Declerck Thierry, Grobelnik Marko, Maegaard Bente, Mariani Joseph, Moreno Asuncion, Odijk Jan, Piperidis Stelios. Paris: European Language Resources Association (ELRA), 2016. 4305–4310.

Last updated on 2018-11-13 at 09:52