A Proposal for a Part-of-Speech Tagset for the Albanian Language

Kabashi B, Proisl T (2016)


Publication Language: English

Publication Type: Conference contribution

Publication year: 2016

Publisher: European Language Resources Association (ELRA)

City/Town: Paris

Pages Range: 4305–4310

Conference Proceedings Title: Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

Event location: Portorož SI

ISBN: 978-2-9517408-9-1

URI: http://www.lrec-conf.org/proceedings/lrec2016/pdf/1066_Paper.pdf

Open Access Link: http://www.lrec-conf.org/proceedings/lrec2016/pdf/1066_Paper.pdf

Abstract

Part-of-speech tagging is a basic step in Natural Language Processing that is often essential. Labeling the word forms of a text with fine-grained word-class information adds new value to it and can be a prerequisite for downstream processes like a dependency parser. Corpus linguists and lexicographers also benefit greatly from the improved search options that are available with tagged data.

The Albanian language has some properties that pose difficulties for the creation of a part-of-speech tagset. In this paper, we discuss those difficulties and present a proposal for a part-of-speech tagset that can adequately represent the underlying linguistic phenomena.

How to cite

APA:

Kabashi, B., & Proisl, T. (2016). A Proposal for a Part-of-Speech Tagset for the Albanian Language. In N. Calzolari, K. Choukri, T. Declerck, M. Grobelnik, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016) (pp. 4305–4310). Paris: European Language Resources Association (ELRA).

MLA:

Kabashi, Besim, and Thomas Proisl. "A Proposal for a Part-of-Speech Tagset for the Albanian Language." Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), Portorož. Ed. Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, and Stelios Piperidis. Paris: European Language Resources Association (ELRA), 2016. 4305–4310.
