Lehrstuhl für Korpus- und Computerlinguistik


The computational corpus linguistics group carries out foundational methodological research on the quantitative analysis of large text corpora. The algorithms and software tools developed by the group support innovative studies in the digital humanities and social sciences as well as practical applications in language technology. A particular focus lies on understanding cooccurrence phenomena and their application in corpus-based discourse analysis.

Bismarckstraße 6
91054 Erlangen

Research Fields

Collocations, multiword expressions and corpus-based discourse analysis
Corpus tools and language technology
Further research
Methodological foundations of corpus research and digital humanities

Related Project(s)


RANT: Reconstructing Arguments from Noisy Text (DFG Priority Programme 1999: RATIO)
Prof. Dr. Stefan Evert
(01/01/2018 - 31/12/2020)

(KALLIMACHOS – Centre for digital editions and quantitative analysis at the University of Würzburg):
KALLIMACHOS II: Measures of linguistic complexity for literary stylometry in the KALLIMACHOS Centre for Digital Humanities
Prof. Dr. Stefan Evert
(01/10/2017 - 30/09/2019)

EFE: Exploring the “Fukushima Effect”: Attitudes and opinions towards nuclear power and renewable energy and the emergence of a transnational algorithmic public sphere
Prof. Dr. Stefan Evert
(01/01/2017 - 31/12/2019)

E-SPar: Efficient simulation experiments for large-scale parameter optimisation of machine learning approaches in natural language processing
Prof. Dr. Stefan Evert
(01/10/2016 - 30/09/2017)

Englisches Konstruktikon
Prof. Dr. Stefan Evert; Prof. Dr. Thomas Herbst

Publications


Lapesa, G., & Evert, S. (2017). Large-scale evaluation of dependency-based DSMs: Are they worth the effort? In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2017): Volume 2, Short Papers (pp. 394-400). Valencia, Spain.
Evert, S., Wankerl, S., & Nöth, E. (2017). Reliable measures of syntactic and lexical complexity: The case of Iris Murdoch. Paper presentation, Birmingham, GB.
Evert, S., & Neumann, S. (2017). The impact of translation direction on characteristics of translated texts. A multivariate analysis for English and German. In De Sutter G, Lefer M, Delaere I (Eds.), Empirical Translation Studies. New Theoretical and Methodological Traditions (pp. 47-80). Berlin: Mouton de Gruyter.
Evert, S., Uhrig, P., Bartsch, S., & Proisl, T. (2017). E-VIEW-Alation – a Large-Scale Evaluation Study of Association Measures for Collocation Identification. In Kosem I, Tiberius C, Jakubíček M, Kallas J, Krek S, Baisa V (Eds.), Electronic Lexicography in the 21st Century. Proceedings of the eLex 2017 Conference (pp. 531–549). Leiden, NL. Brno: Lexical Computing.
Evert, S., Jannidis, F., Dimpel, F.M., Schöch, C., Pielström, S., Vitt, T.,... Proisl, T. (2016). „Delta“ in der stilometrischen Autorschaftsattribution. Paper presentation at DHd 2016, Leipzig, DE.
Proisl, T., & Uhrig, P. (2016). SoMaJo: State-of-the-art tokenization for German web and social media texts. In Cook P, Evert S, Schäfer R, Stemle E (Eds.), Proceedings of the 10th Web as Corpus Workshop (WAC-X) and the EmpiriST Shared Task (pp. 57-62). Berlin, DE: Association for Computational Linguistics (ACL).
Santus, E., Gladkova, A., Evert, S., & Lenci, A. (2016). The CogALex-V Shared Task on the Corpus-Based Identification of Semantic Relations. In Proceedings of the 5th Workshop on Cognitive Aspects of the Lexicon (CogALex-V) (pp. 69-79). Osaka, Japan.
Kabashi, B., & Proisl, T. (2016). A Proposal for a Part-of-Speech Tagset for the Albanian Language. In Calzolari N, Choukri K, Declerck T, Grobelnik M, Maegaard B, Mariani J, Moreno A, Odijk J, Piperidis S (Eds.), Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016) (pp. 4305–4310). Portorož, SI. Paris: European Language Resources Association (ELRA).
Evert, S. (2016). CogALex-V Shared Task: Mach5 – A traditional DSM approach to semantic relatedness. In Proceedings of the 5th Workshop on Cognitive Aspects of the Lexicon (CogALex-V) (pp. 92-97). Osaka, Japan.
Wankerl, S., Nöth, E., & Evert, S. (2016). An Analysis of Perplexity to Reveal the Effects of Alzheimer's Disease on Language. In ITG-Fachbericht 267: Speech Communication (pp. 254-259). Paderborn, Germany.
Evert, S., Beißwenger, M., Bartsch, S., & Würzner, K.-M. (2016). EmpiriST 2015: A Shared Task on the Automatic Linguistic Annotation of Computer-Mediated Communication and Web Corpora. In Proceedings of the 10th Web as Corpus Workshop (WAC-X) and the EmpiriST Shared Task (pp. 44-56). Berlin, Germany.
Evert, S., Greiner, P., Baigger, F., & Lang, B. (2016). A Distributional Approach to Open Questions in Market Research. Computers in Industry, 78, 16-28. https://dx.doi.org/10.1016/j.compind.2015.10.008
Evert, S., & Arppe, A. (2015). Some theoretical and experimental observations on naïve discriminative learning. In Proceedings of the 6th Conference on Quantitative Investigations in Theoretical Linguistics (QITL-6). Tübingen, Germany.
Plotnikova, N., Lapesa, G., Proisl, T., & Evert, S. (2015). SemantiKLUE: Semantic Textual Similarity with Maximum Weight Matching. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015) (pp. 111–116). Denver, Colorado.
Evert, S., & Hardie, A. (2015). Ziggurat: A new data model and indexing format for large annotated text corpora. In Proceedings of the 3rd Workshop on the Challenges in the Management of Large Corpora (CMLC-3) (pp. 21–27). Lancaster, UK.
Evert, S., Proisl, T., Jannidis, F., Pielström, S., Schöch, C., & Vitt, T. (2015). Towards a better understanding of Burrows's Delta in literary authorship attribution. In Proceedings of the Fourth Workshop on Computational Linguistics for Literature (pp. 79–88). Denver, CO.
Plotnikova, N., Kohl, M., Volkert, K., Lerner, A., Dykes, N., Ermer, H., & Evert, S. (2015). KLUEless: Polarity Classification and Association. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015) (pp. 619–625). Denver, Colorado.
Kabashi, B. (2015). Automatische Verarbeitung der Morphologie des Albanischen. Erlangen: FAU University Press.
Bartsch, S., & Evert, S. (2014). Towards a Firthian Notion of Collocation. In Abel A, Lemnitzer L (Eds.), Vernetzungsstrategien, Zugriffsstrukturen und automatisch ermittelte Angaben in Internetwörterbüchern (pp. 48–61). Mannheim: Institut für Deutsche Sprache.
Diwersy, S., Evert, S., & Neumann, S. (2014). A weakly supervised multivariate approach to the study of language variation. In Szmrecsanyi B, Wälchli B (Eds.), Aggregating Dialectology, Typology, and Register Analysis. Linguistic Variation in Text and Speech (pp. 174–204). Berlin, Boston: De Gruyter.

Last updated on 2019-04-24 at 10:19