Lehrstuhl für Korpus- und Computerlinguistik (Chair of Corpus and Computational Linguistics)


Description:

The computational corpus linguistics group carries out foundational methodological research on the quantitative analysis of large text corpora. The algorithms and software tools developed by the group support innovative studies in the digital humanities and social sciences as well as practical applications in language technology. A particular focus lies on understanding cooccurrence phenomena and their application in corpus-based discourse analysis.

Address:
Bismarckstraße 6
91054 Erlangen


Research Fields

Collocations, multiword expressions and corpus-based discourse analysis
Corpus tools and language technology
Further research
Methodological foundations of corpus research and digital humanities


Related Projects

RANT: Reconstructing Arguments from Noisy Text (DFG Priority Programme 1999: RATIO)
Prof. Dr. Stefan Evert
(01/01/2018 - 31/12/2020)


KALLIMACHOS II: Measures of linguistic complexity for literary stylometry in the KALLIMACHOS Centre for Digital Humanities (KALLIMACHOS – Centre for digital editions and quantitative analysis at the University of Würzburg)
Prof. Dr. Stefan Evert
(01/10/2017 - 30/09/2019)


EFE: Exploring the “Fukushima Effect”: Attitudes and opinions towards nuclear power and renewable energy and the emergence of a transnational algorithmic public sphere
Prof. Dr. Stefan Evert
(01/01/2017 - 31/12/2019)


E-SPar: Efficient simulation experiments for large-scale parameter optimisation of machine learning approaches in natural language processing
Prof. Dr. Stefan Evert
(01/10/2016 - 30/09/2017)


Englisches Konstruktikon (English Constructicon)
Prof. Dr. Stefan Evert; Prof. Dr. Thomas Herbst
(01/01/2016)



Publications

Peters, J., Dykes, N., Habermann, M., Ostgathe, C., & Heckel, M. (2019). Metaphors for multidrug-resistant bacteria in German newspaper articles, 1995-2015. A computer-assisted qualitative study. Metaphor and the Social World, 9(2), 221–241.
Evert, S., Heinrich, P., Henselmann, K., Rabenstein, U., Scherr, E., Schmitt, M., & Schröder, L. (2019). Combining Machine Learning and Semantic Features in the Classification of Corporate Disclosures. Journal of Logic, Language and Information. https://dx.doi.org/10.1007/s10849-019-09283-6
Proisl, T. (2019). The cooccurrence of linguistic structures. Erlangen: FAU University Press.
Dimpel, F.M., & Proisl, T. (2019). Gute Wörter für Delta: Verbesserung der Autorschaftsattribution durch autorspezifische distinktive Wörter. In Patrick Sahle (Ed.), DHd 2019. Digital Humanities: multimedial & multimodal. Konferenzabstracts (pp. 296–299).
Peters, J., Dykes, N., Heckel, M., Ostgathe, C., & Habermann, M. (2019). A Linguistic Model of Communication Types in Palliative Medicine: Effects of Multidrug-Resistant Organisms (MDRO) Colonization or Infection and Isolation Measures in End of Life on Family Caregivers’ Knowledge, Attitude and Practices. Journal of Palliative Medicine, 22(8). https://dx.doi.org/10.1089/jpm.2019.0027
Proisl, T., Heinrich, P., Kabashi, B., & Evert, S. (2018). EmotiKLUE at IEST 2018: Topic-Informed Classification of Implicit Emotions. In Balahur A, Mohammad SM, Hoste V, Klinger R (Eds.), Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (pp. 235–242). Brussels, BE: Association for Computational Linguistics.
Heinrich, P. (2018). Stylistic Features in Corporate Disclosures and their Predictive Power. In Yukio Tono & Hitoshi Isahara (Eds.), Proceedings of 4th Asia Pacific Corpus Linguistics Conference (APCLC2018) (pp. 129–134). Takamatsu, JP.
Heinrich, P., & Schäfer, F. (2018). Extending Corpus-Based Discourse Analysis for Exploring Japanese Social Media. In Yukio Tono & Hitoshi Isahara (Eds.), Proceedings of 4th Asia Pacific Corpus Linguistics Conference (APCLC2018) (pp. 135–140). Takamatsu, JP.
Kabashi, B., & Proisl, T. (2018). Albanian Part-of-Speech Tagging: Gold Standard and Evaluation. In Calzolari N, Choukri K, Cieri C, Declerck T, Goggi S, Hasida K, Isahara H, Maegaard B, Mariani J, Mazo H, Moreno A, Odijk J, Piperidis S, Tokunaga T (Eds.), Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018) (pp. 2593–2599). Miyazaki, JP: European Language Resources Association.
Proisl, T. (2018). SoMeWeTa: A Part-of-Speech Tagger for German Social Media and Web Texts. In Calzolari N, Choukri K, Cieri C, Declerck T, Goggi S, Hasida K, Isahara H, Maegaard B, Mariani J, Mazo H, Moreno A, Odijk J, Piperidis S, Tokunaga T (Eds.), Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018) (pp. 665–670). Miyazaki, JP: European Language Resources Association.
Heinrich, P., Adrian, C., Kalashnikova, O., Schäfer, F., & Evert, S. (2018). A Transnational Analysis of News and Tweets about Nuclear Phase-Out in the Aftermath of the Fukushima Incident. In Andreas Witt, Jana Diesner, Georg Rehm (Eds.), Proceedings of the LREC 2018 “Workshop on Computational Impact Detection from Text Data” (pp. 8–16). Miyazaki, JP: Paris: ELRA.
Proisl, T., Evert, S., Jannidis, F., Schöch, C., Konle, L., & Pielström, S. (2018). Delta vs. N-Gram Tracing: Evaluating the Robustness of Authorship Attribution Methods. In Calzolari N, Choukri K, Cieri C, Declerck T, Goggi S, Hasida K, Isahara H, Maegaard B, Mariani J, Mazo H, Moreno A, Odijk J, Piperidis S, Tokunaga T (Eds.), Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018) (pp. 3309–3314). Miyazaki, JP: European Language Resources Association.
Evert, S., Dykes, N., & Peters, J. (2018). A quantitative evaluation of keyword measures for corpus-based discourse analysis.
Uhrig, P., Evert, S., & Proisl, T. (2018). Collocation Candidate Extraction from Dependency-Annotated Corpora: Exploring Differences across Parsers and Dependency Annotation Schemes. In Cantos-Gómez P, Almela-Sánchez M (Eds.), Lexical Collocation Analysis: Advances and Applications (pp. 111–140). Cham: Springer International Publishing.
Peters, J., & Dykes, N. (2018). From keywords to discourse - towards a keyword operationalisation model in discourse linguistics. In Corpora and Discourse International Conference. Lancaster.
Büttner, A., Dimpel, F.M., Evert, S., Jannidis, F., Pielström, S., Proisl, T., ... Vitt, T. (2017). »Delta« in der stilometrischen Autorschaftsattribution. Zeitschrift für digitale Geisteswissenschaften. https://dx.doi.org/10.17175/2017_006
Evert, S., Heinrich, P., Henselmann, K., Rabenstein, U., Scherr, E., & Schröder, L. (2017). Combining Machine Learning and Semantic Features in the Classification of Corporate Disclosures. In Loukanova R, Liefke K (Eds.), Proceedings of the Workshop on Logic and Algorithms in Computational Linguistics 2017 (LACompLing2017) (pp. 47–62). Stockholm, SE: Stockholm University.
Proisl, T., Heinrich, P., Evert, S., & Kabashi, B. (2017). Translation Inference across Dictionaries via a Combination of Graph-based Methods and Co-occurrence Statistics. In McCrae J, Bond F, Buitelaar P, Cimiano P, Declerck T, Gracia J, Kernerman I, Ponsoda E, Ordan N, Piasecki M (Eds.), Proceedings of the LDK 2017 Workshops: 1st Workshop on the OntoLex Model (OntoLex-2017), Shared Task on Translation Inference Across Dictionaries & Challenges for Wordnets (pp. 94–102). Galway, IE: CEUR.
Evert, S., Proisl, T., Jannidis, F., Reger, I., Pielström, S., Schöch, C., & Vitt, T. (2017). Understanding and explaining Delta measures for authorship attribution. Digital Scholarship in the Humanities, 32(suppl_2), ii4–ii16. https://dx.doi.org/10.1093/llc/fqx023
Schäfer, F., Evert, S., & Heinrich, P. (2017). Japan's 2014 General Election: Political Bots, Right-Wing Internet Activism and PM Abe Shinzō’s Hidden Nationalist Agenda. Big Data, 5(4), 1–16.

Last updated on 2019-04-24 at 10:19