Understanding and explaining Delta measures for authorship attribution

Evert S, Proisl T, Jannidis F, Reger I, Pielström S, Schöch C, Vitt T (2017)


Publication Language: English

Publication Type: Journal article, Original article

Publication year: 2017

Journal: Digital Scholarship in the Humanities

Book Volume: 32

Pages Range: ii4–ii16

Journal Issue: suppl_2

DOI: 10.1093/llc/fqx023

Open Access Link: https://doi.org/10.1093/llc/fqx023

Abstract

This article builds on a mathematical explanation of one of the most prominent stylometric measures, Burrows’s Delta (and its variants), to understand and explain how it works. Starting with the conceptual separation between feature selection, feature scaling, and distance measures, we have designed a series of controlled experiments in which we used the kind of feature scaling (various types of standardization and normalization) and the type of distance measure (notably Manhattan, Euclidean, and Cosine) as independent variables and the correct authorship attributions as the dependent variable indicative of the performance of each of the methods proposed. In this way, we are able to describe in some detail how these two variables interact and how they influence the results. Thus we can show that feature vector normalization, that is, the transformation of the feature vectors to a uniform length of 1 (implicit in the cosine measure), is the decisive factor for the improvement of Delta proposed recently. We are also able to show that the information particularly relevant to the identification of the author of a text lies in the profile of deviation across the most frequent words rather than in the extent of the deviation or in the deviation of specific words only.
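To illustrate the separation between feature scaling and distance measure described in the abstract, the following is a minimal sketch in Python, not the authors' implementation. It assumes that Burrows's Delta is computed as the mean absolute difference of per-word z-scores (a Manhattan distance divided by the number of words) and that the cosine variant is the cosine distance between the same z-score vectors. The frequency table FREQS and all function names are hypothetical, and the use of the population standard deviation is an assumption made for simplicity.

```python
import math
from statistics import mean, pstdev

# Hypothetical toy data: rows are texts, columns are relative
# frequencies of four most-frequent words (made-up numbers).
FREQS = [
    [0.060, 0.030, 0.020, 0.010],
    [0.055, 0.035, 0.018, 0.012],
    [0.040, 0.020, 0.030, 0.015],
]

def z_scores(freqs):
    """Feature scaling: standardize each word (column) across all texts.

    Assumption: population standard deviation; a sample standard
    deviation would only rescale each column by a constant factor.
    """
    cols = list(zip(*freqs))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, mus, sds)]
        for row in freqs
    ]

def normalize_l2(u):
    """Vector normalization: rescale a feature vector to length 1."""
    norm = math.sqrt(sum(a * a for a in u))
    return [a / norm for a in u]

def manhattan(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def burrows_delta(zu, zv):
    """Mean absolute z-score difference (Manhattan distance / n)."""
    return manhattan(zu, zv) / len(zu)

def cosine_delta(zu, zv):
    """Cosine distance (1 - cosine similarity) between z-score vectors."""
    dot = sum(a * b for a, b in zip(zu, zv))
    norm_u = math.sqrt(sum(a * a for a in zu))
    norm_v = math.sqrt(sum(b * b for b in zv))
    return 1.0 - dot / (norm_u * norm_v)

if __name__ == "__main__":
    z = z_scores(FREQS)
    for i in range(len(z)):
        for j in range(i + 1, len(z)):
            print(i, j, burrows_delta(z[i], z[j]), cosine_delta(z[i], z[j]))
```

On unit-length vectors the squared Euclidean distance equals twice the cosine distance, so under these assumptions euclidean(normalize_l2(z[0]), normalize_l2(z[1])) ** 2 agrees with 2 * cosine_delta(z[0], z[1]) up to rounding. This is one way to see why the abstract describes vector normalization as implicit in the cosine measure.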


How to cite

APA:

Evert, S., Proisl, T., Jannidis, F., Reger, I., Pielström, S., Schöch, C., & Vitt, T. (2017). Understanding and explaining Delta measures for authorship attribution. Digital Scholarship in the Humanities, 32(suppl_2), ii4–ii16. https://doi.org/10.1093/llc/fqx023

MLA:

Evert, Stephanie, et al. "Understanding and explaining Delta measures for authorship attribution." Digital Scholarship in the Humanities 32.suppl_2 (2017): ii4–ii16.
