Stylistic Features in Corporate Disclosures and their Predictive Power

Heinrich P (2018)


Publication Language: English

Publication Type: Conference contribution

Publication year: 2018

Page Range: 129 - 134

Conference Proceedings Title: Proceedings of 4th Asia Pacific Corpus Linguistics Conference (APCLC2018)

Event location: Takamatsu JP

Abstract

We are concerned with the automatic processing of annual reports submitted to the U.S. SEC’s EDGAR filing system. The filings consist of structured as well as unstructured information. One part of the filings, the 10-K form, contains mostly linguistic data segmented into up to 20 items. We briefly describe the steps required to extract the relevant linguistic information from the unstructured part of the data. We then present results of a first exploratory corpus analysis and provide descriptive statistics for our NLP measures (sentiment, readability, and further stylistic dimensions) for each item of the 10-K form, and point out connections between the semantic content of the analyzed items and the quantitative linguistic observables. The linguistic register varies both across items and with the company’s Standard Industrial Classification. We conclude by applying a dimensionality reduction algorithm (t-SNE) to the linguistic observables and use the embedding for a qualitative comparison with the company’s industry.
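
As a rough illustration of the per-item analysis described in the abstract, the following Python sketch splits a filing into items and computes a few simple stylistic features for each one. It is not the author's pipeline: the sample text, the heading pattern `ITEM_HEADING`, the helper names `split_items` and `style_features`, and the choice of features (mean sentence length, type-token ratio) are all assumptions made for this example. Real 10-K filings need considerably more robust extraction.

```python
import re

# Hypothetical miniature 10-K excerpt; real filings are far longer and messier.
SAMPLE = """Item 1. Business
We design and sell widgets. Demand grew strongly this year.
Item 1A. Risk Factors
Our results may suffer if demand falls. Litigation could harm us. Costs may rise.
"""

# Assumed heading pattern: "Item", a number, an optional letter, then a period.
ITEM_HEADING = re.compile(r"^Item\s+(\d{1,2}[A-Z]?)\.", re.MULTILINE)


def split_items(text):
    """Map each item label (e.g. '1A') to the text up to the next heading."""
    matches = list(ITEM_HEADING.finditer(text))
    items = {}
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        # Drop the remainder of the heading line (the item title).
        parts = text[m.end():end].split("\n", 1)
        items[m.group(1)] = parts[1].strip() if len(parts) > 1 else ""
    return items


def style_features(text):
    """Compute simple, illustrative stylistic features for one item."""
    # Naive sentence split on terminal punctuation; abbreviations would break it.
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    tokens = re.findall(r"[A-Za-z']+", text.lower())
    return {
        "sentences": len(sentences),
        "tokens": len(tokens),
        "mean_sentence_length": len(tokens) / len(sentences) if sentences else 0.0,
        "type_token_ratio": len(set(tokens)) / len(tokens) if tokens else 0.0,
    }


features = {label: style_features(body) for label, body in split_items(SAMPLE).items()}
for label, feats in features.items():
    print(label, feats)
```

Feature dictionaries of this kind, one per item and filing, could be stacked into a matrix and passed to a dimensionality reduction method such as t-SNE for the kind of qualitative comparison the abstract mentions.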


How to cite

APA:

Heinrich, P. (2018). Stylistic Features in Corporate Disclosures and their Predictive Power. In Yukio Tono & Hitoshi Isahara (Eds.), Proceedings of 4th Asia Pacific Corpus Linguistics Conference (APCLC2018) (pp. 129 - 134). Takamatsu, JP.

MLA:

Heinrich, Philipp. "Stylistic Features in Corporate Disclosures and their Predictive Power." Proceedings of the 4th Asia Pacific Corpus Linguistics Conference (APCLC2018), Takamatsu. Ed. Yukio Tono & Hitoshi Isahara, 2018. 129 - 134.
