Non-FAU Project
Start date: 08.04.2024
In the text-linguistic tradition, registers are culturally recognized varieties of texts, associated with the communicative situation of use (emails, textbooks, conversations, etc.). The situation of use in turn calls for particular linguistic characteristics. Linguistic features frequent in a register are thus not arbitrary but are a direct response to the situation of this register and perform communicative functions necessitated by the situation (Biber & Conrad, 2019). At the same time, the texts of any given register are not homogeneous in their linguistic or situational characteristics (e.g., Biber et al., 2020; Egbert & Gracheva, 2023; Wood, 2023; Goulart, 2024): as the situations in which texts of a register are created vary, so do the linguistic features that are frequent in those texts (Biber & Egbert, 2023).
These functional links between situation and language raise new questions about registers’ evolution as cultural constructs. First, as communicative situations evolve, language must reflect language users’ adaptations to the new situations of use. Second, varying degrees of register-internal variability at different points of registers’ existence could reflect language users’ degrees of convergence (or lack thereof) on certain communicative and linguistic register norms.
This study begins to investigate these questions by focusing on blogs, a relatively new register that is characterized by a rapidly evolving technological landscape (Miller & Shepherd, 2004, 2009) and whose full life cycle, from its inception in the late 1990s to the present day, is available for scientific study. The corpus under analysis comes from Blogspot.com and comprises 2,452 texts (approx. 4,000,000 tokens) from the years 1999–2023, with approx. 100 texts per year. We address the following research questions (RQs):
1. What linguistic features of blogs have become more or less frequent over time?
2. How does the linguistic stability of blogs—the degree of variation in the linguistic characteristics—change over time?
To address RQ 1, we compute rates of occurrence for a wide range of lexico-grammatical features in each text (the corpus is tagged for 150 linguistic features with an updated version of the Biber tagger; Biber, 1988) and employ corresponding feature analysis (Egbert, 2024) to examine correlations between rates of occurrence of linguistic features and text dates. The analysis reveals that features associated with oral/interactive communication and clausal elaboration (e.g., pronominal references, various verb types) have increased over time, while features of information density (e.g., nouns) and abstractedness (e.g., passive constructions) have decreased. Blogs have also become increasingly present-oriented, and features of narrativity (e.g., past tense) have shown a downward trend.
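The general idea of relating feature rates to text dates can be illustrated with a minimal sketch. It is not the study's actual pipeline: the normalization per 1,000 tokens, the choice of Pearson's r, and the toy counts below are all assumptions made for illustration, and the function names are hypothetical.

```python
from math import sqrt


def rate_per_1000(count, n_tokens):
    """Normalized rate of occurrence per 1,000 tokens (assumed normalization basis)."""
    return 1000.0 * count / n_tokens


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data, one tuple per text:
# (year of publication, raw count of one feature, text length in tokens)
texts = [
    (1999, 40, 1000),
    (2007, 55, 1000),
    (2015, 70, 1000),
    (2023, 90, 1000),
]

years = [year for year, _, _ in texts]
rates = [rate_per_1000(count, n_tokens) for _, count, n_tokens in texts]

# A positive r suggests the feature became more frequent over time,
# a negative r that it became less frequent.
r = pearson_r(years, rates)
print(f"r = {r:.3f}")
```

In a full analysis, the same correlation would be computed separately for each tagged feature across all texts in the corpus.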
In response to RQ 2, we present analyses of coefficients of variation (Segalowitz & Segalowitz, 1993) as a way of examining linguistic stability and show that variation in the majority of blog features has decreased, pointing to increased stability in the linguistic norms of the register.
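The coefficient of variation is the standard deviation divided by the mean, which makes variability comparable across features with very different base rates. The sketch below shows how it could be compared across time periods for a single feature. Grouping texts by year, using the sample standard deviation, and the toy rates are assumptions for illustration, not details reported by the study.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (CV = SD / M)."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(values) / m


# Hypothetical toy data: rates per 1,000 tokens of one feature,
# for the texts published in each of two years.
rates_by_year = {
    1999: [20.0, 60.0, 100.0],
    2023: [55.0, 60.0, 65.0],
}

cv_by_year = {
    year: coefficient_of_variation(rates)
    for year, rates in rates_by_year.items()
}

# Both years have the same mean rate, but the texts from 2023 are far more
# alike, so the CV is lower; a lower CV is read here as greater stability.
for year, cv in sorted(cv_by_year.items()):
    print(year, round(cv, 3))
```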