Machine Learning for Stuttering Detection and Classification in Stuttering Therapy

Bayerl S (2024)

Publication Language: English

Publication Type: Thesis

Publication year: 2024

URI: https://open.fau.de/handle/openfau/31279

Abstract

This thesis comprehensively explores stuttering detection and classification, leveraging diverse datasets and methodologies. It emphasizes the importance of stuttering classification in enhancing accessibility for individuals with speech disorders and delivers a structured overview, starting with the foundational elements of speech and machine learning and extending to the new methods developed during the course of this research. The thesis identifies existing research gaps, which necessitate a thorough examination of available datasets and a review of historical methods deployed for stuttering detection. It highlights the lack of comparability in the field, caused by the diversity of annotation methods and the scarcity of datasets. This work is a crucial step in advancing the automation of stuttering assessment and therapy evaluation. It introduces and evaluates the speech control index, a new metric to assess stuttering therapy recordings. It goes on to create the Kassel State of Fluency (KSoF) dataset, advancing German stuttering research with optimized annotation protocols and enabling cross-language research initiatives. An in-depth analysis of the Stuttering Events in Podcasts (SEP-28k) dataset yields valuable insights into its composition and advocates for quality assurance in research data and for speaker exclusivity when splitting data into train, development, and test sets. This thesis investigates speech transformer features for stuttering detection and sets new benchmarks in stuttering classification. It conclusively shows the utility of features learned using English stuttering data on German stuttering recordings. Furthermore, it evaluates end-to-end stuttering classification systems based on speech transformer models on multi-language and cross-corpus datasets, showing the developed methods’ generalizability and their contribution towards a general stuttering detection system. While the research concludes with the harder task of multi-label stuttering classification, it underscores the ongoing challenges due to data availability and the inherent ambiguity of stuttering. It advocates for the creation of new multimodal datasets and refinement in the processing of stuttered speech, focusing on improving the accessibility of speech technology. The thesis contains innovative contributions, empirical explorations, and insights, aiming to chart a path for future work in the field of stuttering detection and classification.

Authors with CRIS profile

Sebastian Bayerl, Professorship of Computer Science (Pattern Recognition)

How to cite

APA:

Bayerl, S. (2024). Machine Learning for Stuttering Detection and Classification in Stuttering Therapy (Dissertation).

MLA:

Bayerl, Sebastian. Machine Learning for Stuttering Detection and Classification in Stuttering Therapy. Dissertation, 2024.
