Machine learning based estimation of hoarseness severity using sustained vowels

Schraut T, Schützenberger A, Arias Vergara T, Kunduk M, Echternach M, Döllinger M (2024)

Publication Language: English

Publication Type: Journal article

Publication year: 2024

Journal

Journal of the Acoustical Society of America Acoustical Society of America

Book Volume: 155

Pages Range: 381-395

Journal Issue: 1

DOI: 10.1121/10.0024341

Abstract

Auditory perceptual evaluation is considered the gold standard for assessing voice quality, but its reliability is limited due to inter-rater variability and coarse rating scales. This study investigates a continuous, objective approach to evaluate hoarseness severity combining machine learning (ML) and sustained phonation. For this purpose, 635 acoustic recordings of the sustained vowel /a/ and subjective ratings based on the roughness, breathiness, and hoarseness scale were collected from 595 subjects. A total of 50 temporal, spectral, and cepstral features were extracted from each recording and used to identify suitable ML algorithms. Using variance and correlation analysis followed by backward elimination, a subset of relevant features was selected. Recordings were classified into two levels of hoarseness, H < 2 and H ≥ 2, yielding a continuous probability score ŷ ∈ [0, 1]. An accuracy of 0.867 and a correlation of 0.805 between the model's predictions and subjective ratings was obtained using only five acoustic features and logistic regression (LR). Further examination of recordings pre- and post-treatment revealed high qualitative agreement with the change in subjectively determined hoarseness levels. Quantitatively, a moderate correlation of 0.567 was obtained. This quantitative approach to hoarseness severity estimation shows promising results and potential for improving the assessment of voice quality.

Authors with CRIS profile

Tobias Schraut, Lehrstuhl für Hals-Nasen-Ohrenheilkunde

Anne Schützenberger, Professur für Phoniatrie und Pädaudiologie

Tomás Arias Vergara, Lehrstuhl für Informatik 5 (Mustererkennung)

Michael Döllinger, Professur für Computational Medicine

Involved external institutions

Louisiana State University, United States (USA) (US)

Universitätsklinikum der Ludwig-Maximilians-Universität München, Germany (DE)

How to cite

APA:

Schraut, T., Schützenberger, A., Arias Vergara, T., Kunduk, M., Echternach, M., & Döllinger, M. (2024). Machine learning based estimation of hoarseness severity using sustained vowels. Journal of the Acoustical Society of America, 155(1), 381-395. https://doi.org/10.1121/10.0024341

MLA:

Schraut, Tobias, et al. "Machine learning based estimation of hoarseness severity using sustained vowels." Journal of the Acoustical Society of America 155.1 (2024): 381-395.
