Machine learning based estimation of hoarseness severity from sustained vowels

Schraut T, Schützenberger A, Arias Vergara T, Kunduk M, Echternach M, Döllinger M (2022)

Publication Type: Journal article

Publication year: 2022


Original Authors: Tobias Schraut, Anne Schützenberger, Tomas Arias-Vergara, Melda Kunduk, Matthias Echternach, Michael Döllinger

Book Volume: 152

Pages Range: A141-A141

Issue: 4_Supplement

DOI: 10.1121/10.0015825


Acoustic assessment of voice impairment is commonly performed by subjective perceptual evaluation of continuous speech based on a grading system such as the roughness, breathiness, hoarseness (RBH) scale. Here, we present an automatic approach to objectively quantify hoarseness severity based on sustained vowels. For this study, a total of 635 recordings of the sustained vowel /a/ were collected. Temporal, spectral, and cepstral features were extracted from one second of each recording. Each recording has an assigned RBH value, which was determined subjectively by an expert based on continuous speech of the respective subject. To account for the label noise introduced by these different evaluation bases (labels derived from continuous speech, features from sustained vowels), subjects were divided into two levels of hoarseness, H < 2 and H ≥ 2. Logistic regression was employed as the classification model, with the resulting output probabilities used as a continuous severity rating. Relevant features were selected using a sequence of filter methods and backward elimination, reducing the original feature set from 50 to 5 features. A classification accuracy of 86.73% was achieved on the test set. Detailed evaluation of the output probabilities shows a strong correlation (r = 0.81) with H values, making it possible to capture and quantify individual improvement, deterioration, or no change in hoarseness at different points in time, e.g. before and after voice therapy. The presented method is a promising approach for objective evaluation of hoarseness that allows treatment progress to be quantified for patients with voice disorders.
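The core idea of the abstract (a binary classifier trained on H < 2 versus H ≥ 2, whose output probability is then read as a continuous severity score and correlated with the ordinal H grade) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the five features are hypothetical stand-ins for the selected acoustic features, the logistic regression is fitted with plain gradient descent, and the feature selection step is omitted. It is not the authors' implementation and does not reproduce the reported results.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def standardize(train, test):
    """Z-score both sets using statistics of the training set only."""
    cols = list(zip(*train))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
            for c, m in zip(cols, means)]

    def scale(rows):
        return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

    return scale(train), scale(test)


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(X, w, b):
    """Probability of the H >= 2 class, used here as the severity score."""
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


# Synthetic data (assumption): H grades 0..3 and 5 hypothetical features
# that each increase with H and are corrupted by Gaussian noise.
rng = random.Random(0)
h_grades = [rng.randint(0, 3) for _ in range(400)]
features = [[0.8 * h + rng.gauss(0.0, 1.0) for _ in range(5)] for h in h_grades]
labels = [1 if h >= 2 else 0 for h in h_grades]  # two levels: H < 2, H >= 2

split = 300  # simple hold-out split; the source does not describe its split
X_train, X_test = standardize(features[:split], features[split:])
y_train, y_test = labels[:split], labels[split:]

w, b = fit_logistic(X_train, y_train)
probs = predict_proba(X_test, w, b)

accuracy = sum((p >= 0.5) == (t == 1) for p, t in zip(probs, y_test)) / len(y_test)
r = pearson(probs, h_grades[split:])
print(f"test accuracy: {accuracy:.2f}, correlation with H: {r:.2f}")
```

Because the score is a probability between 0 and 1, two recordings of the same speaker (for example before and after therapy) could be compared by the difference of their scores, which is the kind of use the abstract describes.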

How to cite


APA: Schraut, T., Schützenberger, A., Arias Vergara, T., Kunduk, M., Echternach, M., & Döllinger, M. (2022). Machine learning based estimation of hoarseness severity from sustained vowels. Journal of the Acoustical Society of America, 152, A141-A141.

MLA: Schraut, Tobias, et al. "Machine learning based estimation of hoarseness severity from sustained vowels." Journal of the Acoustical Society of America 152 (2022): A141-A141.
