Parallel Representation Learning for the Classification of Pathological Speech: Studies on Parkinson's Disease and Cleft Lip and Palate

Vasquez Correa J, Arias Vergara T, Schuster M, Orozco Arroyave JR, Nöth E (2020)


Publication Type: Journal article

Publication year: 2020

Journal: Speech Communication

Book Volume: 122

Pages Range: 56-67

DOI: 10.1016/j.specom.2020.07.005

Abstract

Speech signals may carry paralinguistic information, such as the presence of pathologies that impair a speaker's ability to communicate. These speech disorders have different origins depending on the type of disease: for instance, a morphological origin in cleft lip and palate, which causes hypernasality, or a neurodegenerative origin in Parkinson's disease, which produces hypokinetic dysarthria. Automatic assessment of pathological speech can support the diagnosis and/or the evaluation of disease severity. Conventional methods are based on the manual assessment of single features such as jitter, shimmer, or formant frequencies, which may not completely model all of the phenomena that appear due to the disease. This paper introduces a novel strategy based on unsupervised representation learning for the automatic detection of pathological speech. The proposed approach uses recurrent and convolutional autoencoders trained to extract informative features that characterize the presence of pathologies in speech. A novel feature set based on the reconstruction error of the autoencoders is also proposed. The performance of the introduced models is evaluated by classifying pathological speech signals recorded from people with Parkinson's disease and from children with cleft lip and palate. All participants in this study were native Spanish speakers. The proposed models classify the speech signals of both kinds of diseases accurately, with an accuracy of up to 97% for cleft lip and palate and up to 84% for Parkinson's disease. We also show that the reconstruction error from the autoencoders in different frequency regions contains information related to specific speech symptoms of both diseases.
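
The idea of using reconstruction error per frequency region as a feature set can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the architecture, the spectral representation, or the band layout, so everything below is an assumption made for illustration. In particular, a linear projection (truncated SVD) stands in for the trained recurrent or convolutional autoencoder, the input is a random matrix in place of a real spectrogram, and the function names and the choice of four equal-width bands are hypothetical.

```python
import numpy as np


def linear_autoencoder_reconstruct(spec, n_components):
    """Stand-in for a trained autoencoder (assumption, not the paper's model).

    Projects each frame onto the top principal directions of the
    frequency axis and maps it back, like a linear autoencoder would.
    spec has shape (n_freq_bins, n_frames).
    """
    mean = spec.mean(axis=1, keepdims=True)
    centered = spec - mean
    u, _, _ = np.linalg.svd(centered, full_matrices=False)
    basis = u[:, :n_components]          # (n_freq_bins, n_components)
    return basis @ (basis.T @ centered) + mean


def band_reconstruction_error(spec, recon, n_bands):
    """Mean squared reconstruction error in each contiguous frequency band.

    Returns one value per band; these values form the feature vector.
    """
    sq_err = (spec - recon) ** 2         # (n_freq_bins, n_frames)
    bands = np.array_split(sq_err, n_bands, axis=0)
    return np.array([band.mean() for band in bands])


# Synthetic placeholder for a spectrogram: 64 frequency bins, 100 frames.
rng = np.random.default_rng(0)
spec = rng.normal(size=(64, 100))

recon = linear_autoencoder_reconstruct(spec, n_components=8)
features = band_reconstruction_error(spec, recon, n_bands=4)
```

In a setup like this, the resulting per-band error vector would be passed to a separate classifier; which classifier and which bands the paper actually uses is not stated in the abstract.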

How to cite

APA:

Vasquez Correa, J., Arias Vergara, T., Schuster, M., Orozco Arroyave, J.R., & Nöth, E. (2020). Parallel Representation Learning for the Classification of Pathological Speech: Studies on Parkinson's Disease and Cleft Lip and Palate. Speech Communication, 122, 56-67. https://dx.doi.org/10.1016/j.specom.2020.07.005

MLA:

Vasquez Correa, Juan, et al. "Parallel Representation Learning for the Classification of Pathological Speech: Studies on Parkinson's Disease and Cleft Lip and Palate." Speech Communication 122 (2020): 56-67.
