On the Suitability of Data Augmentation Techniques to Improve Parkinson’s Disease Detection with Speech Recordings

Ríos-Urrego CD, Ruiz-Romero TA, Puerta-Lotero D, Escobar-Grisales D, Orozco-Arroyave JR (2026)


Publication Type: Journal article

Publication year: 2026

Journal: Diagnostics

Book Volume: 16

Article Number: 498

Journal Issue: 3

DOI: 10.3390/diagnostics16030498

Abstract

Background: Parkinson’s disease (PD) is a neurodegenerative disorder that affects millions of people worldwide. Speech analysis has emerged as a non-invasive tool for automatic PD detection; however, the scarcity and homogeneity of available datasets often limit the generalization capability of machine learning models, motivating the use of data augmentation strategies to improve robustness. Methods: This study presents a data augmentation-based methodology for speech-based classification between PD patients and healthy control subjects. A deep learning model trained from scratch on Mel spectrograms is evaluated using augmentation techniques applied at both the waveform and time–frequency levels. Multiple training and model selection strategies are analyzed, and model performance is assessed through internal validation as well as on an independent dataset. Results: Experimental results show that carefully selected data augmentation techniques improve classification performance with respect to the non-augmented counterpart, achieving gains of up to 3% in accuracy. However, when evaluated on an independent dataset, these improvements do not consistently translate into better generalization. Conclusions: These findings demonstrate that, while data augmentation can effectively enhance model performance within a single dataset, this apparent robustness is not sufficient to guarantee generalization on independent speech corpora for PD detection.

How to cite

APA:

Ríos-Urrego, C.D., Ruiz-Romero, T.A., Puerta-Lotero, D., Escobar-Grisales, D., & Orozco-Arroyave, J.R. (2026). On the Suitability of Data Augmentation Techniques to Improve Parkinson’s Disease Detection with Speech Recordings. Diagnostics, 16(3), Article 498. https://doi.org/10.3390/diagnostics16030498

MLA:

Ríos-Urrego, Cristian David, et al. "On the Suitability of Data Augmentation Techniques to Improve Parkinson’s Disease Detection with Speech Recordings." Diagnostics 16.3 (2026): 498.
