Automatic detection of Voice Onset Time in voiceless plosives using gated recurrent units

Arias Vergara T, Arguello-Velez P, Vasquez Correa J, Nöth E, Schuster M, González-Rátiva MC, Orozco Arroyave JR (2020)


Publication Type: Journal article

Publication year: 2020

Journal: Digital Signal Processing

Book Volume: 104

Article Number: 102779

DOI: 10.1016/j.dsp.2020.102779

Abstract

Voice Onset Time (VOT) has been used by researchers as an acoustic measure to understand the impact of different motor speech disorders on speech production. However, VOT values are usually obtained manually, which is expensive and time-consuming. In this paper we propose a method for the automatic detection of VOT based on pre-trained Recurrent Neural Networks with Gated Recurrent Units (GRUs). Speech recordings from 50 native Spanish speakers from Colombia (25 male) are considered for the experiments. The recordings include utterances of the diadochokinesis task /pa-ta-ka/, which is typically used to evaluate motor speech disorders such as those caused by Parkinson's disease. Additionally, the diadochokinesis task allows us to train a system to detect the VOT of voiceless plosive sounds in intermediate positions. Acoustic analysis is performed by extracting different temporal and spectral features from the recordings. According to the results, it is possible to detect the VOT with F1-score values of 0.66 for /p/, 0.75 for /t/, and 0.78 for /k/ when the predicted values are compared with the manual VOT labels.
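
The F1-scores quoted in the abstract compare predicted VOT against manual labels. The abstract does not state the exact evaluation protocol, so the following is only a minimal sketch under the assumption that both annotations are expressed as frame-wise binary labels (1 = frame inside a VOT segment). The function name `f1_score` and the example label sequences are hypothetical and are not taken from the paper.

```python
def f1_score(reference, predicted):
    """Frame-wise F1 for binary labels (assumed: 1 = frame inside a VOT segment)."""
    if len(reference) != len(predicted):
        raise ValueError("label sequences must have the same length")
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r == 1 and p == 1)  # VOT frames found
    fp = sum(1 for r, p in pairs if r == 0 and p == 1)  # false alarms
    fn = sum(1 for r, p in pairs if r == 1 and p == 0)  # missed VOT frames
    if tp == 0:
        return 0.0  # no correct detections: define F1 as 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical frame labels, for illustration only (not data from the study).
manual = [0, 0, 1, 1, 1, 0, 0, 1, 1, 0]
detected = [0, 1, 1, 1, 0, 0, 0, 1, 1, 0]
score = f1_score(manual, detected)
print(f"F1 = {score:.2f}")
```

In this toy case there are four correctly detected frames, one false alarm, and one missed frame, so precision and recall are both 0.8. A segment-level evaluation with a timing tolerance would be an equally plausible reading of the abstract and would need a different matching step.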

How to cite

APA:

Arias Vergara, T., Arguello-Velez, P., Vasquez Correa, J., Nöth, E., Schuster, M., González-Rátiva, M.C., & Orozco Arroyave, J.R. (2020). Automatic detection of Voice Onset Time in voiceless plosives using gated recurrent units. Digital Signal Processing, 104. https://dx.doi.org/10.1016/j.dsp.2020.102779

MLA:

Arias Vergara, Tomás, et al. "Automatic detection of Voice Onset Time in voiceless plosives using gated recurrent units." Digital Signal Processing 104 (2020).
