A Generative Method for a Laryngeal Biosignal

Darvish M, Kist A (2024)


Publication Type: Journal article

Publication year: 2024

Journal: Journal of Voice

DOI: 10.1016/j.jvoice.2024.01.016

Abstract

The Glottal Area Waveform (GAW) is an important component of quantitative clinical voice assessment, providing insight into vocal fold function. In this study, we introduce a method that uses Variational Autoencoders (VAEs) to generate synthetic GAWs that closely replicate real-world data, offering a versatile tool for researchers and clinicians. We describe how to manipulate the VAE latent space using the Glottal Opening Vector (GlOVe), which allows precise control over the synthetic closure and opening of the vocal folds. Using the GlOVe, we generate synthetic laryngeal biosignals that reflect vocal fold behavior and emulate realistic changes in glottal opening. The manipulation extends to introducing arbitrary oscillations that closely resemble real vocal fold oscillations, and varying the factor coefficient yields diverse biosignals with different frequencies and amplitudes. Our results show that this approach yields accurate laryngeal biosignals, with Normalized Mean Absolute Error values ranging from 9.6 × 10⁻³ to 1.20 × 10⁻² across the tested frequencies, alongside effective training, reflected in reductions of up to approximately 89.52% in key loss components. The proposed method may have implications for downstream speech synthesis and phonetics research, offering the potential for advanced and natural-sounding speech technologies.
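The abstract does not specify how the GlOVe is computed or how the error is normalized, so the following is only a minimal sketch of the general idea under stated assumptions: a latent code is shifted along a direction vector by a time-varying coefficient, and generated signals are scored with a range-normalized mean absolute error. The function names, the sinusoidal coefficient schedule, and the choice of range normalization are all hypothetical and are not taken from the paper; a trained VAE decoder (not shown) would map each latent code to a GAW value or frame.

```python
import numpy as np


def normalized_mae(reference, generated):
    """Mean absolute error divided by the reference signal's range.

    Range normalization is an assumption; the abstract does not say
    which normalization the authors use.
    """
    reference = np.asarray(reference, dtype=float)
    generated = np.asarray(generated, dtype=float)
    span = reference.max() - reference.min()
    if span == 0:
        raise ValueError("reference is constant; range normalization is undefined")
    return float(np.mean(np.abs(reference - generated)) / span)


def oscillating_latents(z, direction, frequency_hz, amplitude, n_samples, sample_rate_hz):
    """Return one latent code per time step, moved along `direction`.

    `direction` stands in for a GlOVe-like vector (hypothetical). The
    coefficient follows a sine wave, so its frequency and amplitude set
    the frequency and depth of the simulated opening and closing.
    """
    z = np.asarray(z, dtype=float)
    direction = np.asarray(direction, dtype=float)
    t = np.arange(n_samples) / sample_rate_hz
    coefficients = amplitude * np.sin(2.0 * np.pi * frequency_hz * t)
    # Broadcasting: (n_samples, 1) * (1, latent_dim) -> (n_samples, latent_dim)
    return z[None, :] + coefficients[:, None] * direction[None, :]


if __name__ == "__main__":
    latents = oscillating_latents(
        z=np.zeros(3),
        direction=np.array([1.0, 0.0, 0.0]),
        frequency_hz=100.0,
        amplitude=2.0,
        n_samples=40,
        sample_rate_hz=4000.0,
    )
    print(latents.shape)
```

In this sketch, changing `frequency_hz` or `amplitude` plays the role that the abstract attributes to the range of factor coefficient values: it changes how fast and how far the latent code travels along the direction vector.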

How to cite

APA:

Darvish, M., & Kist, A. (2024). A Generative Method for a Laryngeal Biosignal. Journal of Voice. https://dx.doi.org/10.1016/j.jvoice.2024.01.016

MLA:

Darvish, Mahdi, and Andreas Kist. "A Generative Method for a Laryngeal Biosignal." Journal of Voice (2024).
