Multi-Modal Biomarker Extraction Framework for Therapy Monitoring of Social Anxiety and Depression Using Audio and Video

Weise T, Perez Toro PA, Deitermann A, Hoffmann B, Demir K, Straetz T, Nöth E, Maier A, Kallert T, Yang SH (2023)


Publication Language: English

Publication Type: Conference contribution

Publication year: 2023

Publisher: Springer

City/Town: Cham

Pages Range: 26-42

Conference Proceedings Title: International Conference on Machine Learning (Workshop on Machine Learning for Multimodal Healthcare)

Event location: Hawaii Convention Center, 1801 Kalākaua Ave, Honolulu, HI 96815, United States

ISBN: 9783031476785

DOI: 10.1007/978-3-031-47679-2_3

Abstract

This paper introduces a framework for extracting features relevant to monitoring the speech therapy progress of individuals suffering from social anxiety or depression. The framework is multi-modal (decision fusion): it incorporates audio and video recordings of a patient and the corresponding interviewer from two separate test assessment sessions. The data is provided by an ongoing project in a day-hospital and outpatient setting in Germany, whose goal is to investigate whether an established speech therapy group program for adolescents, which is implemented in a stationary and semi-stationary setting, can be carried out successfully via telemedicine. The features proposed in this multi-modal approach could form the basis for interpretation and analysis by medical experts and therapists, complementing data acquired through questionnaires. The extracted audio features focus on prosody (intonation, stress, rhythm, and timing) and on predictions from a deep neural network model inspired by the Pleasure, Arousal, Dominance (PAD) emotional model space. The video features are based on a pipeline designed to visualize the interaction between the patient and the interviewer in terms of Facial Emotion Recognition (FER), using the mini-Xception network architecture.

How to cite

APA:

Weise, T., Perez Toro, P.A., Deitermann, A., Hoffmann, B., Demir, K., Straetz, T., ... Yang, S.H. (2023). Multi-Modal Biomarker Extraction Framework for Therapy Monitoring of Social Anxiety and Depression Using Audio and Video. In Andreas K. Maier, Julia A. Schnabel, Pallavi Tiwari, Oliver Stegle (Eds.), International Conference on Machine Learning (Workshop on Machine Learning for Multimodal Healthcare) (pp. 26-42). Hawaii Convention Center, 1801 Kalākaua Ave, Honolulu, HI 96815, United States. Cham: Springer.

MLA:

Weise, Tobias, et al. "Multi-Modal Biomarker Extraction Framework for Therapy Monitoring of Social Anxiety and Depression Using Audio and Video." Proceedings of the International Conference on Machine Learning (Workshop on Machine Learning for Multimodal Healthcare), Hawaii Convention Center, 1801 Kalākaua Ave, Honolulu, HI 96815, United States. Ed. Andreas K. Maier, Julia A. Schnabel, Pallavi Tiwari, Oliver Stegle. Cham: Springer, 2023. 26-42.
