Harnessing multimodal approaches for depression detection using large language models and facial expressions

Sadeghi M, Richer R, Egger B, Schindler-Gmelch L, Rupp L, Rahimi F, Berking M, Eskofier B (2024)


Publication Language: English

Publication Type: Journal article, Original article

Publication year: 2024

Journal: npj Mental Health Research

Book Volume: 3

Pages Range: 66

Journal Issue: 1

URI: https://www.nature.com/articles/s44184-024-00112-8

DOI: 10.1038/s44184-024-00112-8

Open Access Link: https://www.nature.com/articles/s44184-024-00112-8

Abstract

Detecting depression is a critical component of mental health diagnosis, and accurate assessment is essential for effective treatment. This study introduces a novel, fully automated approach to predicting depression severity using the E-DAIC dataset. We employ Large Language Models (LLMs) to extract depression-related indicators from interview transcripts, utilizing the Patient Health Questionnaire-8 (PHQ-8) score to train the prediction model. Additionally, facial data extracted from video frames is integrated with textual data to create a multimodal model for depression severity prediction. We evaluate three approaches: text-based features, facial features, and a combination of both. Our findings show the best results are achieved by enhancing text data with speech quality assessment, with a mean absolute error of 2.85 and root mean square error of 4.02. This study underscores the potential of automated depression detection, showing text-only models as robust and effective while paving the way for multimodal analysis.
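
As a reading aid for the two error metrics reported in the abstract, the following is a minimal sketch of how mean absolute error (MAE) and root mean square error (RMSE) are computed for predicted PHQ-8 totals. The function names and the toy scores are assumptions made for illustration; they are not the authors' code or data from the study.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: average magnitude of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error: penalizes large errors more heavily than MAE."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


# Toy PHQ-8 totals (valid range 0 to 24); illustrative values only,
# not taken from the E-DAIC dataset or the paper.
true_scores = [4, 10, 17, 2]
predicted_scores = [6, 9, 13, 2]

print(f"MAE:  {mae(true_scores, predicted_scores):.2f}")
print(f"RMSE: {rmse(true_scores, predicted_scores):.2f}")
```

Because RMSE squares each error before averaging, it is never smaller than MAE on the same predictions; a wide gap between the two suggests that a few large errors dominate.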

How to cite

APA:

Sadeghi, M., Richer, R., Egger, B., Schindler-Gmelch, L., Rupp, L., Rahimi, F., ... Eskofier, B. (2024). Harnessing multimodal approaches for depression detection using large language models and facial expressions. npj Mental Health Research, 3(1), 66. https://doi.org/10.1038/s44184-024-00112-8

MLA:

Sadeghi, Misha, et al. "Harnessing multimodal approaches for depression detection using large language models and facial expressions." npj Mental Health Research 3.1 (2024): 66.
