Whispers of Sound: Enhancing Information Extraction from Depression Patients' Unstructured Data Through Audio and Text Emotion Recognition and Llama Fine-tuning

EasyChair Preprint 14991

18 pages
Date: September 21, 2024

Abstract

Mental health issues present significant global challenges, affecting over 20% of adults at some point in their lives. While large language models have shown promise in various fields, their application in mental health remains underexplored. This study assesses how effectively these models can be applied to mental health, using the DAIC-WOZ text datasets and RAVDESS audio datasets. Given the challenges of missing non-verbal cues and ambiguous terms in text data, audio data was incorporated during training to address these gaps. This integration enhanced the models' ability to comprehend, extract, and summarize complex information, particularly in depression assessments. Additionally, technical optimizations, such as increasing the model's max_length to 8192, reduced GPU memory usage by 40%-50% and improved context processing, leading to substantial gains in handling complex mental health data.

Keyphrases: Depression, Llama, audio, fine-tuning, mental health, text

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@booklet{EasyChair:14991,
  author    = {Lin Gan and Xiaoyang Gao and Yifan Huang and Tao Yang},
  title     = {Whispers of Sound: Enhancing Information Extraction from Depression Patients' Unstructured Data Through Audio and Text Emotion Recognition and Llama Fine-tuning},
  howpublished = {EasyChair Preprint 14991},
  year      = {EasyChair, 2024}}