Multimodal transformer for depression detection based on EEG and interview data

Research output: Contribution to journal › Article › Academic › peer-review


Abstract

Depression detection benefits from combining neurological and behavioral indicators, yet integrating heterogeneous modalities such as EEG and interview audio remains challenging. We propose a transformer-based multimodal framework that jointly models spectral, spatial, and temporal EEG features alongside linguistic and paralinguistic cues from interviews. By employing synchronized multi-head cross-attention and self-attention mechanisms, the model effectively captures intra- and inter-modal correlations. In addition, a flexible temporal sequence matching strategy reduces EEG channel requirements, enhancing device portability. Evaluated on the MODMA and DAIC-WOZ datasets, our approach achieves superior performance compared to state-of-the-art models, with a 4.7% improvement in accuracy and a 10% increase in precision. These results demonstrate the potential of the proposed framework for accurate, scalable, and cost-effective depression detection in both clinical and real-world settings.
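The abstract describes fusing EEG and interview-audio token streams with synchronized self-attention (intra-modal) and multi-head cross-attention (inter-modal). The sketch below is only an illustration of that general fusion pattern, not the authors' implementation: the feature dimension, sequence lengths, number of heads, pooling, and classifier head are all assumed values for demonstration.

```python
# Illustrative sketch (assumptions, not the paper's code): self-attention models
# intra-modal structure in each stream; cross-attention lets each modality query
# the other, capturing inter-modal correlations before fusion.
import torch
import torch.nn as nn

class CrossModalBlock(nn.Module):
    def __init__(self, dim=128, heads=4):
        super().__init__()
        # Self-attention within each modality.
        self.self_eeg = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.self_aud = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Cross-attention across modalities.
        self.cross_eeg = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cross_aud = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, eeg, aud):
        eeg = self.norm(eeg + self.self_eeg(eeg, eeg, eeg)[0])
        aud = self.norm(aud + self.self_aud(aud, aud, aud)[0])
        eeg = self.norm(eeg + self.cross_eeg(eeg, aud, aud)[0])  # EEG queries audio
        aud = self.norm(aud + self.cross_aud(aud, eeg, eeg)[0])  # audio queries EEG
        return eeg, aud

class DepressionClassifier(nn.Module):
    def __init__(self, dim=128, heads=4):
        super().__init__()
        self.fuse = CrossModalBlock(dim, heads)
        self.head = nn.Linear(2 * dim, 2)  # binary: depressed vs. control (assumed)

    def forward(self, eeg, aud):
        eeg, aud = self.fuse(eeg, aud)
        # Mean-pool each stream over time, concatenate, and classify.
        fused = torch.cat([eeg.mean(dim=1), aud.mean(dim=1)], dim=-1)
        return self.head(fused)

# Toy usage: 8 samples, 200 EEG tokens and 150 audio tokens, 128-d features.
model = DepressionClassifier()
logits = model(torch.randn(8, 200, 128), torch.randn(8, 150, 128))
print(logits.shape)  # torch.Size([8, 2])
```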
Original language: English
Article number: 109039
Number of pages: 11
Journal: Biomedical Signal Processing and Control
Volume: 113
Issue number: B
Early online date: 5-Nov-2025
DOIs
Publication status: E-pub ahead of print - 5-Nov-2025

Keywords

  • Depression detection
  • EEG
  • Flexible temporal sequence matching
  • Modality synchronization
  • Multimodal transformer
