CONF VILLATORO-TELLO_INTERSPEECH2021_2021/IDIAP Late Fusion of the Available Lexicon and Raw Waveform-based Acoustic Modeling for Depression and Dementia Recognition Villatoro-Tello, Esaú Dubagunta, S. Pavankumar Fritsch, Julian Ramírez-de-la-Rosa, Gabriela Motlicek, Petr Magimai-Doss, Mathew EXTERNAL https://publications.idiap.ch/attachments/papers/2021/VILLATORO-TELLO_INTERSPEECH2021_2021.pdf PUBLIC https://publications.idiap.ch/index.php/publications/showcite/VILLATORO-TELLO_Idiap-RR-09-2021 Related documents Proceedings of Interspeech 2021 2021 ISCA-International Speech Communication Association 2021 Mental disorders, e.g. depression and dementia, are categorized as priority conditions according to the World Health Organization (WHO). When diagnosing, psychologists employ structured questionnaires/interviews and different cognitive tests. Although these instruments are accurate, there is an increasing need to develop digital mental health support technologies to alleviate the burden faced by professionals. In this paper, we propose a multi-modal approach for modeling the communication process employed by patients taking part in a clinical interview or a cognitive test. The language-based modality, inspired by the Lexical Availability (LA) theory from psycho-linguistics, identifies the most accessible vocabulary of the interviewed subject and uses it as features in a classification process. The acoustic-based modality is processed by a Convolutional Neural Network (CNN) trained on speech signals that predominantly contain voice source characteristics. In the end, a late fusion technique, based on majority voting, assigns the final classification. Results show the complementarity of both modalities, reaching an overall Macro-F1 of 84% and 90% for depression and Alzheimer's dementia, respectively.
REPORT VILLATORO-TELLO_Idiap-RR-09-2021/IDIAP Late Fusion of the Available Lexicon and Raw Waveform-based Acoustic Modeling for Depression and Dementia Recognition Villatoro-Tello, Esaú Dubagunta, S. Pavankumar Fritsch, Julian Ramírez-de-la-Rosa, Gabriela Motlicek, Petr Magimai-Doss, Mathew Alzheimer's disease depression detection Mental Lexicon Multi-modal Approach Raw Speech EXTERNAL https://publications.idiap.ch/attachments/reports/2021/VILLATORO-TELLO_Idiap-RR-09-2021.pdf PUBLIC Idiap-RR-09-2021 2021 Idiap July 2021 Paper accepted for publication in Interspeech 2021 Mental disorders, e.g. depression and dementia, are categorized as priority conditions according to the World Health Organization (WHO). When diagnosing, psychologists employ structured questionnaires/interviews and different cognitive tests. Although these instruments are accurate, there is an increasing need to develop digital mental health support technologies to alleviate the burden faced by professionals. In this paper, we propose a multi-modal approach for modeling the communication process employed by patients taking part in a clinical interview or a cognitive test. The language-based modality, inspired by the Lexical Availability (LA) theory from psycho-linguistics, identifies the most accessible vocabulary of the interviewed subject and uses it as features in a classification process. The acoustic-based modality is processed by a Convolutional Neural Network (CNN) trained on speech signals that predominantly contain voice source characteristics. In the end, a late fusion technique, based on majority voting, assigns the final classification. Results show the complementarity of both modalities, reaching an overall Macro-F1 of 84% and 90% for depression and Alzheimer's dementia, respectively.