1.
JMIR Rehabil Assist Technol ; 11: e48129, 2024 Jun 20.
Article in English | MEDLINE | ID: mdl-38901017

ABSTRACT

BACKGROUND: Impaired cognitive function is observed in many pathologies, including neurodegenerative diseases such as Alzheimer disease. At present, the pharmaceutical treatments available to counter cognitive decline have only modest effects and significant side effects. A nonpharmacological treatment that has received considerable attention is computerized cognitive training (CCT), which aims to maintain or improve cognitive functioning through repeated practice on standardized exercises. CCT allows more regular and thorough training of cognitive functions directly at home, which represents a significant opportunity to prevent and counter cognitive decline. However, the presence of assistance during training appears to be an important factor in improving patients' motivation and adherence to treatment. To compensate for the absence of a therapist during at-home CCT, a relevant option is to include a virtual assistant that accompanies patients throughout their training.

OBJECTIVE: The objective of this exploratory study was to evaluate the value of including a virtual assistant to accompany patients during CCT. We investigated the relationship between various individual factors (eg, age, psycho-affective functioning, personality, personal motivations, and cognitive skills) and the appreciation and perceived usefulness of a virtual assistant during CCT. This study is part of the THERADIA (Thérapies Digitales Augmentées par l'Intelligence Artificielle; digital therapies augmented by artificial intelligence) project, which aims to develop an empathetic virtual assistant.

METHODS: A total of 104 participants were recruited, including 52 (50%) young adults (mean age 21.2, range 18 to 27, SD 2.9 years) and 52 (50%) older adults (mean age 67.9, range 60 to 79, SD 5.1 years). All participants were invited to the laboratory to answer several questionnaires and complete 1 CCT session, which consisted of 4 cognitive exercises supervised by a virtual assistant operated by a human pilot using the Wizard of Oz method. Participants evaluated the virtual assistant and the CCT at the end of the session.

RESULTS: Analyses were performed within a Bayesian framework. The results suggest that the virtual assistant was appreciated and perceived as useful during CCT in both age groups. However, older adults rated the assistant and the CCT more positively overall than young adults. Certain user characteristics, especially current affective state (ie, arousal, intrinsic relevance, goal conduciveness, and anxiety state), appeared to be related to the evaluation of the session.

CONCLUSIONS: This study provides, for the first time, insight into how young and older adults perceive a virtual assistant during CCT. The results suggest that such an assistant could have a beneficial influence on users' motivation, provided that it can handle different situations, particularly users' emotional state. The next step of our project will be to evaluate the device with patients experiencing mild cognitive impairment and to test its effectiveness in long-term cognitive training.
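The abstract states only that analyses were carried out within a Bayesian framework and does not specify the models used. Purely as an illustration, the following is a minimal, hypothetical sketch of one such analysis: a Bayesian comparison of mean appreciation ratings between the two age groups, written in Python with PyMC on simulated stand-in data (all values, scales, and variable names are invented and are not the authors' implementation).

```python
import numpy as np
import pymc as pm
import arviz as az

# Hypothetical appreciation ratings (eg, on a 1-7 scale) for the two age groups.
rng = np.random.default_rng(0)
young = rng.normal(5.0, 1.0, size=52)
older = rng.normal(5.6, 1.0, size=52)

with pm.Model():
    mu_young = pm.Normal("mu_young", mu=4.0, sigma=2.0)
    mu_older = pm.Normal("mu_older", mu=4.0, sigma=2.0)
    sigma = pm.HalfNormal("sigma", sigma=2.0)
    pm.Normal("obs_young", mu=mu_young, sigma=sigma, observed=young)
    pm.Normal("obs_older", mu=mu_older, sigma=sigma, observed=older)
    pm.Deterministic("diff", mu_older - mu_young)  # group difference of interest
    idata = pm.sample(1000, tune=1000, chains=2, random_seed=0)

# Posterior summary of the group difference, with a 95% highest-density interval.
print(az.summary(idata, var_names=["diff"], hdi_prob=0.95))
```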

2.
IEEE Trans Pattern Anal Mach Intell ; 43(3): 1022-1040, 2021 03.
Article in English | MEDLINE | ID: mdl-31581074

ABSTRACT

Natural human-computer interaction and audio-visual human behaviour sensing systems that achieve robust performance in the wild are needed more than ever, as digital devices are increasingly becoming an indispensable part of our lives. Accurately annotated real-world data are the crux of devising such systems. However, existing databases usually consider controlled settings, low demographic variability, and a single task. In this paper, we introduce the SEWA database of more than 2,000 minutes of audio-visual data of 398 people from six cultures, 50 percent female, and uniformly spanning the age range of 18 to 65 years. Subjects were recorded in two different contexts: while watching adverts and while discussing those adverts in a video chat. The database includes rich annotations of the recordings in terms of facial landmarks, facial action units (FAU), various vocalisations, mirroring, continuously valued valence, arousal, liking, and agreement, and prototypic examples of (dis)liking. This database is intended to be an extremely valuable resource for researchers in affective computing and automatic human sensing and is expected to push forward research in human behaviour analysis, including cultural studies. Along with the database, we provide extensive baseline experiments for automatic FAU detection and for automatic estimation of valence, arousal, and (dis)liking intensity.


Subject(s)
Algorithms , Emotions , Adolescent , Adult , Aged , Attitude , Databases, Factual , Face , Female , Humans , Middle Aged , Young Adult
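The SEWA baselines estimate continuously valued valence and arousal, but the abstract does not say which evaluation metric the baseline experiments report. The concordance correlation coefficient (CCC) is a commonly used metric for this kind of continuous affect estimation; the sketch below computes it in Python on hypothetical per-frame annotations and predictions (the data and names are invented for illustration only).

```python
import numpy as np

def concordance_cc(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Concordance correlation coefficient between two 1-D series."""
    mean_t, mean_p = y_true.mean(), y_pred.mean()
    var_t, var_p = y_true.var(), y_pred.var()
    cov = ((y_true - mean_t) * (y_pred - mean_p)).mean()
    return 2.0 * cov / (var_t + var_p + (mean_t - mean_p) ** 2)

# Hypothetical per-frame valence annotations and model outputs.
rng = np.random.default_rng(0)
valence_gold = rng.uniform(-1.0, 1.0, size=1000)
valence_pred = valence_gold + rng.normal(scale=0.3, size=1000)
print(f"CCC = {concordance_cc(valence_gold, valence_pred):.3f}")
```

Unlike plain Pearson correlation, CCC also penalises differences in mean and scale between predictions and annotations, which is why it is often preferred for continuous valence and arousal estimation.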
3.
Sci Rep ; 8(1): 14487, 2018 09 27.
Article in English | MEDLINE | ID: mdl-30262838

ABSTRACT

In recent years, applications such as Apple's Siri or Microsoft's Cortana have created the illusion that one can actually "chat" with a machine. However, perfectly natural human-machine interaction remains far from reality, as none of these tools can empathize. This limitation has raised increasing interest in speech emotion recognition systems, which aim to detect the emotional state of the speaker. This capability is relevant to a broad range of domains, from human-machine interfaces to diagnostics. With this in mind, in the present work we explored the possibility of applying a precision approach to the development of a statistical learning algorithm aimed at classifying samples of speech produced by children with developmental disorders (DD) and typically developing (TD) children. Under the assumption that acoustic features of vocal production cannot be used efficiently as a direct marker of DD, we propose to apply the Emotional Modulation function (EMF) concept, rather than running analyses on acoustic features per se, to identify the different classes. The novel paradigm was applied to the French Child Pathological & Emotional Speech Database, obtaining a final accuracy of 0.79, with maximum performance reached in recognizing language impairment (0.92) and autism disorder (0.82).


Subject(s)
Autistic Disorder/psychology , Databases, Factual , Developmental Disabilities/psychology , Emotions , Models, Psychological , Adolescent , Child , Child, Preschool , Female , Humans , Male
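The abstract describes a statistical learning algorithm trained on Emotional Modulation function (EMF) features rather than raw acoustic features, but does not name the classifier or the evaluation protocol. A minimal, hypothetical sketch of such a pipeline, assuming precomputed per-sample EMF feature vectors and a generic scikit-learn classifier with cross-validation, might look as follows (the data, feature dimensionality, and classifier choice are illustrative assumptions, not the authors' method).

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical data: one EMF-derived feature vector per speech sample,
# labelled 0 = typically developing, 1 = developmental disorder.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 24))
y = rng.integers(0, 2, size=200)

# Standardise features, then classify; class_weight balances unequal group sizes.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", class_weight="balanced"))
scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
print(f"mean cross-validated accuracy: {scores.mean():.2f}")
```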
4.
PLoS One ; 11(5): e0154486, 2016.
Article in English | MEDLINE | ID: mdl-27176486

ABSTRACT

We propose a new recognition task in the area of computational paralinguistics: automatic recognition of eating conditions in speech, i.e., whether people are eating while speaking, and what they are eating. To this end, we introduce the audio-visual iHEARu-EAT database, featuring 1.6k utterances of 30 subjects (mean age: 26.1 years, standard deviation: 2.66 years, gender balanced, German speakers), six types of food (Apple, Nectarine, Banana, Haribo Smurfs, Biscuit, and Crisps), and both read and spontaneous speech; the database is made publicly available for research purposes. We begin by demonstrating that, for automatic speech recognition (ASR), it pays off to know whether speakers are eating or not. We also propose automatic classification using both brute-forced low-level acoustic features and higher-level features related to intelligibility, obtained from an automatic speech recogniser. Prediction of the eating condition was performed with a Support Vector Machine (SVM) classifier in a leave-one-speaker-out evaluation framework. Results show that the binary prediction of the eating condition (i.e., eating or not eating) can be solved easily, independently of the speaking condition; the obtained average recalls are all above 90%. Low-level acoustic features provide the best performance on spontaneous speech, reaching up to 62.3% average recall for multi-way classification of the eating condition, i.e., discriminating the six types of food as well as not eating. Early fusion of the intelligibility-related features with the brute-forced acoustic feature set improves performance on read speech, reaching 66.4% average recall for the multi-way classification task. Analysing features and classifier errors leads to a suitable ordinal scale for eating conditions, on which automatic regression can be performed with a coefficient of determination of up to 56.2%.


Subject(s)
Eating/physiology , Food , Hearing/physiology , Speech Recognition Software , Speech/physiology , Adult , Audiovisual Aids , Automation , Databases as Topic , Female , Humans , Male , Regression Analysis , Self Report , Support Vector Machine
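The evaluation protocol described in this abstract, an SVM scored by average recall in a leave-one-speaker-out framework, maps naturally onto standard scikit-learn components. The following sketch reproduces that protocol on hypothetical stand-in data; the feature vectors, labels, speaker ids, and kernel choice are simulated assumptions, not the authors' implementation or the iHEARu-EAT feature set.

```python
import numpy as np
from sklearn.metrics import recall_score
from sklearn.model_selection import LeaveOneGroupOut, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical stand-in data: one acoustic feature vector per utterance,
# a 7-class label (six foods plus "not eating"), and a speaker id per utterance.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 64))
y = rng.integers(0, 7, size=300)
speakers = rng.integers(0, 30, size=300)

# Leave-one-speaker-out: every fold holds out all utterances of one speaker.
logo = LeaveOneGroupOut()
svm = make_pipeline(StandardScaler(), SVC(kernel="linear"))
y_pred = cross_val_predict(svm, X, y, cv=logo, groups=speakers)

# Unweighted average recall (macro recall), the usual score in this line of work.
uar = recall_score(y, y_pred, average="macro")
print(f"UAR over leave-one-speaker-out folds: {uar:.3f}")
```

Grouping the folds by speaker, rather than splitting utterances at random, is what prevents the classifier from exploiting speaker identity instead of the eating condition itself.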