Pesquisa | Portal Regional da BVS

1.

Machine Learning Approaches for Dementia Detection Through Speech and Gait Analysis: A Systematic Literature Review.

Al-Hammadi, Mustafa; Fleyeh, Hasan; Åberg, Anna Cristina; Halvorsen, Kjartan; Thomas, Ilias.

J Alzheimers Dis ; 100(1): 1-27, 2024.

Artigo em Inglês | MEDLINE | ID: mdl-38848181

RESUMO

Background: Dementia is a general term for several progressive neurodegenerative disorders including Alzheimer's disease. Timely and accurate detection is crucial for early intervention. Advancements in artificial intelligence present significant potential for using machine learning to aid in early detection. Objective: Summarize the state-of-the-art machine learning-based approaches for dementia prediction, focusing on non-invasive methods, as the burden on the patients is lower. Specifically, the analysis of gait and speech performance can offer insights into cognitive health through clinically cost-effective screening methods. Methods: A systematic literature review was conducted following the PRISMA protocol (Preferred Reporting Items for Systematic Reviews and Meta-Analyses). The search was performed on three electronic databases (Scopus, Web of Science, and PubMed) to identify the relevant studies published between 2017 to 2022. A total of 40 papers were selected for review. Results: The most common machine learning methods employed were support vector machine followed by deep learning. Studies suggested the use of multimodal approaches as they can provide comprehensive and better prediction performance. Deep learning application in gait studies is still in the early stages as few studies have applied it. Moreover, including features of whole body movement contribute to better classification accuracy. Regarding speech studies, the combination of different parameters (acoustic, linguistic, cognitive testing) produced better results. Conclusions: The review highlights the potential of machine learning, particularly non-invasive approaches, in the early prediction of dementia. The comparable prediction accuracies of manual and automatic speech analysis indicate an imminent fully automated approach for dementia detection.

Assuntos

Demência , Aprendizado de Máquina , Fala , Humanos , Demência/diagnóstico , Fala/fisiologia , Análise da Marcha/métodos

2.

Automatic classification of AD pathology in FTD phenotypes using natural speech.

Cho, Sunghye; Olm, Christopher A; Ash, Sharon; Shellikeri, Sanjana; Agmon, Galit; Cousins, Katheryn A Q; Irwin, David J; Grossman, Murray; Liberman, Mark; Nevler, Naomi.

Alzheimers Dement ; 20(5): 3416-3428, 2024 May.

Artigo em Inglês | MEDLINE | ID: mdl-38572850

RESUMO

INTRODUCTION: Screening for Alzheimer's disease neuropathologic change (ADNC) in individuals with atypical presentations is challenging but essential for clinical management. We trained automatic speech-based classifiers to distinguish frontotemporal dementia (FTD) patients with ADNC from those with frontotemporal lobar degeneration (FTLD). METHODS: We trained automatic classifiers with 99 speech features from 1 minute speech samples of 179 participants (ADNC = 36, FTLD = 60, healthy controls [HC] = 89). Patients' pathology was assigned based on autopsy or cerebrospinal fluid analytes. Structural network-based magnetic resonance imaging analyses identified anatomical correlates of distinct speech features. RESULTS: Our classifier showed 0.88 ± $ \pm $ 0.03 area under the curve (AUC) for ADNC versus FTLD and 0.93 ± $ \pm $ 0.04 AUC for patients versus HC. Noun frequency and pause rate correlated with gray matter volume loss in the limbic and salience networks, respectively. DISCUSSION: Brief naturalistic speech samples can be used for screening FTD patients for underlying ADNC in vivo. This work supports the future development of digital assessment tools for FTD. HIGHLIGHTS: We trained machine learning classifiers for frontotemporal dementia patients using natural speech. We grouped participants by neuropathological diagnosis (autopsy) or cerebrospinal fluid biomarkers. Classifiers well distinguished underlying pathology (Alzheimer's disease vs. frontotemporal lobar degeneration) in patients. We identified important features through an explainable artificial intelligence approach. This work lays the groundwork for a speech-based neuropathology screening tool.

Assuntos

Doença de Alzheimer , Demência Frontotemporal , Imageamento por Ressonância Magnética , Fala , Humanos , Feminino , Doença de Alzheimer/patologia , Masculino , Idoso , Demência Frontotemporal/patologia , Fala/fisiologia , Pessoa de Meia-Idade , Fenótipo , Degeneração Lobar Frontotemporal/patologia , Aprendizado de Máquina

3.

Editorial: Data-driven clinical biosignatures and treatment for neurodegenerative diseases, volume II.

Wang, Nizhuan; Chen, Lei; Kong, Wei; Hsu, Chung Y; Tzeng, I-Shiang.

Front Neurosci ; 18: 1396702, 2024.

Artigo em Inglês | MEDLINE | ID: mdl-38562302

4.

Using Objective Speech Analysis Techniques for the Clinical Diagnosis and Assessment of Speech Disorders in Patients with Multiple Sclerosis.

Sonkaya, Zeynep Z; Özturk, Bilgin; Sonkaya, Riza; Taskiran, Esra; Karadas, Ömer.

Brain Sci ; 14(4)2024 Apr 16.

Artigo em Inglês | MEDLINE | ID: mdl-38672033

RESUMO

Multiple sclerosis (MS) is one of the chronic and neurodegenerative diseases of the central nervous system (CNS). It generally affects motor, sensory, cerebellar, cognitive, and language functions. It is thought that identifying MS speech disorders using quantitative methods will make a significant contribution to physicians in the diagnosis and follow-up of MS patients. In this study, it was aimed to investigate the speech disorders of MS via objective speech analysis techniques. The study was conducted on 20 patients diagnosed with MS according to McDonald's 2017 criteria and 20 healthy volunteers without any speech or voice pathology. Speech data obtained from patients and healthy individuals were analyzed with the PRAAT speech analysis program, and classification algorithms were tested to determine the most effective classifier in separating specific speech features of MS disease. As a result of the study, the K-nearest neighbor algorithm (K-NN) was found to be the most successful classifier (95%) in distinguishing pathological sounds which were seen in MS patients from those in healthy individuals. The findings obtained in our study can be considered as preliminary data to determine the voice characteristics of MS patients.

5.

Machine Learning-Assisted Speech Analysis for Early Detection of Parkinson's Disease: A Study on Speaker Diarization and Classification Techniques.

Di Cesare, Michele Giuseppe; Perpetuini, David; Cardone, Daniela; Merla, Arcangelo.

Sensors (Basel) ; 24(5)2024 Feb 26.

Artigo em Inglês | MEDLINE | ID: mdl-38475034

RESUMO

Parkinson's disease (PD) is a neurodegenerative disorder characterized by a range of motor and non-motor symptoms. One of the notable non-motor symptoms of PD is the presence of vocal disorders, attributed to the underlying pathophysiological changes in the neural control of the laryngeal and vocal tract musculature. From this perspective, the integration of machine learning (ML) techniques in the analysis of speech signals has significantly contributed to the detection and diagnosis of PD. Particularly, MEL Frequency Cepstral Coefficients (MFCCs) and Gammatone Frequency Cepstral Coefficients (GTCCs) are both feature extraction techniques commonly used in the field of speech and audio signal processing that could exhibit great potential for vocal disorder identification. This study presents a novel approach to the early detection of PD through ML applied to speech analysis, leveraging both MFCCs and GTCCs. The recordings contained in the Mobile Device Voice Recordings at King's College London (MDVR-KCL) dataset were used. These recordings were collected from healthy individuals and PD patients while they read a passage and during a spontaneous conversation on the phone. Particularly, the speech data regarding the spontaneous dialogue task were processed through speaker diarization, a technique that partitions an audio stream into homogeneous segments according to speaker identity. The ML applied to MFCCS and GTCCs allowed us to classify PD patients with a test accuracy of 92.3%. This research further demonstrates the potential to employ mobile phones as a non-invasive, cost-effective tool for the early detection of PD, significantly improving patient prognosis and quality of life.

Assuntos

Doença de Parkinson , Fala , Humanos , Doença de Parkinson/diagnóstico , Qualidade de Vida , Aprendizado de Máquina , Músculos Laríngeos

6.

Leveraging Deep Learning for Fine-Grained Categorization of Parkinson's Disease Progression Levels through Analysis of Vocal Acoustic Patterns.

Malekroodi, Hadi Sedigh; Madusanka, Nuwan; Lee, Byeong-Il; Yi, Myunggi.

Bioengineering (Basel) ; 11(3)2024 Mar 21.

Artigo em Inglês | MEDLINE | ID: mdl-38534569

RESUMO

Speech impairments often emerge as one of the primary indicators of Parkinson's disease (PD), albeit not readily apparent in its early stages. While previous studies focused predominantly on binary PD detection, this research explored the use of deep learning models to automatically classify sustained vowel recordings into healthy controls, mild PD, or severe PD based on motor symptom severity scores. Popular convolutional neural network (CNN) architectures, VGG and ResNet, as well as vision transformers, Swin, were fine-tuned on log mel spectrogram image representations of the segmented voice data. Furthermore, the research investigated the effects of audio segment lengths and specific vowel sounds on the performance of these models. The findings indicated that implementing longer segments yielded better performance. The models showed strong capability in distinguishing PD from healthy subjects, achieving over 95% precision. However, reliably discriminating between mild and severe PD cases remained challenging. The VGG16 achieved the best overall classification performance with 91.8% accuracy and the largest area under the ROC curve. Furthermore, focusing analysis on the vowel /u/ could further improve accuracy to 96%. Applying visualization techniques like Grad-CAM also highlighted how CNN models focused on localized spectrogram regions while transformers attended to more widespread patterns. Overall, this work showed the potential of deep learning for non-invasive screening and monitoring of PD progression from voice recordings, but larger multi-class labeled datasets are needed to further improve severity classification.

7.

Exploring the performance of automatic speaker recognition using twin speech and deep learning-based artificial neural networks.

Cavalcanti, Julio Cesar; da Silva, Ronaldo Rodrigues; Eriksson, Anders; Barbosa, Plinio A.

Front Artif Intell ; 7: 1287877, 2024.

Artigo em Inglês | MEDLINE | ID: mdl-38405218

RESUMO

This study assessed the influence of speaker similarity and sample length on the performance of an automatic speaker recognition (ASR) system utilizing the SpeechBrain toolkit. The dataset comprised recordings from 20 male identical twin speakers engaged in spontaneous dialogues and interviews. Performance evaluations involved comparing identical twins, all speakers in the dataset (including twin pairs), and all speakers excluding twin pairs. Speech samples, ranging from 5 to 30 s, underwent assessment based on equal error rates (EER) and Log cost-likelihood ratios (Cllr). Results highlight the substantial challenge posed by identical twins to the ASR system, leading to a decrease in overall speaker recognition accuracy. Furthermore, analyses based on longer speech samples outperformed those using shorter samples. As sample size increased, standard deviation values for both intra and inter-speaker similarity scores decreased, indicating reduced variability in estimating speaker similarity/dissimilarity levels in longer speech stretches compared to shorter ones. The study also uncovered varying degrees of likeness among identical twins, with certain pairs presenting a greater challenge for ASR systems. These outcomes align with prior research and are discussed within the context of relevant literature.

8.

Connected speech features in non-English speakers with Alzheimer's disease: protocol for scoping review.

Bose, Arpita; Ahmed, Samrah; Cheng, Yesi; Suárez-Gonzalez, Aida.

Syst Rev ; 13(1): 40, 2024 01 25.

Artigo em Inglês | MEDLINE | ID: mdl-38273377

RESUMO

BACKGROUND: A large body of literature indicates that connected speech profiles in patients with Alzheimer's disease (AD) can be utilized for diagnosis, disease monitoring, and for developing communication strategies for patients. Most connected speech research has been conducted in English, with little work in some European languages. Therefore, significant drawback remains with respect to the diversity of languages studied, and how the fragmentation of linguistic features differs across languages in AD. Accordingly, existing reviews on connected speech in AD have focused on findings from English-speaking patients; none have specifically focused on the linguistic diversity of AD populations. This scoping review is undertaken to provide the currently reported characteristics of connected speech in AD in languages other than English. It also seeks to identify the type of assessments, methods to elicit speech samples, type of analysis and linguistic frameworks used, and micro- and macro-linguistic features of speech reported in non-English speakers with AD. METHOD: We will conduct a scoping review of published studies that have quantitively assessed connected speech in AD in languages other than English. The inclusion criteria for the studies would be subject/s with a clinical diagnosis of AD. The search will include the electronic databases PubMed, Ovid-Embase, PsycINFO, Linguistic and Language Behaviour Abstracts (LLBA), and Web of Science up until March 2023. Findings will be mapped and described according to the languages studied, the methodology employed (e.g., patient characteristics, tasks used, linguistic analysis framework utilized), and connected speech profiles derived (e.g., micro- and macro-linguistic reported). DISCUSSION: The scoping review will provide an overview of languages studied in connected speech research in AD with variation in linguistic features across languages, thus allowing comparison with the established key features that distinguish AD patients from healthy controls. The findings will inform future research in connected speech in different languages to facilitate robust connected speech research in linguistically and ethnically diverse populations.

Assuntos

Doença de Alzheimer , Fala , Humanos , Idioma , Linguística , Literatura de Revisão como Assunto

9.

Speech timing and monosyllabic diadochokinesis measures in the assessment of amyotrophic lateral sclerosis in Canadian French.

Bouvier, Liziane; McKinlay, Scotia; Truong, Justin; Genge, Angela; Dupré, Nicolas; Dionne, Annie; Kalra, Sanjay; Yunusova, Yana.

Int J Speech Lang Pathol ; 26(2): 267-277, 2024 Apr.

Artigo em Inglês | MEDLINE | ID: mdl-37272348

RESUMO

PURPOSE: The primary objective of this study was to determine if speech and pause measures obtained using a passage reading task and timing measures from a monosyllabic diadochokinesis (DDK) task differ across speakers of Canadian French diagnosed with amyotrophic lateral sclerosis (ALS) presenting with and without bulbar symptoms, and healthy controls. The secondary objective was to determine if these measures can reflect the severity of bulbar symptoms. METHOD: A total of 29 Canadian French speakers with ALS (classified as bulbar symptomatic [n = 14] or pre-symptomatic [n = 15]) and 17 age-matched healthy controls completed a passage reading task and a monosyllabic DDK task (/pa/ and /ta/), for up to three follow-up visits. Measures of speaking rate, total duration, speech duration, and pause events were extracted from the passage reading recordings using a semi-automated speech and pause analysis procedure. Manual analysis of DDK recordings provided measures of DDK rate and variability. RESULT: Group comparisons revealed significant differences (p = < .05) between the symptomatic group and the pre-symptomatic and control groups for all passage measures and DDK rates. Only the DDK rate in /ta/ differentiated the pre-symptomatic and control groups. Repeated measures correlations revealed moderate correlations (rrm = > 0.40; p = < 0.05) between passage measures of total duration, speaking rate, speech duration, and number of pauses, and ALSFRS-R total and bulbar scores, as well as between DDK rate and ALSFRS-R total score. CONCLUSION: Speech and pause measures in passage and timing measures in monosyllabic DDK tasks might be suitable for monitoring bulbar functional symptoms in French speakers with ALS, but more work is required to identify which measures are sensitive to the earliest stages of the disease.

Assuntos

Esclerose Lateral Amiotrófica , Fala , Humanos , Esclerose Lateral Amiotrófica/complicações , Canadá , Medida da Produção da Fala/métodos , Idioma

10.

Predicting the cause of seizures using features extracted from interactions with a virtual agent.

Pevy, Nathan; Christensen, Heidi; Walker, Traci; Reuber, Markus.

Seizure ; 114: 84-89, 2024 Jan.

Artigo em Inglês | MEDLINE | ID: mdl-38091849

RESUMO

OBJECTIVE: A clinical decision tool for Transient Loss of Consciousness (TLOC) could reduce currently high misdiagnosis rates and waiting times for specialist assessments. Most clinical decision tools based on patient-reported symptom inventories only distinguish between two of the three most common causes of TLOC (epilepsy, functional /dissociative seizures, and syncope) or struggle with the particularly challenging differentiation between epilepsy and FDS. Based on previous research describing differences in spoken accounts of epileptic seizures and FDS seizures, this study explored the feasibility of predicting the cause of TLOC by combining the automated analysis of patient-reported symptoms and spoken TLOC descriptions. METHOD: Participants completed an online web application that consisted of a 34-item medical history and symptom questionnaire (iPEP) and spoken interaction with a virtual agent (VA) that asked eight questions about the most recent experience of TLOC. Support Vector Machines (SVM) were trained using different combinations of features and nested leave-one-out cross validation. The iPEP provided a baseline performance. Inspired by previous qualitative research three spoken language based feature sets were designed to assess: (1) formulation effort, (2) the proportion of words from different semantic categories, and (3) verb, adverb, and adjective usage. RESULTS: 76 participants completed the application (Epilepsy = 24, FDS = 36, syncope = 16). Only 61 participants also completed the VA interaction (Epilepsy = 20, FDS = 29, syncope = 12). The iPEP model accurately predicted 65.8 % of all diagnoses, but the inclusion of the language features increased the accuracy to 85.5 % by improving the differential diagnosis between epilepsy and FDS. CONCLUSION: These findings suggest that an automated analysis of TLOC descriptions collected using an online web application and VA could improve the accuracy of current clinical decisions tools for TLOC and facilitate clinical stratification processes (such as ensuring appropriate referral to cardiological versus neurological investigation and management pathways).

Assuntos

Epilepsia , Convulsões , Humanos , Convulsões/diagnóstico , Convulsões/complicações , Síncope/complicações , Inconsciência/diagnóstico , Epilepsia/diagnóstico , Epilepsia/complicações , Inquéritos e Questionários , Diagnóstico Diferencial

11.

A Compreensão da Intersubjetividade Entre um Adolescente Suicida e seus Pais Através da Combinação de Métodos na Análise do Discurso / La Comprensión dela intersubjetividadentre un adolescente suicida y sus padres mediante la combinación de métodos en el análisis del discurso / Understanding Intersubjectivity Between a Suicidal Adolescent and His Parents Through the Combination of Methods in Discourse Analysis

Gomes, Isabel Cristina; Santos, Washington Luiz Afonso Dos.

Subj. procesos cogn. ; 27(2): 161-197, dic. 12, 2023.

Artigo em Português | LILACS, UNISALUD, BINACIS | ID: biblio-1523139

RESUMO

Trata-se de um recorte de pesquisa de doutorado em que se intencionou estudar o vínculo intersubjetivo familiar em um adolescente com histórico de tentativa de suicídio, pautando-se nos desejos e defesas. Utilizou-se a combinação de três instrumentos (ADL-AH, ADL-R e o Genograma) para a análise do discurso. Após exame individual dos dados apontados por cada instrumento, realizou-se asanálises comparativas. Inicialmente, confrontou-se os resultados do ADL-AH com o ADL-R conforme a descrição do caso. Em seguida, o Genograma propiciou o entendimento intergeracional da família estudada, confirmou aspectos já levantados e elucidou pontos dos quais o ADL não poderia ter alcançado, embora esse último tenha encontrado discrepâncias no discurso, isto é, aquilo que se pretendia disfarçar ou esconder de forma consciente ou inconsciente. A combinaçãodos três instrumentos trouxe consistência para a investigação no sentido de uma melhor compreensão da intersubjetividade do adolescente, sua família e do comportamento suicida AU

Este es un extracto de una investigación doctoral que tuvo como objetivo estudiar el vínculo familiar intersubjetivo en un adolescente con antecedentes de intento de suicidio, basado en deseos y defensas. Para el análisis del discurso se utilizó una combinación de tres instrumentos (ADL-AH, ADL-R y Genograma). Luego de examinar individualmente los datos indicados por cada instrumento, se realizaron comparaciones. Inicialmente se confrontaran los resultados del ADL-AH con el ADL-R según la descripción del caso. Luego, el Genograma proporcionó una comprensión intergeneracional de la familia estudiada, confirmó aspectos ya planteados y aclaró puntos que la ADL no podría haber logrado, aunque este último tenga encontrado discrepancias en el discurso, es decir, lo que se pretendía disfrazar u ocultar consciente o inconscientemente. Esta combinación aportó consistencia a la investigación y con ella fue posible tener una mejor comprensión de la intersubjetividad, del adolescente, su familia y la conducta suicida AU

Assuntos

Humanos , Masculino , Adolescente , Tentativa de Suicídio/psicologia , Psicologia do Adolescente , Narrativas Pessoais como Assunto , Terapia Psicanalítica/métodos

12.

Measurement of neuropsychiatric symptoms in the older adults with mild cognitive impairment based on speech and facial expressions: a cross-sectional observational study.

Zhou, Ying; Yao, Xiuyu; Han, Wei; Li, Yingxin; Xue, Jiajun; Li, Zheng.

Aging Ment Health ; : 1-10, 2023 Nov 16.

Artigo em Inglês | MEDLINE | ID: mdl-37970813

RESUMO

OBJECTIVES: To examine the association between speech and facial features with depression, anxiety, and apathy in older adults with mild cognitive impairment (MCI). METHODS: Speech and facial expressions of 319 MCI patients were digitally recorded via audio and video recording software. Three of the most common neuropsychiatric symptoms (NPS) were evaluated by the Public Health Questionnaire, General Anxiety Disorder, and Apathy Evaluation Scale, respectively. Speech and facial features were extracted using the open-source data analysis toolkits. Machine learning techniques were used to validate the diagnostic power of extracted features. RESULTS: Different speech and facial features were associated with specific NPS. Depression was associated with spectral and temporal features, anxiety and apathy with frequency, energy, spectral, and temporal features. Additionally, depression was associated with facial features (action unit, AU) 10, 12, 15, 17, 25, anxiety with AU 10, 15, 17, 25, 26, 45, and apathy with AU 5, 26, 45. Significant differences in speech and facial features were observed between males and females. Based on machine learning models, the highest accuracy for detecting depression, anxiety, and apathy reached 95.8%, 96.1%, and 83.3% for males, and 87.8%, 88.2%, and 88.6% for females, respectively. CONCLUSION: Depression, anxiety, and apathy were characterized by distinct speech and facial features. The machine learning model developed in this study demonstrated good classification in detecting depression, anxiety, and apathy. A combination of audio and video may provide objective methods for the precise classification of these symptoms.

13.

Using Wearable Devices and Speech Data for Personalized Machine Learning in Early Detection of Mental Disorders: Protocol for a Participatory Research Study.

Diaz-Ramos, Ramon E; Noriega, Isabella; Trejo, Luis A; Stroulia, Eleni; Cao, Bo.

JMIR Res Protoc ; 12: e48210, 2023 Nov 13.

Artigo em Inglês | MEDLINE | ID: mdl-37955959

RESUMO

BACKGROUND: Early identification of mental disorder symptoms is crucial for timely treatment and reduction of recurring symptoms and disabilities. A tool to help individuals recognize warning signs is important. We posit that such a tool would have to rely on longitudinal analysis of patterns and trends in the individual's daily activities and mood, which can now be captured through data from wearable activity trackers, speech recordings from mobile devices, and the individual's own description of their mental state. In this paper, we describe such a tool developed by our team to detect early signs of depression, anxiety, and stress. OBJECTIVE: This study aims to examine three questions about the effectiveness of machine learning models constructed based on multimodal data from wearables, speech, and self-reports: (1) How does speech about issues of personal context differ from speech while reading a neutral text, what type of speech data are more helpful in detecting mental health indicators, and how is the quality of the machine learning models influenced by multilanguage data? (2) Does accuracy improve with longitudinal data collection and how, and what are the most important features? and (3) How do personalized machine learning models compare against population-level models? METHODS: We collect longitudinal data to aid machine learning in accurately identifying patterns of mental disorder symptoms. We developed an app that collects voice, physiological, and activity data. Physiological and activity data are provided by a variety of off-the-shelf fitness trackers, that record steps, active minutes, duration of sleeping stages (rapid eye movement, deep, and light sleep), calories consumed, distance walked, heart rate, and speed. We also collect voice recordings of users reading specific texts and answering open-ended questions chosen randomly from a set of questions without repetition. Finally, the app collects users' answers to the Depression, Anxiety, and Stress Scale. The collected data from wearable devices and voice recordings will be used to train machine learning models to predict the levels of anxiety, stress, and depression in participants. RESULTS: The study is ongoing, and data collection will be completed by November 2023. We expect to recruit at least 50 participants attending 2 major universities (in Canada and Mexico) fluent in English or Spanish. The study will include participants aged between 18 and 35 years, with no communication disorders, acute neurological diseases, or history of brain damage. Data collection complied with ethical and privacy requirements. CONCLUSIONS: The study aims to advance personalized machine learning for mental health; generate a data set to predict Depression, Anxiety, and Stress Scale results; and deploy a framework for early detection of depression, anxiety, and stress. Our long-term goal is to develop a noninvasive and objective method for collecting mental health data and promptly detecting mental disorder symptoms. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): DERR1-10.2196/48210.

14.

Validation of an Automated Speech Analysis of Cognitive Tasks within a Semiautomated Phone Assessment.

Ter Huurne, Daphne; Possemis, Nina; Banning, Leonie; Gruters, Angélique; König, Alexandra; Linz, Nicklas; Tröger, Johannes; Langel, Kai; Verhey, Frans; de Vugt, Marjolein; Ramakers, Inez.

Digit Biomark ; 7(1): 115-123, 2023.

Artigo em Inglês | MEDLINE | ID: mdl-37901366

RESUMO

Introduction: We studied the accuracy of the automatic speech recognition (ASR) software by comparing ASR scores with manual scores from a verbal learning test (VLT) and a semantic verbal fluency (SVF) task in a semiautomated phone assessment in a memory clinic population. Furthermore, we examined the differentiating value of these tests between participants with subjective cognitive decline (SCD) and mild cognitive impairment (MCI). We also investigated whether the automatically calculated speech and linguistic features had an additional value compared to the commonly used total scores in a semiautomated phone assessment. Methods: We included 94 participants from the memory clinic of the Maastricht University Medical Center+ (SCD N = 56 and MCI N = 38). The test leader guided the participant through a semiautomated phone assessment. The VLT and SVF were audio recorded and processed via a mobile application. The recall count and speech and linguistic features were automatically extracted. The diagnostic groups were classified by training machine learning classifiers to differentiate SCD and MCI participants. Results: The intraclass correlation for inter-rater reliability between the manual and the ASR total word count was 0.89 (95% CI 0.09-0.97) for the VLT immediate recall, 0.94 (95% CI 0.68-0.98) for the VLT delayed recall, and 0.93 (95% CI 0.56-0.97) for the SVF. The full model including the total word count and speech and linguistic features had an area under the curve of 0.81 and 0.77 for the VLT immediate and delayed recall, respectively, and 0.61 for the SVF. Conclusion: There was a high agreement between the ASR and manual scores, keeping the broad confidence intervals in mind. The phone-based VLT was able to differentiate between SCD and MCI and can have opportunities for clinical trial screening.

15.

Neuronetwork Approach in the Early Diagnosis of Depression.

Astafeva, Darya; Gayduk, Arseny; Tavormina, Giuseppe; Syunyakov, Timur; Chigareva, Oxana; Bikbaeva, Kseniya; Markina, Ekaterina; Vlasov, Andrei; Yashikhina, Anna; Zhovnerchuk, Evgeny; Kolsanov, Aleksandr; Smirnova, Daria.

Psychiatr Danub ; 35(Suppl 2): 77-85, 2023 Oct.

Artigo em Inglês | MEDLINE | ID: mdl-37800207

RESUMO

BACKGROUND: Depression is a common mental illness, with around 280 million people suffering from depression worldwide. At present, the main way to quantify the severity of depression is through psychometric scales, which entail subjectivity on the part of both patient and clinician. In the last few years, deep (machine) learning is emerging as a more objective approach for measuring depression severity. We now investigate how neural networks might serve for the early diagnosis of depression. SUBJECTS AND METHODS: We searched Medline (Pubmed) for articles published up to June 1, 2023. The search term included Depression AND Diagnostics AND Artificial Intelligence. We did not search for depression studies of machine learning other than neural networks, and selected only those papers attesting to diagnosis or screening for depression. RESULTS: Fifty-four papers met our criteria, among which 14 using facial expression recordings, 14 using EEG, 5 using fMRI, and 5 using audio speech recording analysis, whereas 6 used multimodality approach, two were the text analysis studies, and 8 used other methods. CONCLUSIONS: Research methodologies include both audio and video recordings of clinical interviews, task performance, including their subsequent conversion into text, and resting state studies (EEG, MRI, fMRI). Convolutional neural networks (CNN), including 3D-CNN and 2D-CNN, can obtain diagnostic data from the videos of the facial area. Deep learning in relation to EEG signals is the most commonly used CNN. fMRI approaches use graph convolutional networks and 3D-CNN with voxel connectivity, whereas the text analyses use CNNs, including LSTM (long/short-term memory). Audio recordings are analyzed by a hybrid CNN and support vector machine model. Neural networks are used to analyze biomaterials, gait, polysomnography, ECG, data from wrist wearable devices, and present illness history records. Multimodality studies analyze the fusion of audio features with visual and textual features using LSTM and CNN architectures, a temporal convolutional network, or a recurrent neural network. The accuracy of different hybrid and multimodality models is 78-99%, relative to the standard clinical diagnoses.

Assuntos

Inteligência Artificial , Depressão , Humanos , Depressão/diagnóstico , Redes Neurais de Computação , Aprendizado de Máquina , Diagnóstico Precoce

16.

A Study on Speech Analysis in Acquired Maxillary Defect Patients Treated with Maxillary Obturator.

Bhushan, Purnendu; Raj, Kavita; Hota, Sadananda; Mishra, Debasish; Raut, Anjana; Mohanty, Arun K.

J Pharm Bioallied Sci ; 15(Suppl 1): S467-S470, 2023 Jul.

Artigo em Inglês | MEDLINE | ID: mdl-37654416

RESUMO

Aim: The aim of the present study was to assess the speech analysis in acquired maxillary defect patients treated with maxillary obturator. Materials and Methods: Total of 16 patients were considered in the study. The age group of these patients ranged from 40 to 75 years with a mean age of 59.5 years, irrespective of their gender. The surgical obturator was fabricated using self-cure acrylic. The surgical obturator was delivered immediately after surgery. After a healing period of about 2 weeks, the surgical obturator was replaced by an interim prosthesis. This was processed with the help of heat-cure polymethylmethacrylate. The total number of patients was divided into two groups, namely, (A) Definitive obturator group and (B) Interim obturator group. The speech intelligibility (SI), was analyzed. Results: The mean scores for SI before prosthesis in definitive and interim groups were 19.13 ± 3.22 and 19.87 ± 1.72, respectively. This was increased after prosthesis insertion to 24.38 ± 1.30 and 22.37 ± 1.18, which further increased after adaptation period of 2 months to 28.75 ± 1.28 and 24.62 ± 1.59 in two groups. Conclusion: The present study concluded that speech was severely affected by maxillary resection and that rehabilitation with maxillary obturator was successful in restoring these aspects of speech.

17.

ADscreen: A speech processing-based screening system for automatic identification of patients with Alzheimer's disease and related dementia.

Zolnoori, Maryam; Zolnour, Ali; Topaz, Maxim.

Artif Intell Med ; 143: 102624, 2023 09.

Artigo em Inglês | MEDLINE | ID: mdl-37673583

RESUMO

Alzheimer's disease and related dementias (ADRD) present a looming public health crisis, affecting roughly 5 million people and 11 % of older adults in the United States. Despite nationwide efforts for timely diagnosis of patients with ADRD, >50 % of them are not diagnosed and unaware of their disease. To address this challenge, we developed ADscreen, an innovative speech-processing based ADRD screening algorithm for the protective identification of patients with ADRD. ADscreen consists of five major components: (i) noise reduction for reducing background noises from the audio-recorded patient speech, (ii) modeling the patient's ability in phonetic motor planning using acoustic parameters of the patient's voice, (iii) modeling the patient's ability in semantic and syntactic levels of language organization using linguistic parameters of the patient speech, (iv) extracting vocal and semantic psycholinguistic cues from the patient speech, and (v) building and evaluating the screening algorithm. To identify important speech parameters (features) associated with ADRD, we used the Joint Mutual Information Maximization (JMIM), an effective feature selection method for high dimensional, small sample size datasets. Modeling the relationship between speech parameters and the outcome variable (presence/absence of ADRD) was conducted using three different machine learning (ML) architectures with the capability of joining informative acoustic and linguistic with contextual word embedding vectors obtained from the DistilBERT (Bidirectional Encoder Representations from Transformers). We evaluated the performance of the ADscreen on an audio-recorded patients' speech (verbal description) for the Cookie-Theft picture description task, which is publicly available in the dementia databank. The joint fusion of acoustic and linguistic parameters with contextual word embedding vectors of DistilBERT achieved F1-score = 84.64 (standard deviation [std] = ±3.58) and AUC-ROC = 92.53 (std = ±3.34) for training dataset, and F1-score = 89.55 and AUC-ROC = 93.89 for the test dataset. In summary, ADscreen has a strong potential to be integrated with clinical workflow to address the need for an ADRD screening tool so that patients with cognitive impairment can receive appropriate and timely care.

Assuntos

Doença de Alzheimer , Programas de Rastreamento , Idoso , Humanos , Acústica , Doença de Alzheimer/diagnóstico , Doença de Alzheimer/prevenção & controle , Linguística , Fala , Programas de Rastreamento/métodos

18.

Dissociating memory and executive function impairment through temporal features in a word list verbal learning task.

Dörr, Felix; Schäfer, Simona; Öhman, Fredrik; Linz, Nicklas; Bodin, Timothy Hadarsson; Skoog, Johan; Zettergren, Anna; Kern, Silke; Skoog, Ingmar; Tröger, Johannes.

Neuropsychologia ; 189: 108679, 2023 Oct 10.

Artigo em Inglês | MEDLINE | ID: mdl-37683887

RESUMO

The Rey Auditory Verbal Learning Test (RAVLT) is an established verbal learning test commonly used to quantify memory impairments due to Alzheimer's Disease (AD) both at a clinical dementia stage or prodromal stage of mild cognitive impairment (MCI). Focal memory impairment-as quantified e.g. by the RAVLT-at an MCI stage is referred to as amnestic MCI (aMCI) and is often regarded as the cognitive phenotype of prodromal AD. However, recent findings suggest that not only learning and memory but also other cognitive domains, especially executive functions (EF) and processing speed (PS), influence verbal learning performance. This research investigates whether additional temporal features extracted from audio recordings from a participant's RAVLT response can better dissociate memory and EF in such tasks and eventually help to better describe MCI subtypes. 675 age-matched participants from the H70 Swedish birth cohort were included in this analysis; 68 participants were classified as MCI (33 aMCI and 35 due to executive impairment). RAVLT performances were recorded and temporal features extracted. Novel temporal features were correlated with established neuropsychological tests measuring EF and PS. Lastly, the downstream diagnostic potential of temporal features was estimated using group differences and a machine learning (ML) classification scenario. Temporal features correlated moderately with measures of EF and PS. Performance of an ML classifier could be improved by adding temporal features to traditional counts. We conclude that RAVLT temporal features are in general related to EF and that they might be capable of dissociating memory and EF in a word list learning task.

19.

Evaluating the clinical utility of speech analysis and machine learning in schizophrenia: A pilot study.

Huang, Jie; Zhao, Yanli; Tian, Zhanxiao; Qu, Wei; Du, Xia; Zhang, Jie; Tan, Yunlong; Wang, Zhiren; Tan, Shuping.

Comput Biol Med ; 164: 107359, 2023 09.

Artigo em Inglês | MEDLINE | ID: mdl-37591160

RESUMO

BACKGROUND: Schizophrenia is a serious mental disorder that significantly impacts social functioning and quality of life. However, current diagnostic methods lack objective biomarker support. While some studies have indicated differences in audio features between patients with schizophrenia and healthy controls, these findings are influenced by demographic information and variations in experimental paradigms. Therefore, it is crucial to explore stable and reliable audio biomarkers for an auxiliary diagnosis and disease severity prediction of schizophrenia. METHOD: A total of 130 individuals (65 patients with schizophrenia and 65 healthy controls) read three fixed texts containing positive, neutral, and negative emotions, and recorded them. All audio signals were preprocessed and acoustic features were extracted by a librosa-0.9.2 toolkit. Independent sample t-tests were performed on two sets of acoustic features, and Pearson correlation on the acoustic features and Positive and Negative Syndrome Scale (PANSS) scores of the schizophrenia group. Classification algorithms in scikit-learn were used to diagnose schizophrenia and predict the level of negative symptoms. RESULTS: Significant differences were observed between the two groups in the mfcc_8, mfcc_11, and mfcc_33 of mel-frequency cepstral coefficient (MFCC). Furthermore, a significant correlation was found between mfcc_7 and the negative PANSS scores. Through acoustic features, we could not only differentiate patients with schizophrenia from healthy controls with an accuracy of 0.815 but also predict the grade of the negative symptoms in schizophrenia with an average accuracy of 0.691. CONCLUSIONS: The results demonstrated the considerable potential of acoustic characteristics as reliable biomarkers for diagnosing schizophrenia and predicting clinical symptoms.

Assuntos

Qualidade de Vida , Esquizofrenia , Humanos , Projetos Piloto , Esquizofrenia/diagnóstico , Fala , Aprendizado de Máquina

20.

Validation of natural language processing methods capturing semantic incoherence in the speech of patients with non-affective psychosis.

Just, Sandra Anna; Bröcker, Anna-Lena; Ryazanskaya, Galina; Nenchev, Ivan; Schneider, Maria; Bermpohl, Felix; Heinz, Andreas; Montag, Christiane.

Front Psychiatry ; 14: 1208856, 2023.

Artigo em Inglês | MEDLINE | ID: mdl-37564246

RESUMO

Background: Impairments in speech production are a core symptom of non-affective psychosis (NAP). While traditional clinical ratings of patients' speech involve a subjective human factor, modern methods of natural language processing (NLP) promise an automatic and objective way of analyzing patients' speech. This study aimed to validate NLP methods for analyzing speech production in NAP patients. Methods: Speech samples from patients with a diagnosis of schizophrenia or schizoaffective disorder were obtained at two measurement points, 6 months apart. Out of N = 71 patients at T1, speech samples were also available for N = 54 patients at T2. Global and local models of semantic coherence as well as different word embeddings (word2vec vs. GloVe) were applied to the transcribed speech samples. They were tested and compared regarding their correlation with clinical ratings and external criteria from cross-sectional and longitudinal measurements. Results: Results did not show differences for global vs. local coherence models and found more significant correlations between word2vec models and clinically relevant outcome variables than for GloVe models. Exploratory analysis of longitudinal data did not yield significant correlation with coherence scores. Conclusion: These results indicate that natural language processing methods need to be critically validated in more studies and carefully selected before clinical application.

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

RESUMO

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA