Results 1 - 3 of 3
1.
JMIR Res Protoc; 12: e51912, 2023 Oct 23.
Article in English | MEDLINE | ID: mdl-37870890

ABSTRACT

BACKGROUND: Providing psychotherapy, particularly for youth, is a pressing challenge in the health care system. Traditional methods are resource-intensive, and there is a need for objective benchmarks to guide therapeutic interventions. Automated emotion detection from speech, using artificial intelligence, presents an emerging approach to address these challenges. Speech can carry vital information about emotional states, which can be used to improve mental health care services, especially for those who are suffering. OBJECTIVE: This study aims to develop and evaluate automated methods for detecting the intensity of emotions (anger, fear, sadness, and happiness) in audio recordings of patients' speech. We also demonstrate the viability of deploying the models. Our model was validated in a previous publication by Alemu et al with limited voice samples; this follow-up study used significantly more voice samples to validate the previous model. METHODS: We used audio recordings of patients, specifically children with high adverse childhood experience (ACE) scores (an average ACE score of 5 or higher), placing them at the highest risk for chronic disease and social or emotional problems; by comparison, only 1 in 6 people in the general population has a score of 4 or above. Structured voice samples were collected by having each patient read a fixed script. In total, 4 highly trained therapists classified audio segments, scoring the intensity level of each of the 4 emotions. We experimented with various preprocessing methods, including denoising, voice-activity detection, and diarization. Additionally, we explored various model architectures, including convolutional neural networks (CNNs) and transformers. We trained emotion-specific transformer-based models and a generalized CNN-based model to predict emotion intensities. RESULTS: The emotion-specific transformer-based model achieved a test-set precision and recall of 86% and 79%, respectively, for binary emotional intensity classification (high or low). In contrast, the CNN-based model, generalized to predict the intensity of 4 different emotions, achieved a test-set precision and recall of 83% each. CONCLUSIONS: Automated emotion detection from patients' speech using artificial intelligence models was found to be feasible, achieving a high level of accuracy. The transformer-based models performed better at emotion-specific detection, while the CNN-based model showed promise for generalized emotion detection. These models can serve as valuable decision-support tools for pediatricians and mental health providers in triaging youth to appropriate levels of mental health care services. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): RR1-10.2196/51912.
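
For illustration, the following is a minimal sketch (not the authors' published code) of an emotion-specific, transformer-based binary intensity classifier of the kind this abstract describes. It assumes a pretrained wav2vec 2.0 backbone from Hugging Face Transformers and PyTorch; the backbone name, mean pooling, and the small classification head are illustrative assumptions.

import torch
import torch.nn as nn
from transformers import Wav2Vec2Model, Wav2Vec2FeatureExtractor

class EmotionIntensityClassifier(nn.Module):
    """Binary high/low intensity head on top of a frozen speech transformer."""

    def __init__(self, backbone_name: str = "facebook/wav2vec2-base"):
        super().__init__()
        self.backbone = Wav2Vec2Model.from_pretrained(backbone_name)
        self.backbone.requires_grad_(False)  # freeze backbone; train only the head
        hidden = self.backbone.config.hidden_size
        self.head = nn.Sequential(nn.Linear(hidden, 64), nn.ReLU(), nn.Linear(64, 2))

    def forward(self, input_values: torch.Tensor) -> torch.Tensor:
        # input_values: (batch, samples) of 16 kHz mono audio
        states = self.backbone(input_values).last_hidden_state  # (B, T, H)
        pooled = states.mean(dim=1)  # average over time frames
        return self.head(pooled)     # logits for low/high intensity

extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base")
model = EmotionIntensityClassifier()
# One such model would be trained per emotion (anger, fear, sadness, happiness).
# Usage (illustrative), given `audio` as a 16 kHz mono numpy array:
#   inputs = extractor(audio, sampling_rate=16000, return_tensors="pt")
#   logits = model(inputs.input_values)  # shape (1, 2)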

2.
JMIR Res Protoc; 12: e46970, 2023 Jun 23.
Article in English | MEDLINE | ID: mdl-37351936

ABSTRACT

BACKGROUND: Even before the onset of the COVID-19 pandemic, children and adolescents were experiencing a mental health crisis, partly due to a lack of quality mental health services. The rate of suicide among Black youth has increased by 80%. By 2025, the health care system will be short 225,000 therapists, further exacerbating the current crisis. It is therefore of utmost importance for schools, youth mental health providers, and pediatric medical providers to integrate digital mental health innovations that identify problems proactively and rapidly, enabling effective collaboration with other health care providers. Such approaches can help identify robust, reproducible, and generalizable predictors and digital biomarkers of treatment response in psychiatry. Among the many current digital innovations aimed at identifying biomarkers of psychiatric disease, part of the macrolevel digital health transformation, speech stands out as an attractive candidate: it is affordable, noninvasive, and nonintrusive. OBJECTIVE: This protocol aims to develop speech-emotion recognition algorithms, leveraging artificial intelligence and machine learning, that can establish a link between trauma, stress, and voice types, including disruptions in speech-based characteristics, and detect clinically relevant emotional distress and functional impairments in children and adolescents. METHODS: Informed by theoretical foundations (the Theory of Psychological Trauma Biomarkers and Archetypal Voice Categories), we developed our methodology to focus on 5 emotions: anger, happiness, fear, neutral, and sadness. Participants will be recruited from 2 local mental health centers that serve urban youths. Speech samples, along with responses to the Symptom and Functioning Severity Scale, Patient Health Questionnaire-9, and Adverse Childhood Experiences scales, will be collected using an Android mobile app. Our model development pipeline draws on Gaussian mixture models (GMMs), recurrent neural networks, and long short-term memory networks. RESULTS: We tested our model with a public data set. The GMM with 128 clusters showed evenly distributed accuracy across all 5 emotions. Using utterance-level features, the GMM achieved an overall accuracy of 79.15%, while frame selection increased accuracy to 85.35%. This demonstrates that the GMM is a robust model for classifying all 5 emotions and that emotion frame selection enhances accuracy, which is significant for scientific evaluation. Recruitment and data collection for the study were initiated in August 2021 and are currently underway. The study results are expected to be published in 2024. CONCLUSIONS: This study contributes to the literature by addressing the need for speech-focused digital health tools that detect clinically relevant emotional distress and functional impairments in children and adolescents. The preliminary results show that our algorithm has the potential to improve outcomes. The findings will contribute to the broader digital health transformation. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): DERR1-10.2196/46970.
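
As a rough illustration of the modeling approach described above, here is a minimal sketch (not the protocol's implementation) of GMM-based emotion classification with one 128-component mixture per emotion and a simple frame selection step. The MFCC features, the energy-based frame selection heuristic, and the per-emotion training data are assumptions; the protocol does not specify them.

import numpy as np
import librosa
from sklearn.mixture import GaussianMixture

def mfcc_frames(y: np.ndarray, sr: int) -> np.ndarray:
    """Frame-level MFCC features, shape (n_frames, n_mfcc)."""
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).T

def select_frames(feats: np.ndarray, y: np.ndarray, keep: float = 0.6) -> np.ndarray:
    """Keep the highest-energy frames (a crude stand-in for emotion frame selection)."""
    rms = librosa.feature.rms(y=y)[0][: len(feats)]
    idx = np.argsort(rms)[-int(len(feats) * keep):]
    return feats[idx]

# One GMM per emotion, trained on pooled frames from that emotion's recordings.
emotions = ["anger", "happiness", "fear", "neutral", "sadness"]
models = {e: GaussianMixture(n_components=128, covariance_type="diag") for e in emotions}
# models[e].fit(training_frames[e])  # training_frames is assumed per-emotion data

def classify(y: np.ndarray, sr: int) -> str:
    feats = select_frames(mfcc_frames(y, sr), y)
    # Score an utterance by its average frame log-likelihood under each GMM.
    return max(emotions, key=lambda e: models[e].score(feats))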

3.
Autism Res; 12(4): 628-635, 2019 Apr.
Article in English | MEDLINE | ID: mdl-30638310

ABSTRACT

The LENA system was designed and validated to provide information about the language environment of children 0 to 4 years of age, and its use has been expanded to populations with a range of communication profiles. Its utility in children 5 years of age and older is not yet known. The present study used acoustic data from two samples of children with autism spectrum disorder (ASD) to evaluate the reliability of LENA's automated analyses for detecting speech utterances in older, school-age children and adolescents with ASD, in clinic and home environments. Participants between 5 and 18 years old who were minimally verbal (study 1) or had a range of verbal abilities (study 2) completed standardized assessments in the clinic (studies 1 and 2) and in the home (study 2) while speech was recorded with a LENA device. We compared LENA segment labels with ground-truth coding by human transcribers using two different methods. We found that the automated LENA algorithms were not successful (<50% reliable) in detecting vocalizations from older children and adolescents with ASD, and that the proportion of speaker misclassifications by the automated system increased significantly with the target child's age. These findings suggest that results may be misleading when the use of LENA is expanded beyond the age range for which it was developed, and they highlight the need for novel automated methods that are more appropriate for older children. Autism Research 2019, 12: 628-635. © 2019 International Society for Autism Research, Wiley Periodicals, Inc. LAY SUMMARY: Current commercially available speech detection algorithms (the LENA system) were previously validated in toddlers and children up to 48 months of age, and it is not known whether they are reliable in older children and adolescents. Our data suggest that LENA does not adequately capture speech in school-age children and adolescents with autism and highlight the need to develop new automated methods for older children.
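
To make the reliability comparison concrete, here is a minimal sketch (not the study's analysis code) of how automated speaker labels might be compared against human ground-truth codes per segment, and how misclassification might be related to the target child's age. The input file and its column names are hypothetical.

import pandas as pd
from scipy.stats import pearsonr

# One row per audio segment, with hypothetical columns:
# "child_id", "age", "lena_label", "human_label"
segments = pd.read_csv("coded_segments.csv")

# Per-child reliability: proportion of segments where the labels agree.
per_child = (
    segments.assign(agree=segments["lena_label"] == segments["human_label"])
    .groupby("child_id")
    .agg(reliability=("agree", "mean"), age=("age", "first"))
)

# Does the misclassification rate rise with the target child's age?
r, p = pearsonr(per_child["age"], 1 - per_child["reliability"])
print(f"overall reliability: {per_child['reliability'].mean():.2f}")
print(f"age vs misclassification: r={r:.2f}, p={p:.3f}")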


Subject(s)
Algorithms; Autism Spectrum Disorder/complications; Autism Spectrum Disorder/physiopathology; Language Development Disorders/complications; Language Development Disorders/diagnosis; Speech/physiology; Adolescent; Child; Child, Preschool; Female; Humans; Language Development Disorders/physiopathology; Male; Reproducibility of Results; Software