Results 1 - 20 of 74
1.
J Speech Lang Hear Res ; : 1-7, 2024 Jun 05.
Article in English | MEDLINE | ID: mdl-38838248

ABSTRACT

OBJECTIVE: This research note advocates for a methodological shift in clinical speech analytics, emphasizing the transition from high-dimensional speech feature representations to clinically validated speech measures designed to operationalize clinically relevant constructs of interest. The aim is to enhance model generalizability and clinical applicability in real-world settings. METHOD: We outline the challenges of using conventional supervised machine learning models in clinical speech analytics, particularly their limited generalizability and interpretability. We propose a new framework focusing on speech measures that are closely tied to specific speech constructs and have undergone rigorous validation. This research note discusses a case study involving the development of a measure for articulatory precision in amyotrophic lateral sclerosis (ALS), detailing the process from ideation through Food and Drug Administration (FDA) Breakthrough Device designation. RESULTS: The case study demonstrates how the operationalization of the articulatory precision construct into a quantifiable measure yields robust, clinically meaningful results. The measure's validation followed the V3 framework (verification, analytical validation, and clinical validation), showing high correlation with clinical status and speech intelligibility. The practical application of these measures is exemplified in a clinical trial and designation by the FDA as a Breakthrough Device, underscoring their real-world impact. CONCLUSIONS: Transitioning from speech features to speech measures offers a more targeted approach for developing speech analytics tools in clinical settings. This shift ensures that models are not only technically sound but also clinically relevant and interpretable, thereby bridging the gap between laboratory research and practical health care applications. We encourage further exploration and adoption of this approach for developing interpretable speech representations tailored to specific clinical needs.

2.
J Speech Lang Hear Res ; : 1-24, 2024 Jun 26.
Article in English | MEDLINE | ID: mdl-38924389

ABSTRACT

PURPOSE: This study explores speech motor planning in adults who stutter (AWS) and adults who do not stutter (ANS) by applying machine learning algorithms to electroencephalographic (EEG) signals. We developed a technique to holistically examine neural activity differences between speaking and silent reading conditions across the entire cortical surface. This approach allows us to test the hypothesis that AWS will exhibit lower separability of the speech motor planning condition. METHOD: We used the silent reading condition as a control condition to isolate speech motor planning activity. We classified EEG signals from AWS and ANS individuals into speaking and silent reading categories using kernel support vector machines. We used the relative complexities of the learned classifiers to compare the discernibility of speech motor planning between the two groups. RESULTS: Classifiers for the AWS group required a more complex decision boundary to separate the speech motor planning and silent reading classes. CONCLUSIONS: These findings indicate that the EEG signals associated with speech motor planning are less discernible in AWS, which may result from altered neuronal dynamics in AWS. Our results support the hypothesis that AWS exhibit lower inherent separability of the silent reading and speech motor planning conditions. Further investigation may identify and compare the features leveraged for speech motor classification in AWS and ANS. These observations may have clinical value for developing novel speech therapies or assistive devices for AWS.
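The abstract does not specify the implementation, but the core comparison it describes can be sketched as follows: train a kernel SVM per group to separate speaking from silent-reading EEG epochs, then compare a complexity proxy between groups. The support-vector fraction used here is an assumed proxy, not necessarily the authors' complexity measure; all data and names are placeholders.

```python
# Hypothetical sketch: compare decision-boundary complexity of kernel SVMs
# trained to separate speaking vs. silent-reading EEG epochs for two groups.
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

def boundary_complexity(X, y):
    """Fit an RBF-kernel SVM and return the fraction of training points
    retained as support vectors, a rough proxy for how complex a boundary
    is needed to separate the two conditions."""
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
    clf.fit(X, y)
    n_sv = clf.named_steps["svc"].n_support_.sum()
    return n_sv / len(y)

rng = np.random.default_rng(0)
X_aws = rng.normal(size=(200, 64))   # placeholder AWS epochs x features
y_aws = rng.integers(0, 2, 200)      # 0 = silent reading, 1 = speaking
X_ans = rng.normal(size=(200, 64))
y_ans = rng.integers(0, 2, 200)

print("AWS complexity:", boundary_complexity(X_aws, y_aws))
print("ANS complexity:", boundary_complexity(X_ans, y_ans))
```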

3.
Article in English | MEDLINE | ID: mdl-38932502

ABSTRACT

Objective: Although studies have shown that digital measures of speech detect ALS speech impairment and correlate with the ALSFRS-R speech item, no study has yet compared their performance in detecting speech changes. In this study, we compared the performance of the ALSFRS-R speech item and an algorithmic speech measure in detecting clinically important changes in speech. Importantly, the study was part of an FDA submission that received Breakthrough Device designation for monitoring ALS; we provide this paper as a roadmap for validating other speech measures for monitoring disease progression. Methods: We obtained ALSFRS-R speech subscores and speech samples from participants with ALS. We computed the minimum detectable change (MDC) of both measures; using clinician-reported listener effort and perceptual ratings of severity, we calculated the minimal clinically important difference (MCID) of each measure with respect to both sets of clinical ratings. Results: For articulatory precision, the MDC (.85) was lower than both MCID measures (2.74 and 2.28), and for the ALSFRS-R speech item, the MDC (.86) was greater than both MCID measures (.82 and .72), indicating that while the articulatory precision measure detected minimal clinically important differences in speech, the ALSFRS-R speech item did not. Conclusion: The results demonstrate that the digital measure of articulatory precision effectively detects clinically important differences in speech ratings, outperforming the ALSFRS-R speech item. Taken together, these results suggest that this speech outcome is a clinically meaningful measure of speech change.
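For readers who want to reproduce this style of analysis, the standard distribution-based MDC formula and one common anchor-based MCID are sketched below. The abstract does not state which MCID procedure the authors used, so the anchor-based choice here is an assumption.

```python
# Illustrative computation of a distribution-based minimum detectable change
# (MDC95) and a common anchor-based MCID.
import numpy as np

def mdc95(baseline_scores, icc):
    """MDC95 = 1.96 * sqrt(2) * SEM, with SEM = SD * sqrt(1 - ICC)."""
    sem = np.std(baseline_scores, ddof=1) * np.sqrt(1.0 - icc)
    return 1.96 * np.sqrt(2.0) * sem

def anchor_mcid(change_scores, anchor_labels):
    """Mean change among participants whom the clinical anchor flags as
    'minimally changed' (label 1), one common anchor-based MCID."""
    change_scores = np.asarray(change_scores)
    anchor_labels = np.asarray(anchor_labels)
    return change_scores[anchor_labels == 1].mean()

# A measure can detect clinically important change when its MDC falls
# below its MCID, as reported above for articulatory precision.
```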

4.
Sci Rep ; 13(1): 20224, 2023 11 18.
Article in English | MEDLINE | ID: mdl-37980431

ABSTRACT

Cigna's online stress management toolkit includes an AI-based tool that purports to evaluate a person's psychological stress level based on analysis of their speech, the Cigna StressWaves Test (CSWT). In this study, we evaluate the claim that the CSWT is a "clinical grade" tool via an independent validation. The results suggest that the CSWT is not repeatable and has poor convergent validity; the public availability of the CSWT despite insufficient validation data highlights concerns regarding premature deployment of digital health tools for stress and anxiety management.


Subjects
Artificial Intelligence, Speech, Humans, Reproducibility of Results
5.
Article in English | MEDLINE | ID: mdl-37899766

ABSTRACT

Approximately 1.2% of the world's population has impaired voice production. As a result, automatic dysphonic voice detection has attracted considerable academic and clinical interest. However, existing methods for automated voice assessment often fail to generalize outside the training conditions or to other related applications. In this paper, we propose a deep learning framework for generating acoustic feature embeddings sensitive to vocal quality and robust across different corpora. A contrastive loss is combined with a classification loss to train our deep learning model jointly. Data warping methods are used on input voice samples to improve the robustness of our method. Empirical results demonstrate that our method not only achieves high in-corpus and cross-corpus classification accuracy but also generates good embeddings sensitive to voice quality and robust across different corpora. We also compare our results against three baseline methods on clean and three variations of deteriorated in-corpus and cross-corpus datasets and demonstrate that the proposed model consistently outperforms the baseline methods.
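As a rough illustration of the joint objective described (the exact architecture, margin, and loss weighting are not given in the abstract and are assumptions here), a pairwise contrastive term can be combined with cross-entropy in PyTorch along these lines:

```python
# Hedged sketch of a joint training objective: classification loss plus a
# pairwise contrastive loss on embedding pairs.
import torch
import torch.nn.functional as F

def contrastive_loss(emb1, emb2, same_label, margin=1.0):
    """Classic pairwise contrastive loss: pull same-class embeddings
    together, push different-class embeddings at least `margin` apart."""
    d = F.pairwise_distance(emb1, emb2)
    loss_same = same_label * d.pow(2)
    loss_diff = (1 - same_label) * F.relu(margin - d).pow(2)
    return (loss_same + loss_diff).mean()

def joint_loss(logits, labels, emb1, emb2, same_label, lam=0.5):
    """Total objective: cross-entropy on voice-quality labels plus a
    weighted contrastive term on paired embeddings."""
    return F.cross_entropy(logits, labels) + lam * contrastive_loss(
        emb1, emb2, same_label.float()
    )
```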

6.
J Speech Lang Hear Res ; 66(8S): 3166-3181, 2023 08 17.
Article in English | MEDLINE | ID: mdl-37556308

ABSTRACT

PURPOSE: Oral diadochokinesis is a useful task in assessment of speech motor function in the context of neurological disease. Remote collection of speech tasks provides a convenient alternative to in-clinic visits, but scoring these assessments can be a laborious process for clinicians. This work describes Wav2DDK, an automated algorithm for estimating the diadochokinetic (DDK) rate on remotely collected audio from healthy participants and participants with amyotrophic lateral sclerosis (ALS). METHOD: Wav2DDK was developed using a corpus of 970 DDK assessments from healthy and ALS speakers where ground truth DDK rates were provided manually by trained annotators. The clinical utility of the algorithm was demonstrated on a corpus of 7,919 assessments collected longitudinally from 26 healthy controls and 82 ALS speakers. Corpora were collected via the participants' own mobile device, and instructions for speech elicitation were provided via a mobile app. DDK rate was estimated by parsing the character transcript from a deep neural network transformer acoustic model trained on healthy and ALS speech. RESULTS: Algorithm estimated DDK rates are highly accurate, achieving .98 correlation with manual annotation, and an average error of only 0.071 syllables per second. The rate exactly matched ground truth for 83% of files and was within 0.5 syllables per second for 95% of files. Estimated rates achieve a high test-retest reliability (r = .95) and show good correlation with the revised ALS functional rating scale speech subscore (r = .67). CONCLUSION: We demonstrate a system for automated DDK estimation that increases efficiency of calculation beyond manual annotation. Thorough analytical and clinical validation demonstrates that the algorithm is not only highly accurate, but also provides a convenient, clinically relevant metric for tracking longitudinal decline in ALS, serving to promote participation and diversity of participants in clinical research. SUPPLEMENTAL MATERIAL: https://doi.org/10.23641/asha.23787033.
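Once an acoustic model emits a character transcript for a /pa-ta-ka/ task, the rate computation itself reduces to counting syllable tokens over the recording duration. The sketch below is a hypothetical reduction of that final step, not Wav2DDK's actual parsing logic.

```python
# Illustrative post-processing: estimate DDK rate from an ASR character
# transcript of a 'pataka' repetition task. The regex is an assumption.
import re

def ddk_rate(transcript: str, duration_s: float) -> float:
    """Estimate syllables per second from a character transcript."""
    syllables = re.findall(r"pa|ta|ka", transcript.lower())
    return len(syllables) / duration_s

print(ddk_rate("pataka pataka pa ta ka", 3.0))  # -> 3.0 syllables/sec
```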


Subjects
Amyotrophic Lateral Sclerosis, Speech, Humans, Reproducibility of Results, Speech Articulation Tests, Algorithms
7.
Article in English | MEDLINE | ID: mdl-37309077

ABSTRACT

Objective: We demonstrated that it is possible to predict ALS patients' degree of future speech impairment based on past data. We used longitudinal data from two ALS studies in which participants recorded their speech on a daily or weekly basis and provided ALSFRS-R speech subscores on a weekly or quarterly basis. Methods: Using their speech recordings, we measured articulatory precision (a measure of the crispness of pronunciation) using an algorithm that analyzed the acoustic signal of each phoneme in the words produced. First, we established the analytical and clinical validity of the articulatory precision measure, showing that it correlated with perceptual ratings of articulatory precision (r = .9). Second, using articulatory precision measured from each participant's speech samples collected over a 45- to 90-day model calibration period, we showed that it was possible to predict articulatory precision 30-90 days after the last day of the model calibration period. Finally, we showed that the predicted articulatory precision scores mapped onto ALSFRS-R speech subscores. Results: The mean absolute error was as low as 4% for articulatory precision and 14% for ALSFRS-R speech subscores, relative to the total range of their respective scales. Conclusion: Our results demonstrate that a subject-specific prognostic model for speech accurately predicts future articulatory precision and ALSFRS-R speech values.
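The abstract does not name the prognostic model, so the calibrate-then-extrapolate workflow is illustrated below with a per-subject linear trend as an assumed stand-in; all data are synthetic.

```python
# Minimal sketch of the subject-specific prognosis idea: fit a trend to a
# participant's articulatory-precision scores over a calibration window,
# then extrapolate to a future day.
import numpy as np

def forecast_precision(days, scores, future_day):
    """Fit a per-subject linear trend over the calibration period and
    predict the score `future_day` days after enrollment."""
    slope, intercept = np.polyfit(days, scores, deg=1)
    return slope * future_day + intercept

calib_days = np.arange(0, 90, 7)                 # weekly samples, days 0-89
calib_scores = 80 - 0.1 * calib_days             # synthetic declining scores
print(forecast_precision(calib_days, calib_scores, 120))  # day-120 forecast
```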

8.
J Speech Lang Hear Res ; 66(8S): 3132-3150, 2023 08 17.
Article in English | MEDLINE | ID: mdl-37071795

ABSTRACT

PURPOSE: Defined as the similarity of speech behaviors between interlocutors, speech entrainment plays an important role in successful adult conversations. According to theoretical models of entrainment and research on motoric, cognitive, and social developmental milestones, the ability to entrain should develop throughout adolescence. However, little is known about the specific developmental trajectory or the role of speech entrainment in conversational outcomes of this age group. The purpose of this study is to characterize speech entrainment patterns in the conversations of neurotypical early adolescents. METHOD: This study utilized a corpus of 96 task-based conversations between adolescents aged 9 to 14 years and a comparison corpus of 32 task-based conversations between adults. For each conversational turn, two speech entrainment scores were calculated for 429 acoustic features across rhythmic, articulatory, and phonatory dimensions. Predictive modeling was used to evaluate the degree of entrainment and the relationship between entrainment and two metrics of conversational success. RESULTS: Speech entrainment increased throughout early adolescence but did not reach the level exhibited in conversations between adults. Additionally, speech entrainment was predictive of both conversational quality and conversational efficiency. Furthermore, models that included all acoustic features and both entrainment types performed better than models that included only individual acoustic feature sets or one type of entrainment. CONCLUSIONS: Our findings show that speech entrainment skills are largely developed during early adolescence, with continued development possibly occurring across later adolescence. Additionally, the results highlight the role of speech entrainment in successful conversation in this population, suggesting the importance of continued exploration of this phenomenon in both neurotypical and neurodivergent adolescents. We also provide evidence of the value of using holistic measures that capture the multidimensionality of speech entrainment and provide a validated methodology for investigating entrainment across multiple acoustic features and entrainment types.


Subjects
Communication, Speech, Adult, Humans, Adolescent, Child, Phonation, Speech Production Measurement, Acoustics
9.
Schizophr Bull ; 49(Suppl_2): S183-S195, 2023 03 22.
Article in English | MEDLINE | ID: mdl-36946533

ABSTRACT

BACKGROUND AND HYPOTHESIS: Automated language analysis is becoming an increasingly popular tool in clinical research involving individuals with mental health disorders. Previous work has largely focused on using high-dimensional language features to develop diagnostic and prognostic models, but less work has been done to use linguistic output to assess downstream functional outcomes, which is critically important for clinical care. In this work, we study the relationship between automated language composites and clinical variables that characterize mental health status and functional competency using predictive modeling. STUDY DESIGN: Conversational transcripts were collected from a social skills assessment of individuals with schizophrenia (n = 141), bipolar disorder (n = 140), and healthy controls (n = 22). A set of composite language features based on a theoretical framework of speech production were extracted from each transcript and predictive models were trained. The prediction targets included clinical variables for assessment of mental health status and social and functional competency. All models were validated on a held-out test sample not accessible to the model designer. STUDY RESULTS: Our models predicted the neurocognitive composite with Pearson correlation PCC = 0.674; PANSS-positive with PCC = 0.509; PANSS-negative with PCC = 0.767; social skills composite with PCC = 0.785; functional competency composite with PCC = 0.616. Language features related to volition, affect, semantic coherence, appropriateness of response, and lexical diversity were useful for prediction of clinical variables. CONCLUSIONS: Language samples provide useful information for the prediction of a variety of clinical variables that characterize mental health status and functional competency.
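A minimal sketch of the evaluation protocol described (train a regressor on language composites, report Pearson correlation on a held-out sample) follows. Ridge regression and all data here are illustrative assumptions; the abstract does not name the model family.

```python
# Hedged sketch: predict a clinical composite from language composites and
# evaluate with held-out Pearson correlation, as the study design describes.
import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 12))                                  # placeholder composites
y = X @ rng.normal(size=12) + rng.normal(scale=0.5, size=300)   # placeholder target

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
model = Ridge(alpha=1.0).fit(X_tr, y_tr)
pcc, _ = pearsonr(y_te, model.predict(X_te))
print(f"held-out PCC = {pcc:.3f}")
```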


Subjects
Bipolar Disorder, Schizophrenia, Humans, Schizophrenia/diagnosis, Speech, Communication, Health Status
10.
JASA Express Lett ; 3(1): 015201, 2023 01.
Article in English | MEDLINE | ID: mdl-36725533

ABSTRACT

Studies have shown deep neural networks (DNNs) to be a potential tool for classifying dysarthric speakers and controls. However, the representations used to train DNNs are largely not clinically interpretable, which limits their clinical value. Here, a model with a bottleneck layer is trained to jointly learn a classification label and four clinically interpretable features. Evaluation on two dysarthria subtypes shows that the proposed method can flexibly trade off between improved classification accuracy and discovery of clinically interpretable deficit patterns. An analysis using Shapley additive explanations shows that the model learns a representation consistent with the disturbances that define the two dysarthria subtypes considered in this work.
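One way to realize the described bottleneck design is sketched below, under the assumption of a simple feed-forward encoder and a weighted multi-task loss; layer sizes and the weighting scheme are not given in the abstract.

```python
# Sketch: a shared encoder ends in a 4-d bottleneck that is supervised to
# match four clinician-rated features while also feeding a classifier.
import torch
import torch.nn as nn

class BottleneckModel(nn.Module):
    def __init__(self, n_inputs=128, bottleneck=4, n_classes=2):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_inputs, 64), nn.ReLU(),
            nn.Linear(64, bottleneck),   # clinically-tied bottleneck
        )
        self.classifier = nn.Linear(bottleneck, n_classes)

    def forward(self, x):
        z = self.encoder(x)              # predicted interpretable features
        return self.classifier(z), z

def joint_loss(logits, labels, z, clin_feats, alpha=0.5):
    """Trade off classification accuracy against matching the bottleneck
    to the four clinician-rated features."""
    ce = nn.functional.cross_entropy(logits, labels)
    mse = nn.functional.mse_loss(z, clin_feats)
    return (1 - alpha) * ce + alpha * mse
```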


Subjects
Deep Learning, Dysarthria, Humans, Dysarthria/diagnosis, Neural Networks, Computer
11.
PLoS One ; 18(2): e0281306, 2023.
Article in English | MEDLINE | ID: mdl-36800358

ABSTRACT

The DIVA model is a computational model of speech motor control that combines a simulation of the brain regions responsible for speech production with a model of the human vocal tract. The model is currently implemented in Matlab Simulink; however, this is less than ideal as most of the development in speech technology research is done in Python. This means there is a wealth of machine learning tools which are freely available in the Python ecosystem that cannot be easily integrated with DIVA. We present TorchDIVA, a full rebuild of DIVA in Python using PyTorch tensors. DIVA source code was directly translated from Matlab to Python, and built-in Simulink signal blocks were implemented from scratch. After implementation, the accuracy of each module was evaluated via systematic block-by-block validation. The TorchDIVA model is shown to produce outputs that closely match those of the original DIVA model, with a negligible difference between the two. We additionally present an example of the extensibility of TorchDIVA as a research platform. Speech quality enhancement in TorchDIVA is achieved through an integration with an existing PyTorch generative vocoder called DiffWave. A modified DiffWave mel-spectrum upsampler was trained on human speech waveforms and conditioned on the TorchDIVA speech production. The results indicate improved speech quality metrics in the DiffWave-enhanced output as compared to the baseline. This enhancement would have been difficult or impossible to accomplish in the original Matlab implementation. This proof-of-concept demonstrates the value TorchDIVA can bring to the research community. Researchers can download the new implementation at: https://github.com/skinahan/DIVA_PyTorch.


Subjects
Ecosystem, Speech, Humans, Software, Computer Simulation, Machine Learning
12.
Article in English | MEDLINE | ID: mdl-36712557

ABSTRACT

Spectro-temporal dynamics of consonant-vowel (CV) transition regions are considered to provide robust cues related to articulation. In this work, we propose an objective measure of precise articulation, dubbed the objective articulation measure (OAM), by analyzing the CV transitions segmented around vowel onsets. The OAM is derived based on the posteriors of a convolutional neural network pre-trained to classify between different consonants using CV regions as input. We demonstrate that the OAM is correlated with perceptual measures in a variety of contexts including (a) adult dysarthric speech, (b) the speech of children with cleft lip/palate, and (c) a database of accented English speech from native Mandarin and Spanish speakers.
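The abstract does not state how the CNN posteriors are aggregated into the OAM; one plausible reading, averaging the posterior assigned to the intended consonant across CV transitions, is sketched below purely as an assumption.

```python
# Hedged reconstruction: average the pre-trained CNN's posterior for the
# *intended* consonant over all CV transitions in an utterance.
import numpy as np

def objective_articulation_measure(posteriors, target_idx):
    """posteriors: (n_transitions, n_consonants) softmax outputs;
    target_idx: intended consonant index per transition. A higher mean
    target posterior indicates more precise articulation."""
    posteriors = np.asarray(posteriors)
    return posteriors[np.arange(len(target_idx)), target_idx].mean()

# Example: three CV transitions classified over four consonant classes.
post = [[0.7, 0.1, 0.1, 0.1],
        [0.2, 0.6, 0.1, 0.1],
        [0.1, 0.1, 0.2, 0.6]]
print(objective_articulation_measure(post, [0, 1, 3]))  # -> ~0.633
```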

13.
Alzheimers Dement (Amst) ; 14(1): e12294, 2022.
Article in English | MEDLINE | ID: mdl-35229018

ABSTRACT

We developed and evaluated an automatically extracted measure of cognition (semantic relevance) using automated and manual transcripts of audio recordings from healthy and cognitively impaired participants describing the Cookie Theft picture from the Boston Diagnostic Aphasia Examination. We describe the rationale and metric validation. We developed the measure on one dataset and evaluated it on a large database (>2,000 samples) by comparing its accuracy against a manually calculated metric and evaluating its clinical relevance. The fully automated measure was accurate (r = .84), had moderate to good reliability (intra-class correlation = .73), correlated with the Mini-Mental State Examination and improved model fit in the context of other automatic language features (r = .65), and declined longitudinally with age and level of cognitive impairment. This study demonstrates the use of a rigorous analytical and clinical framework for validating automatic measures of speech and applies it to a measure that is accurate and clinically relevant.

14.
Front Neurol ; 12: 795374, 2021.
Article in English | MEDLINE | ID: mdl-34956070

ABSTRACT

Clinical assessments often use complex picture description tasks to elicit natural speech patterns and magnify changes occurring in brain regions implicated in Alzheimer's disease and dementia. As the Cookie Theft picture description task is used in the largest Alzheimer's disease and dementia cohort studies available, we aimed to create algorithms that could characterize the visual narrative path a participant takes in describing what is happening in this image. We proposed spatio-semantic graphs, models based on graph theory that transform the participants' narratives into graphs that retain semantic order and encode the visuospatial information between content units in the image. The resulting graphs differ between Cognitively Impaired and Unimpaired participants in several important ways. Cognitively Impaired participants consistently scored higher on features that are heavily associated with symptoms of cognitive decline, including repetition, evidence of short-term memory lapses, and generally disorganized narrative descriptions, while Cognitively Unimpaired participants produced more efficient narrative paths. These results provide evidence that spatio-semantic graph analysis of these tasks can generate important insights into a participant's cognitive performance that cannot be obtained from semantic analysis alone.
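A toy construction of such a graph might look like the sketch below; the content units and image coordinates are invented placeholders, not the authors' annotation scheme.

```python
# Illustrative spatio-semantic graph with networkx: content units become
# nodes (with image coordinates), and edges follow narrative order.
import networkx as nx

# Hypothetical (x, y) positions of content units in the Cookie Theft image.
UNIT_POS = {"boy": (0.2, 0.7), "stool": (0.2, 0.4),
            "cookie_jar": (0.3, 0.8), "mother": (0.7, 0.5),
            "sink": (0.8, 0.4)}

def narrative_graph(mentions):
    """Build a directed graph whose edge sequence retains semantic order;
    edge weights store visuospatial distance between consecutive units."""
    g = nx.DiGraph()
    for a, b in zip(mentions, mentions[1:]):
        (x1, y1), (x2, y2) = UNIT_POS[a], UNIT_POS[b]
        dist = ((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5
        g.add_edge(a, b, weight=dist)
    return g

g = narrative_graph(["boy", "stool", "cookie_jar", "boy", "mother", "sink"])
# Repetition shows up as revisited nodes; disorganized narratives yield
# longer, more erratic paths than efficient ones.
print(g.number_of_nodes(), g.number_of_edges())
```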

15.
NPJ Digit Med ; 4(1): 153, 2021 Oct 28.
Article in English | MEDLINE | ID: mdl-34711924

ABSTRACT

Digital health data are multimodal and high-dimensional. A patient's health state can be characterized by a multitude of signals including medical imaging, clinical variables, genome sequencing, conversations between clinicians and patients, and continuous signals from wearables, among others. This high-volume, personalized data stream aggregated over patients' lives has spurred interest in developing new artificial intelligence (AI) models for higher-precision diagnosis, prognosis, and tracking. While the promise of these algorithms is undeniable, their dissemination and adoption have been slow, owing partially to unpredictable AI model performance once deployed in the real world. We posit that one of the rate-limiting factors in developing algorithms that generalize to real-world scenarios is the very attribute that makes the data exciting: their high-dimensional nature. This paper considers how the large number of features in vast digital health data can challenge the development of robust AI models, a phenomenon known as "the curse of dimensionality" in statistical learning theory. We provide an overview of the curse of dimensionality in the context of digital health, demonstrate how it can negatively impact out-of-sample performance, and highlight important considerations for researchers and algorithm designers.
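The distance-concentration effect underlying this argument is easy to demonstrate numerically; the snippet below is a generic illustration, not material from the paper.

```python
# With sample size fixed, the contrast between nearest- and farthest-
# neighbor distances shrinks as dimensionality grows.
import numpy as np

rng = np.random.default_rng(0)
for d in (2, 10, 100, 1000):
    X = rng.uniform(size=(200, d))
    dists = np.linalg.norm(X[1:] - X[0], axis=1)   # distances to one point
    contrast = (dists.max() - dists.min()) / dists.min()
    print(f"d={d:5d}  relative contrast={contrast:.2f}")
# Contrast collapses toward 0 at high d, so the neighborhood structure that
# many models rely on becomes less informative without more data.
```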

16.
Article in English | MEDLINE | ID: mdl-34348537

ABSTRACT

In this study, we present and provide validation data for a tool that predicts forced vital capacity (FVC) from speech acoustics collected remotely via a mobile app, without the need for any additional equipment (e.g., a spirometer). We trained a machine learning model on a sample of healthy participants and participants with amyotrophic lateral sclerosis (ALS) to learn a mapping from speech acoustics to FVC, and used this model to predict FVC values in a new sample from a different study of participants with ALS. We further evaluated the cross-sectional accuracy of the model and its sensitivity to within-subject change in FVC. We found that the predicted and observed FVC values in the test sample had a correlation coefficient of .80 and a mean absolute error between .54 L and .58 L (18.5% to 19.5%). In addition, we found that the model was able to detect longitudinal decline in FVC in the test sample, although to a lesser extent than the observed FVC values measured using a spirometer, and was highly repeatable (ICC = .92-.94), although to a lesser extent than the actual FVC (ICC = .97). These results suggest that sustained phonation may be a useful surrogate for FVC in both research and clinical environments.


Subjects
Amyotrophic Lateral Sclerosis, Cross-Sectional Studies, Humans, Speech Acoustics, Spirometry, Vital Capacity
17.
J Headache Pain ; 22(1): 82, 2021 Jul 23.
Article in English | MEDLINE | ID: mdl-34301180

ABSTRACT

BACKGROUND/OBJECTIVE: Changes in speech can be detected objectively before and during migraine attacks. The goal of this study was to interrogate whether speech changes can be detected in subjects with post-traumatic headache (PTH) attributed to mild traumatic brain injury (mTBI) and whether there are within-subject changes in speech during headaches compared to the headache-free state. METHODS: Using a series of speech elicitation tasks uploaded via a mobile application, PTH subjects and healthy controls (HC) provided speech samples once every 3 days, over a period of 12 weeks. The following speech parameters were assessed: vowel space area, vowel articulation precision, consonant articulation precision, average pitch, pitch variance, speaking rate and pause rate. Speech samples of subjects with PTH were compared to HC. To assess speech changes associated with PTH, speech samples of subjects during headache were compared to speech samples when subjects were headache-free. All analyses were conducted using a mixed-effect model design. RESULTS: Longitudinal speech samples were collected from nineteen subjects with PTH (mean age = 42.5, SD = 13.7) who were an average of 14 days (SD = 32.2) from their mTBI at the time of enrollment and thirty-one HC (mean age = 38.7, SD = 12.5). Regardless of headache presence or absence, PTH subjects had longer pause rates and reductions in vowel and consonant articulation precision relative to HC. On days when speech was collected during a headache, there were longer pause rates, slower sentence speaking rates and less precise consonant articulation compared to the speech production of HC. During headache, PTH subjects had slower speaking rates yet more precise vowel articulation compared to when they were headache-free. CONCLUSIONS: Compared to HC, subjects with acute PTH demonstrate altered speech as measured by objective features of speech production. For individuals with PTH, speech production may have been more effortful resulting in slower speaking rates and more precise vowel articulation during headache vs. when they were headache-free, suggesting that speech alterations were related to PTH and not solely due to the underlying mTBI.
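A hedged sketch of the mixed-effects setup described follows; the column names, formula, and data layout are assumptions about how such longitudinal speech data might be organized, not the study's actual analysis code.

```python
# Sketch: pause rate modeled with fixed effects for group and headache
# state and a random intercept per subject, via statsmodels.
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical columns: subject, group ("PTH"/"HC"),
# headache (0/1, always 0 for HC), pause_rate
# df = pd.read_csv("speech_sessions.csv")

def fit_pause_model(df: pd.DataFrame):
    model = smf.mixedlm("pause_rate ~ group + headache",
                        data=df, groups=df["subject"])
    return model.fit()

# result = fit_pause_model(df); print(result.summary())
```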


Subjects
Brain Concussion, Migraine Disorders, Post-Traumatic Headache, Adult, Brain Concussion/complications, Headache, Humans, Post-Traumatic Headache/etiology, Speech
18.
J Speech Lang Hear Res ; 64(6S): 2213-2222, 2021 06 18.
Article in English | MEDLINE | ID: mdl-33705675

ABSTRACT

Purpose Acoustic measurement of speech sounds requires first segmenting the speech signal into relevant units (words, phones, etc.). Manual segmentation is cumbersome and time consuming. Forced-alignment algorithms automate this process by aligning a transcript and a speech sample. We compared the phoneme-level alignment performance of five available forced-alignment algorithms on a corpus of child speech. Our goal was to document aligner performance for child speech researchers. Method The child speech sample included 42 children between 3 and 6 years of age. The corpus was force-aligned using the Montreal Forced Aligner with and without speaker adaptive training, triphone alignment from the Kaldi speech recognition engine, the Prosodylab-Aligner, and the Penn Phonetics Lab Forced Aligner. The sample was also manually aligned to create gold-standard alignments. We evaluated alignment algorithms in terms of accuracy (whether the interval covers the midpoint of the manual alignment) and difference in phone-onset times between the automatic and manual intervals. Results The Montreal Forced Aligner with speaker adaptive training showed the highest accuracy and smallest timing differences. Vowels were consistently the most accurately aligned class of sounds across all the aligners, and alignment accuracy increased with age for fricative sounds across the aligners too. Conclusion The best-performing aligner fell just short of human-level reliability for forced alignment. Researchers can use forced alignment with child speech for certain classes of sounds (vowels, fricatives for older children), especially as part of a semi-automated workflow where alignments are later inspected for gross errors. Supplemental Material https://doi.org/10.23641/asha.14167058.
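The two evaluation criteria described (midpoint coverage and phone-onset differences) are straightforward to compute from interval pairs; a minimal sketch with placeholder intervals:

```python
# Evaluate forced alignments against manual gold-standard intervals.
# Intervals are (onset, offset) tuples in seconds, one per phone.
def midpoint_accuracy(auto, manual):
    """Fraction of phones where the automatic interval covers the
    midpoint of the corresponding manual interval."""
    hits = sum(a_on <= (m_on + m_off) / 2 <= a_off
               for (a_on, a_off), (m_on, m_off) in zip(auto, manual))
    return hits / len(manual)

def mean_onset_diff(auto, manual):
    """Mean absolute difference in phone-onset times."""
    return sum(abs(a[0] - m[0]) for a, m in zip(auto, manual)) / len(manual)

auto_phones = [(0.00, 0.08), (0.08, 0.21), (0.21, 0.30)]
manual_phones = [(0.01, 0.09), (0.09, 0.20), (0.20, 0.31)]
print(midpoint_accuracy(auto_phones, manual_phones))            # 1.0
print(round(mean_onset_diff(auto_phones, manual_phones), 3))    # 0.01 s
```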


Subjects
Phonetics, Speech, Adolescent, Algorithms, Child, Humans, Reproducibility of Results, Speech Production Measurement
19.
IEEE Trans Biomed Eng ; 68(10): 2986-2996, 2021 10.
Article in English | MEDLINE | ID: mdl-33566756

ABSTRACT

OBJECTIVES: Evaluation of hypernasality requires extensive perceptual training by clinicians, and extending this training on a large scale internationally is untenable; this compounds the health disparities that already exist among children with cleft. In this work, we present the objective hypernasality measure (OHM), a speech-based algorithm that automatically measures hypernasality in speech, and validate it relative to a group of trained clinicians. METHODS: We trained a deep neural network (DNN) on approximately 100 hours of a publicly available healthy speech corpus to detect the presence of nasal acoustic cues generated through the production of nasal consonants and nasalized phonemes in speech. Importantly, this model does not require any clinical data for training. The posterior probabilities of the deep learning model were aggregated at the sentence and speaker levels to compute the OHM. RESULTS: The results showed that the OHM was significantly correlated with perceptual hypernasality ratings from the Americleft database (r = 0.797, p < 0.001) and the New Mexico Cleft Palate Center (NMCPC) database (r = 0.713, p < 0.001). In addition, we evaluated the relationship between the OHM and articulation errors and the sensitivity of the OHM in detecting very mild hypernasality, and we established the internal reliability of the metric. Further, the performance of the OHM was compared with a DNN regression algorithm directly trained on the hypernasal speech samples. SIGNIFICANCE: The results indicate that the OHM is able to measure the severity of hypernasality on par with Americleft-trained clinicians on this dataset.
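A hedged sketch of the sentence- and speaker-level aggregation follows; the exact pooling (e.g., mean posteriors vs. log odds) is not specified in the abstract, so simple averaging is assumed.

```python
# Aggregate frame-level nasality posteriors to sentence scores, then
# average sentence scores to a speaker-level OHM.
import numpy as np

def sentence_score(frame_posteriors):
    """Mean posterior probability of nasal cues across a sentence."""
    return float(np.mean(frame_posteriors))

def speaker_ohm(sentences):
    """Average sentence scores into a single speaker-level measure."""
    return float(np.mean([sentence_score(s) for s in sentences]))

speaker = [np.array([0.1, 0.3, 0.2]),   # sentence 1 frame posteriors
           np.array([0.5, 0.6, 0.4])]   # sentence 2
print(speaker_ohm(speaker))  # -> 0.35
```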


Subjects
Cleft Palate, Deep Learning, Voice Disorders, Child, Cleft Palate/diagnosis, Humans, Reproducibility of Results, Speech Production Measurement