1.
Sci Rep ; 13(1): 11106, 2023 07 10.
Article in English | MEDLINE | ID: mdl-37429871

ABSTRACT

Acoustic identification of vocalizing individuals opens up new and deeper insights into animal communications, such as individual-/group-specific dialects, turn-taking events, and dialogs. However, establishing an association between an individual animal and its emitted signal is usually non-trivial, especially for animals underwater. Consequently, a collection of marine species-, array-, and position-specific ground truth localization data is extremely challenging, which strongly limits possibilities to evaluate localization methods beforehand or at all. This study presents ORCA-SPY, a fully-automated sound source simulation, classification and localization framework for passive killer whale (Orcinus orca) acoustic monitoring that is embedded into PAMGuard, a widely used bioacoustic software toolkit. ORCA-SPY enables array- and position-specific multichannel audio stream generation to simulate real-world ground truth killer whale localization data and provides a hybrid sound source identification approach integrating ANIMAL-SPOT, a state-of-the-art deep learning-based orca detection network, followed by downstream Time-Difference-Of-Arrival localization. ORCA-SPY was evaluated on simulated multichannel underwater audio streams including various killer whale vocalization events within a large-scale experimental setup benefiting from previous real-world fieldwork experience. Across all 58,320 embedded vocalizing killer whale events, subject to various hydrophone array geometries, call types, distances, and noise conditions responsible for a signal-to-noise ratio varying from [Formula: see text] dB to 3 dB, a detection rate of 94.0% was achieved with an average localization error of 7.01[Formula: see text]. ORCA-SPY was field-tested on Lake Stechlin in Brandenburg, Germany, under laboratory conditions with a focus on localization.
During the field test, 3889 localization events were observed with an average error of 29.19[Formula: see text] and a median error of 17.54[Formula: see text]. ORCA-SPY was deployed successfully during the DeepAL fieldwork 2022 expedition (DLFW22) in Northern British Columbia, with a mean average error of 20.01[Formula: see text] and a median error of 11.01[Formula: see text] across 503 localization events. ORCA-SPY is an open-source and publicly available software framework, which can be adapted to various recording conditions as well as animal species.


Subjects
Deep Learning , Whale, Killer , Animals , Sound , Computer Simulation , Software
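
As a rough illustration of the Time-Difference-Of-Arrival idea named in the ORCA-SPY abstract above, the sketch below estimates the delay between two hydrophone channels by brute-force cross-correlation and converts it to a bearing under a far-field assumption. It is not the ORCA-SPY or PAMGuard implementation: the function names, the 1,500 m/s sound speed, the 2 m hydrophone spacing, the 48 kHz sample rate, and the synthetic pulse are all assumptions made for illustration.

```python
import math

def estimate_delay_samples(ref, other, max_lag):
    """Return the lag (in samples) at which `other` best matches `ref`.

    A positive lag means the signal reaches `other` later than `ref`.
    Brute-force cross-correlation; adequate for short illustrative signals.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(len(ref)):
            j = i + lag
            if 0 <= j < len(other):
                score += ref[i] * other[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def bearing_degrees(delay_s, spacing_m, sound_speed_mps=1500.0):
    """Far-field bearing relative to broadside for a two-hydrophone pair."""
    # Clamp to [-1, 1] so a noisy delay estimate cannot break asin.
    ratio = max(-1.0, min(1.0, sound_speed_mps * delay_s / spacing_m))
    return math.degrees(math.asin(ratio))

# Synthetic example: a damped pulse arriving 12 samples later on channel 2.
sample_rate = 48_000  # Hz (assumed)
pulse = [math.sin(2 * math.pi * 0.05 * k) * math.exp(-0.01 * k) for k in range(200)]
true_lag = 12
ch1 = pulse + [0.0] * 50
ch2 = [0.0] * true_lag + pulse + [0.0] * (50 - true_lag)

lag = estimate_delay_samples(ch1, ch2, max_lag=40)
angle = bearing_degrees(lag / sample_rate, spacing_m=2.0)
print(lag, round(angle, 2))
```

A single hydrophone pair only yields a bearing with a left/right ambiguity; arrays with more elements, as mentioned in the abstract, combine several such pairwise delays.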
2.
Sci Rep ; 12(1): 21966, 2022 12 19.
Article in English | MEDLINE | ID: mdl-36535999

ABSTRACT

Bioacoustic research spans a wide range of biological questions and applications, relying on identification of target species or smaller acoustic units, such as distinct call types. However, manually identifying the signal of interest is time-intensive, error-prone, and becomes unfeasible with large data volumes. Therefore, machine-driven algorithms are increasingly applied to various bioacoustic signal identification challenges. Nevertheless, biologists still have major difficulties trying to transfer existing animal- and/or scenario-related machine learning approaches to their specific animal datasets and scientific questions. This study presents an animal-independent, open-source deep learning framework, along with a detailed user guide. Three signal identification tasks, commonly encountered in bioacoustics research, were investigated: (1) target signal vs. background noise detection, (2) species classification, and (3) call type categorization. ANIMAL-SPOT successfully segmented human-annotated target signals in data volumes representing 10 distinct animal species and 1 additional genus, resulting in a mean test accuracy of 97.9%, together with an average area under the ROC curve (AUC) of 95.9%, when predicting on unseen recordings. Moreover, an average segmentation accuracy and F1-score of 95.4% was achieved on the publicly available BirdVox-Full-Night data corpus. In addition, multi-class species and call type classification resulted in 96.6% and 92.7% accuracy on unseen test data, as well as 95.2% and 88.4% regarding previous animal-specific machine-based detection excerpts. Furthermore, an Unweighted Average Recall (UAR) of 89.3% outperformed the multi-species classification baseline system of the ComParE 2021 Primate Sub-Challenge. Besides animal independence, ANIMAL-SPOT does not rely on expert knowledge or special computing resources, thereby making deep-learning-based bioacoustic signal identification accessible to a broad audience.


Subjects
Deep Learning , Animals , Humans , Machine Learning , Algorithms , Acoustics , Area Under Curve
3.
J Speech Lang Hear Res ; 65(12): 4623-4636, 2022 12 12.
Article in English | MEDLINE | ID: mdl-36417788

ABSTRACT

PURPOSE: The aim of this study was to investigate the speech prosody of postlingually deaf cochlear implant (CI) users compared with control speakers without hearing or speech impairment. METHOD: Speech recordings of 74 CI users (37 males and 37 females) and 72 age-balanced control speakers (36 males and 36 females) are considered. All participants are German native speakers and read Der Nordwind und die Sonne (The North Wind and the Sun), a standard text in pathological speech analysis and phonetic transcriptions. Automatic acoustic analysis is performed considering pitch, loudness, and duration features, including speech rate and rhythm. RESULTS: In general, duration and rhythm features differ between CI users and control speakers. CI users read slower and have a lower voiced segment ratio compared with control speakers. A lower voiced ratio goes along with a prolongation of the voiced segments' duration in male and with a prolongation of pauses in female CI users. Rhythm features in CI users have higher variability in the duration of vowels and consonants than in control speakers. The use of bilateral CIs showed no advantages concerning speech prosody features in comparison to unilateral use of CI. CONCLUSIONS: Even after cochlear implantation and rehabilitation, the speech of postlingually deaf adults deviates from the speech of control speakers, which might be due to changed auditory feedback. We suggest considering changes in temporal aspects of speech in future rehabilitation strategies. SUPPLEMENTAL MATERIAL: https://doi.org/10.23641/asha.21579171.


Subjects
Cochlear Implantation , Cochlear Implants , Deafness , Speech Perception , Adult , Male , Female , Humans , Deafness/rehabilitation , Hearing , Acoustics
4.
Sci Rep ; 11(1): 23480, 2021 12 06.
Article in English | MEDLINE | ID: mdl-34873193

ABSTRACT

Biometric identification techniques such as photo-identification require an array of unique natural markings to identify individuals. From 1975 to present, Bigg's killer whales have been photo-identified along the west coast of North America, resulting in one of the largest and longest-running cetacean photo-identification datasets. However, data maintenance and analysis are extremely time and resource consuming. This study transfers the procedure of killer whale image identification into a fully automated, multi-stage, deep learning framework, entitled FIN-PRINT. It is composed of multiple sequentially ordered sub-components. FIN-PRINT is trained and evaluated on a dataset collected over an 8-year period (2011-2018) in the coastal waters off western North America, including 121,000 human-annotated identification images of Bigg's killer whales. First, object detection is performed to identify unique killer whale markings, resulting in 94.4% recall, 94.1% precision, and 93.4% mean-average-precision (mAP). Second, all previously identified natural killer whale markings are extracted. The third step introduces a data enhancement mechanism by filtering between valid and invalid markings from previous processing levels, achieving 92.8% recall, 97.5% precision, and 95.2% accuracy. The fourth and final step involves multi-class individual recognition. When evaluated on the network test set, it achieved an accuracy of 92.5% with 97.2% top-3 unweighted accuracy (TUA) for the 100 most commonly photo-identified killer whales. Additionally, the method achieved an accuracy of 84.5% and a TUA of 92.9% when applied to the entire 2018 image collection of the 100 most common killer whales. The source code of FIN-PRINT can be adapted to other species and will be publicly available.
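
For readers unfamiliar with the metrics quoted in the FIN-PRINT abstract above, the following minimal sketch shows how precision, recall, and a plain top-k accuracy are commonly computed. It is not taken from FIN-PRINT: the counts and whale IDs are invented, and the plain top-k accuracy shown here may differ from the paper's exact definition of "top-3 unweighted accuracy".

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall

def top_k_accuracy(ranked_predictions, true_labels, k=3):
    """Fraction of samples whose true label is among the k best-ranked guesses."""
    hits = sum(
        1 for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)

# Hypothetical detection counts, for illustration only.
precision, recall = precision_recall(
    true_positives=90, false_positives=10, false_negatives=30
)

# Hypothetical ranked individual IDs per image (best guess first).
ranked = [
    ["A", "B", "C", "D"],  # truth "A": correct at rank 1
    ["B", "C", "A", "D"],  # truth "A": correct only within the top 3
    ["D", "C", "B", "A"],  # truth "A": missed within the top 3
    ["C", "A", "B", "D"],  # truth "C": correct at rank 1
]
truth = ["A", "A", "A", "C"]
top1 = top_k_accuracy(ranked, truth, k=1)
top3 = top_k_accuracy(ranked, truth, k=3)
print(precision, recall, top1, top3)
```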

5.
Mov Disord ; 36(12): 2862-2873, 2021 12.
Article in English | MEDLINE | ID: mdl-34390508

ABSTRACT

BACKGROUND: Dysarthric symptoms in Parkinson's disease (PD) vary greatly across cohorts. Abundant research suggests that such heterogeneity could reflect subject-level and task-related cognitive factors. However, the interplay of these variables during motor speech remains underexplored, let alone by administering validated materials to carefully matched samples with varying cognitive profiles and combining automated tools with machine learning methods. OBJECTIVE: We aimed to identify which speech dimensions best identify patients with PD in cognitively heterogeneous, cognitively preserved, and cognitively impaired groups through tasks with low (reading) and high (retelling) processing demands. METHODS: We used support vector machines to analyze prosodic, articulatory, and phonemic identifiability features. Patient groups were compared with healthy control subjects and against each other in both tasks, using each measure separately and in combination. RESULTS: Relative to control subjects, patients in cognitively heterogeneous and cognitively preserved groups were best discriminated by combined dysarthric signs during reading (accuracy = 84% and 80.2%). Conversely, patients with cognitive impairment were maximally discriminated from control subjects when considering phonemic identifiability during retelling (accuracy = 86.9%). This same pattern maximally distinguished between cognitively spared and impaired patients (accuracy = 72.1%). Also, cognitive (executive) symptom severity was predicted by prosody in cognitively preserved patients and by phonemic identifiability in cognitively heterogeneous and impaired groups. No measure predicted overall motor dysfunction in any group. CONCLUSIONS: Predominant dysarthric symptoms appear to be best captured through undemanding tasks in cognitively heterogeneous and preserved cohorts and through cognitively loaded tasks in patients with cognitive impairment. 
Further applications of this framework could enhance dysarthria assessments in PD. © 2021 International Parkinson and Movement Disorder Society.


Subjects
Cognitive Dysfunction , Parkinson Disease , Cognition , Dysarthria/diagnosis , Dysarthria/etiology , Humans , Machine Learning , Speech
6.
Int J Lang Commun Disord ; 56(5): 892-906, 2021 09.
Article in English | MEDLINE | ID: mdl-34227721

ABSTRACT

BACKGROUND: Imprecise articulation has a negative impact on speech intelligibility. Therefore, treatment of articulation is clinically relevant in patients with dysarthria. In order to be effective and according to the principles of motor learning, articulation therapy needs to be intensive, well organized, with adequate feedback and requires frequent practice. AIMS: The aims of this pilot study are (1) to evaluate the feasibility of a virtual articulation therapy (VAT) to guide patients with dysarthria through a boost articulation therapy (BArT) program; (2) to evaluate the acoustic models' performance used for automatic phonological error detection; and (3) to validate the system by end-users from their perspective. METHODS & PROCEDURES: The VAT provides an extensive and well-structured package of exercises with visual and auditory modelling and adequate feedback on the utterances. The tool incorporates automated methods to detect phonological errors, which are specifically designed to analyse Dutch speech production. A total of 14 subjects with dysarthria evaluated the acceptability, usability and user interaction with the VAT based on two completed therapy sessions using a self-designed questionnaire. OUTCOMES & RESULTS: In general, participants were positive about the new computer-based therapy approach. The algorithm performance for phonological error detection shows it to be accurate, which contributes to adequate feedback of utterance production. The results of the study indicate that the VAT has a user-friendly interface that can be used independently by patients with dysarthria who have sufficient cognitive, linguistic, motoric and sensory skills to benefit from speech therapy. Recommendations were given by the end-users to further optimize the program and to ensure user engagement. CONCLUSIONS & IMPLICATIONS: The initial implementation of an automatic BArT shows it to be feasible and well accepted by end-users. 
The tool is an appropriate solution to increase the frequency and intensity of articulation training that supports traditional methods. WHAT THIS PAPER ADDS: What is already known on the subject: Behavioural interventions to improve articulation in patients with dysarthria demand intensive treatments, repetitive practice and feedback. However, the current treatments are mainly limited in time to the interactive sessions in the presence of speech-language pathology. Automatic systems addressing the needs of individuals with dysarthria are scarce. This study evaluates the feasibility of a VAT program and investigates its acceptability, usability and user interaction. What this paper adds to existing knowledge: The computer-based speech therapy approach developed and applied in this study intends to support intensive articulation training of patients with dysarthria. The virtual speech therapy offers the possibility of an individualized and customized therapy programme, with an extensive database of exercises, visual and auditory models of the target utterances, and providing adequate feedback based on automatic acoustic analysis of speech. What are the potential or actual clinical implications of this work? The automatic BArT overcomes the limitation in time of face-to-face traditional speech therapy. It offers patients the opportunity to have access to speech therapy more intensively and frequently in their home environment.


Subjects
Dysarthria , Speech Intelligibility , Adult , Dysarthria/psychology , Humans , Pilot Projects , Speech Production Measurement/methods , Speech Therapy/methods
7.
Cortex ; 132: 191-205, 2020 11.
Article in English | MEDLINE | ID: mdl-32992069

ABSTRACT

Embodied cognition research on Parkinson's disease (PD) points to disruptions of frontostriatal language functions as sensitive targets for clinical assessment. However, no existing approach has been tested for crosslinguistic validity, let alone by combining naturalistic tasks with machine-learning tools. To address these issues, we conducted the first classifier-based examination of morphological processing (a core frontostriatal function) in spontaneous monologues from PD patients across three typologically different languages. The study comprised 330 participants, encompassing speakers of Spanish (61 patients, 57 matched controls), German (88 patients, 88 matched controls), and Czech (20 patients, 16 matched controls). All subjects described the activities they perform during a regular day, and their monologues were automatically coded via morphological tagging, a computerized method that labels each word with a part-of-speech tag (e.g., noun, verb) and specific morphological tags (e.g., person, gender, number, tense). The ensuing data were subjected to machine-learning analyses to assess whether differential morphological patterns could classify between patients and controls and reflect the former's degree of motor impairment. Results showed robust classification rates, with over 80% of patients being discriminated from controls in each language separately. Moreover, the most discriminative morphological features were associated with the patients' motor compromise (as indicated by Pearson r correlations between predicted and collected motor impairment scores that ranged from moderate to moderate-to-strong across languages). Taken together, our results suggest that morphological patterning, an embodied frontostriatal domain, may be distinctively affected in PD across languages and even under ecological testing conditions.


Subjects
Language , Parkinson Disease , Cognition , Humans , Machine Learning , Speech
8.
Neurodegener Dis Manag ; 10(3): 137-157, 2020 06.
Article in English | MEDLINE | ID: mdl-32571150

ABSTRACT

Aim: This paper introduces Apkinson, a mobile application for motor evaluation and monitoring of Parkinson's disease patients. Materials & methods: The App is based on previously reported methods, for instance, the evaluation of articulation and pronunciation in speech, regularity and freezing of gait in walking, and tapping accuracy in hand movement. Results: Preliminary experiments indicate that most of the measurements are suitable to discriminate patients and controls. Significance is evaluated through statistical tests. Conclusion: Although the reported results correspond to preliminary experiments, we think that Apkinson is a very useful App that can help patients, caregivers and clinicians, in performing a more accurate monitoring of the disease progression. Additionally, the mobile App can be a personal health assistant.


Subjects
Mobile Applications , Parkinson Disease/physiopathology , Smartphone , Aged , Aged, 80 and over , Female , Gait , Humans , Male , Middle Aged , Movement , Severity of Illness Index , Speech
10.
Sci Rep ; 9(1): 10997, 2019 07 29.
Article in English | MEDLINE | ID: mdl-31358873

ABSTRACT

Large bioacoustic archives of wild animals are an important source to identify reappearing communication patterns, which can then be related to recurring behavioral patterns to advance the current understanding of intra-specific communication of non-human animals. A main challenge remains that most large-scale bioacoustic archives contain only a small percentage of animal vocalizations and a large amount of environmental noise, which makes it extremely difficult to manually retrieve sufficient vocalizations for further analysis - particularly important for species with advanced social systems and complex vocalizations. In this study, deep neural networks were trained on 11,509 killer whale (Orcinus orca) signals and 34,848 noise segments. The resulting toolkit ORCA-SPOT was tested on a large-scale bioacoustic repository - the Orchive - comprising roughly 19,000 hours of killer whale underwater recordings. An automated segmentation of the entire Orchive recordings (about 2.2 years) took approximately 8 days. It achieved a time-based precision or positive-predictive-value (PPV) of 93.2% and an area-under-the-curve (AUC) of 0.9523. This approach enables an automated annotation procedure of large bioacoustics databases to extract killer whale sounds, which are essential for subsequent identification of significant communication patterns. The code will be publicly available in October 2019 to support the application of deep learning to bioacoustic research. ORCA-SPOT can be adapted to other animal species.


Subjects
Vocalization, Animal , Whale, Killer/physiology , Acoustics , Animals , Deep Learning , Female , Male , Neural Networks, Computer , Sound , Sound Spectrography/methods
11.
Annu Int Conf IEEE Eng Med Biol Soc ; 2019: 717-720, 2019 Jul.
Article in English | MEDLINE | ID: mdl-31945997

ABSTRACT

This study presents an approach to Parkinson's disease detection using vowels with sustained phonation and a ResNet architecture originally dedicated to image classification. We calculated spectrograms of the audio recordings and used them as image input to the ResNet architecture pre-trained using the ImageNet and SVD databases. To prevent overfitting, the dataset was strongly augmented in the time domain. The Parkinson's dataset (from the PC-GITA database) consists of 100 subjects (50 healthy and 50 diagnosed with Parkinson's disease). Each subject was recorded 3 times. The obtained accuracy on the validation set is above 90%, which is comparable to the current state-of-the-art methods. The results are promising because features learned on natural images were able to transfer to artificial images representing the spectrogram of the voice signal. Moreover, we showed that it is possible to detect Parkinson's disease successfully using only frequency-based features. A spectrogram is a visual representation of the frequency spectrum of a signal; it makes it possible to follow how the frequencies of a signal change over time.


Subjects
Parkinson Disease , Voice , Deep Learning , Humans , Neural Networks, Computer
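
The closing remarks of the abstract above describe what a spectrogram represents. As a small, self-contained illustration (not the authors' pipeline, which feeds spectrogram images to a pre-trained ResNet), the sketch below computes a magnitude spectrogram with a short-time Fourier transform written in plain Python. The frame length, hop size, Hann window, and the synthetic two-tone signal are assumptions chosen for readability rather than speed.

```python
import cmath
import math

def magnitude_spectrogram(signal, frame_len=64, hop=32):
    """Short-time Fourier magnitudes: one row per frame, one column per bin."""
    # A Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(frame_len // 2 + 1):  # non-negative frequencies only
            coeff = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            row.append(abs(coeff))
        frames.append(row)
    return frames

# Synthetic "recording": the tone jumps from bin 4 to bin 12 halfway through.
frame_len = 64
first = [math.sin(2 * math.pi * 4 * n / frame_len) for n in range(256)]
second = [math.sin(2 * math.pi * 12 * n / frame_len) for n in range(256)]
spec = magnitude_spectrogram(first + second, frame_len=frame_len, hop=32)

# The strongest bin in each frame traces how the frequency changes over time.
peak_bins = [row.index(max(row)) for row in spec]
print(peak_bins)
```

Rendering `spec` as an image (time on one axis, frequency on the other, magnitude as intensity) gives the kind of picture that an image classifier can consume.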
12.
IEEE J Biomed Health Inform ; 23(4): 1618-1630, 2019 07.
Article in English | MEDLINE | ID: mdl-30137018

ABSTRACT

Parkinson's disease is a neurodegenerative disorder characterized by a variety of motor symptoms. In particular, difficulties in starting and stopping movements have been observed in patients. From a technical/diagnostic point of view, these movement changes can be assessed by modeling the transitions between voiced and unvoiced segments in speech, the movement when the patient starts or stops a new stroke in handwriting, or the movement when the patient starts or stops the walking process. This study proposes a methodology to model such difficulties in starting or stopping movements considering information from speech, handwriting, and gait. We used those transitions to train convolutional neural networks to classify patients and healthy subjects. The neurological state of the patients was also evaluated according to different stages of the disease (initial, intermediate, and advanced). In addition, we evaluated the robustness of the proposed approach when considering speech signals in three different languages: Spanish, German, and Czech. According to the results, the fusion of information from the three modalities classifies patients and healthy subjects with high accuracy, and it appears suitable for assessing the neurological state of the patients in several stages of the disease. We also aimed to interpret the feature maps obtained from the deep learning architectures with respect to the presence or absence of the disease and the neurological state of the patients. As far as we know, this is one of the first works that considers multimodal information to assess Parkinson's disease following a deep learning approach.


Subjects
Deep Learning , Parkinson Disease/classification , Signal Processing, Computer-Assisted , Aged , Aged, 80 and over , Databases, Factual , Female , Gait/physiology , Gait Analysis , Handwriting , Humans , Image Processing, Computer-Assisted , Male , Middle Aged , Parkinson Disease/diagnosis , Parkinson Disease/physiopathology , ROC Curve , Speech/classification
13.
J Speech Lang Hear Res ; 61(1): 1-24, 2018 01 22.
Article in English | MEDLINE | ID: mdl-29222538

ABSTRACT

Purpose: The aim of the study was to address the reported inconsistencies in the relationship between objective acoustic measures and perceptual ratings of vocal quality. Method: This tutorial moves away from the more widely examined problems related to obtaining the perceptual ratings and the acoustic measures and centers on less scrutinized issues regarding the procedure to establish the correspondence. Expressions for the most common measure of association between perceptual and acoustic measures (Pearson's r) are derived using a multiple linear regression model. The particular case where the multiple linear regression involves only roughness and breathiness is discussed to illustrate the issues. Results: Most problems reported regarding inconsistent findings in the relationship between given acoustic measures and particular perceptual ratings could be linked to sample properties not directly related to the actual relationship. The influential sample properties are the collinearity between the regressors in the multiple linear regression and their relative variances. Recommendations on how to rule out this possible cause of inconsistency are given, covering data collection, reporting, manipulation, and interpretation of results. Conclusions: The problems described can be extended to more general cases than the exemplified roughness and breathiness sample's coverage. Ruling out this possible cause of inconsistency would increase the validity of the results reported.


Subjects
Speech Production Measurement/methods , Voice Quality , Auditory Perception , Humans , Linear Models , Speech Acoustics , Voice Disorders/diagnosis
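
To make the role of collinearity and relative variances mentioned in the abstract above more concrete, the sketch below evaluates the standard expression for the correlation between a rating y and one regressor x1 when y = b1*x1 + b2*x2 + e, with e independent noise. This is a textbook result and not necessarily the exact expression derived in the tutorial; the function name and all numeric values are hypothetical.

```python
import math

def pearson_r_with_first_regressor(b1, b2, sd1, sd2, rho, noise_sd):
    """Correlation between y = b1*x1 + b2*x2 + e and x1.

    rho is the correlation between x1 and x2; e is noise independent of both.
    """
    var_y = ((b1 * sd1) ** 2 + (b2 * sd2) ** 2
             + 2 * b1 * b2 * rho * sd1 * sd2 + noise_sd ** 2)
    cov_y_x1 = b1 * sd1 ** 2 + b2 * rho * sd1 * sd2
    return cov_y_x1 / (sd1 * math.sqrt(var_y))

# Same regression weights, three different (hypothetical) samples.
r_uncorrelated = pearson_r_with_first_regressor(1.0, 1.0, 1.0, 1.0, rho=0.0, noise_sd=0.0)
r_collinear = pearson_r_with_first_regressor(1.0, 1.0, 1.0, 1.0, rho=0.8, noise_sd=0.0)
r_low_variance = pearson_r_with_first_regressor(1.0, 1.0, 0.5, 1.0, rho=0.0, noise_sd=0.0)
print(round(r_uncorrelated, 3), round(r_collinear, 3), round(r_low_variance, 3))
```

The underlying relation between y and x1 is identical in all three cases, yet the observed r changes with the collinearity of the regressors and with their relative spread, which is the kind of sample dependence the abstract points to.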
14.
ORL J Otorhinolaryngol Relat Spec ; 79(5): 282-294, 2017.
Article in English | MEDLINE | ID: mdl-29131113

ABSTRACT

PURPOSE: To assess whether postlingual onset and shorter duration of deafness before cochlear implant (CI) provision predict higher speech intelligibility results of CI users. METHODS: For an objective judgement of speech intelligibility, we used an automatic speech recognition system computing the word recognition rate (WR) of 50 adult CI users and 50 age-matched control individuals. All subjects were recorded reading a standardized text. Subjects were divided into three groups: pre- or perilingual deafness (A), both >2 years before implantation, postlingual deafness <2 years before implantation (B), or postlingual deafness >2 years before implantation (C). RESULTS: CI users with short duration of postlingual deafness (B) had a significantly higher WR (median 74%) than CI users with long duration of postlingual deafness (C; 68%, p < 0.001) or pre-/perilingual onset (A; 56%, p < 0.001). Compared to their control groups only CI users with short duration of postlingual deafness reached similar WR, others showed significantly lower WR. Other factors such as hearing loss onset, duration of CI use, or duration of amplified hearing showed no consistent influence on speech quality. CONCLUSIONS: The speech production quality of adult CI users shows dependencies on the onset and duration of deafness. These features need to be considered while planning rehabilitation.


Subjects
Cochlear Implantation/methods , Hearing Loss/therapy , Speech Intelligibility/physiology , Speech Perception/physiology , Adolescent , Adult , Aged , Aged, 80 and over , Cochlear Implants , Female , Hearing Loss/physiopathology , Humans , Male , Middle Aged , Speech , Speech Production Measurement/methods , Time Factors , Young Adult
15.
Brain Lang ; 162: 19-28, 2016 Nov.
Article in English | MEDLINE | ID: mdl-27501386

ABSTRACT

To assess the impact of Parkinson's disease (PD) on spontaneous discourse, we conducted computerized analyses of brief monologues produced by 51 patients and 50 controls. We explored differences in semantic fields (via latent semantic analysis), grammatical choices (using part-of-speech tagging), and word-level repetitions (with graph embedding tools). Although overall output was quantitatively similar between groups, patients relied less heavily on action-related concepts and used more subordinate structures. Also, a classification tool operating on grammatical patterns identified monologues as pertaining to patients or controls with 75% accuracy. Finally, while the incidence of dysfluent word repetitions was similar between groups, it allowed inferring the patients' level of motor impairment with 77% accuracy. Our results highlight the relevance of studying naturalistic discourse features to tap the integrity of neural (and, particularly, motor) networks, beyond the possibilities of standard token-level instruments.


Subjects
Movement , Parkinson Disease/physiopathology , Speech/physiology , Case-Control Studies , Female , Humans , Male , Middle Aged , Motor Skills , Nerve Net , Semantics
16.
Logoped Phoniatr Vocol ; 41(3): 106-16, 2016 Oct.
Article in English | MEDLINE | ID: mdl-26016644

ABSTRACT

Automatic voice assessment is often performed using sustained vowels. In contrast, speech analysis of read-out texts can be applied to voice and speech assessment. Automatic speech recognition and prosodic analysis were used to find regression formulae between automatic and perceptual assessment of four voice and four speech criteria. The regression was trained with 21 men and 62 women (average age 49.2 years) and tested with another set of 24 men and 49 women (48.3 years), all suffering from chronic hoarseness. They read the text 'Der Nordwind und die Sonne' ('The North Wind and the Sun'). Five voice and speech therapists evaluated the data on 5-point Likert scales. Ten prosodic and recognition accuracy measures (features) were identified which describe all the examined criteria. Inter-rater correlation within the expert group was between r = 0.63 for the criterion 'match of breath and sense units' and r = 0.87 for the overall voice quality. Human-machine correlation was between r = 0.40 for the match of breath and sense units and r = 0.82 for intelligibility. The perceptual ratings of different criteria were highly correlated with each other. Likewise, the feature sets modeling the criteria were very similar. The automatic method is suitable for assessing chronic hoarseness in general and for subgroups of functional and organic dysphonia. In its current version, it is almost as reliable as a randomly picked rater from a group of voice and speech therapists.


Subjects
Hoarseness/diagnosis , Pattern Recognition, Automated , Signal Processing, Computer-Assisted , Speech Acoustics , Speech Production Measurement/methods , Voice Quality , Acoustics , Adolescent , Adult , Aged , Aged, 80 and over , Chronic Disease , Female , Hoarseness/physiopathology , Humans , Male , Middle Aged , Predictive Value of Tests , Reading , Regression Analysis , Reproducibility of Results , Support Vector Machine , Young Adult
17.
IEEE J Biomed Health Inform ; 19(6): 1820-8, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26277012

ABSTRACT

This paper evaluates the accuracy of different characterization methods for the automatic detection of multiple speech disorders. The speech impairments considered include dysphonia in people with Parkinson's disease (PD), dysphonia diagnosed in patients with different laryngeal pathologies (LP), and hypernasality in children with cleft lip and palate (CLP). Four different methods are applied to analyze the voice signals including noise content measures, spectral-cepstral modeling, nonlinear features, and measurements to quantify the stability of the fundamental frequency. These measures are tested in six databases: three with recordings of PD patients, two with patients with LP, and one with children with CLP. The abnormal vibration of the vocal folds observed in PD patients and in people with LP is modeled using the stability measures with accuracies ranging from 81% to 99% depending on the pathology. The spectral-cepstral features are used in this paper to model the voice spectrum with special emphasis around the first two formants. These measures exhibit accuracies ranging from 95% to 99% in the automatic detection of hypernasal voices, which confirms the presence of changes in the speech spectrum due to hypernasality. Noise measures suitably discriminate between dysphonic and healthy voices in both databases with speakers suffering from LP. The results obtained in this study suggest that no single kind of feature is suitable for modeling all of the voice pathologies; instead, it is necessary to study the physiology of each impairment to choose the most appropriate set of features.


Subjects
Diagnosis, Computer-Assisted/methods; Laryngeal Diseases/diagnosis; Signal Processing, Computer-Assisted; Sound Spectrography/methods; Voice Disorders/diagnosis; Adult; Aged; Aged, 80 and over; Female; Humans; Laryngeal Diseases/classification; Laryngeal Diseases/physiopathology; Male; Middle Aged; Voice Disorders/classification; Voice Disorders/physiopathology
18.
Comput Math Methods Med ; 2015: 316325, 2015.
Article in English | MEDLINE | ID: mdl-26136813

ABSTRACT

Due to low intra- and interrater reliability, perceptual voice evaluation should be supported by objective, automatic methods. In this study, text-based, computer-aided prosodic analysis and measurements of connected speech were combined in order to model perceptual evaluation of the German Roughness-Breathiness-Hoarseness (RBH) scheme. 58 connected speech samples (43 women and 15 men; 48.7 ± 17.8 years) containing the German version of the text "The North Wind and the Sun" were evaluated perceptually by 19 speech and voice therapy students according to the RBH scale. For the human-machine correlation, Support Vector Regression with measurements of the vocal fold cycle irregularities (CFx) and the closed phases of vocal fold vibration (CQx) of the Laryngograph and 33 features from a prosodic analysis module were used to model the listeners' ratings. The best human-machine results for roughness were obtained from a combination of six prosodic features and CFx (r = 0.71, ρ = 0.57). These correlations were approximately the same as the interrater agreement among human raters (r = 0.65, ρ = 0.61). CQx was one of the substantial features of the hoarseness model. For hoarseness and breathiness, the human-machine agreement was substantially lower. Nevertheless, the automatic analysis method can serve as the basis for a meaningful objective support for perceptual analysis.


Subjects
Hoarseness/diagnosis; Signal Processing, Computer-Assisted; Sound Spectrography/methods; Speech; Voice Disorders/diagnosis; Voice Quality; Adolescent; Adult; Aged; Aged, 80 and over; Child; Female; Humans; Male; Middle Aged; Regression Analysis; Reproducibility of Results; Software; Speech Perception; Speech Therapy; Young Adult
19.
Folia Phoniatr Logop ; 66(6): 219-26, 2014.
Article in English | MEDLINE | ID: mdl-25659422

ABSTRACT

OBJECTIVE: Automatic intelligibility assessment using automatic speech recognition is usually language specific. In this study, a language-independent approach is proposed. It uses models that are trained with Flemish speech, and it is applied to assess chronically hoarse German speakers. The research questions are: is it possible to construct suitable acoustic features that generalize to other languages and to a speech disorder, and is the resulting intelligibility model also suitable for specific subtypes of that disorder, i.e. functional and organic dysphonia? PATIENTS AND METHODS: 73 German-speaking persons with chronic hoarseness read the text 'Der Nordwind und die Sonne'. Perceptual intelligibility scores were used as ground truth during the training of an automatic model that converts speaker-level acoustic measurements into intelligibility scores. Cross-validation was used to assess model performance. RESULTS: The interrater agreement for all patients (n = 73) and for the functional and organic dysphonia subgroups (n = 45 and n = 24) was r = 0.82, r = 0.83 and r = 0.75, respectively. The automatic assessment based on phonologically based acoustic models revealed correlations between perceptual and automatic intelligibility ratings of r = 0.79 (all patients), r = 0.78 (functional dysphonia) and r = 0.80 (organic dysphonia). CONCLUSION: The automatic, objective measurement of intelligibility is a valuable instrument in evidence-based clinical practice.


Subjects
Hoarseness/diagnosis; Hoarseness/psychology; Language; Speech Intelligibility; Speech Recognition Software; Adult; Aged; Aged, 80 and over; Chronic Disease; Dysphonia/diagnosis; Female; Hoarseness/etiology; Humans; Male; Middle Aged; Phonetics; Speech Acoustics; Young Adult
20.
Psychiatry Res ; 198(2): 321-3, 2012 Jul 30.
Article in English | MEDLINE | ID: mdl-22417927

ABSTRACT

Visual attention allocation of adolescent girls with and without an eating disorder while viewing body images of underweight, normal-weight and overweight women was studied using eye tracking. While all girls attended more to specific body parts (e.g. hips, upper legs), eating-disordered girls showed an attentional bias towards unclothed body parts.


Subjects
Attention; Body Image/psychology; Feeding and Eating Disorders/psychology; Visual Perception; Adolescent; Female; Humans; Photic Stimulation/methods
...