Results 1 - 20 of 139
1.
Sci Rep ; 14(1): 9297, 2024 04 23.
Article in English | MEDLINE | ID: mdl-38654036

ABSTRACT

Voice change is often the first sign of laryngeal cancer, leading to diagnosis through hospital laryngoscopy. Screening for laryngeal cancer solely based on voice could enhance early detection. However, identifying voice indicators specific to laryngeal cancer is challenging, especially when differentiating it from other laryngeal ailments. This study presents an artificial intelligence model designed to distinguish between healthy voices, laryngeal cancer voices, and voices affected by other laryngeal conditions. We gathered voice samples from individuals with laryngeal cancer, vocal cord paralysis, or benign mucosal diseases, and from healthy participants. Comprehensive testing was conducted to determine the best mel-frequency cepstral coefficient conversion and machine learning techniques, with results analyzed in depth. In our tests, distinguishing laryngeal diseases from healthy voices achieved an accuracy of 0.85-0.97. However, in multiclass classification, accuracy ranged from 0.75 to 0.83. These findings highlight the challenges of artificial intelligence-driven voice-based diagnosis due to overlaps with benign conditions but also underscore its potential.


Subject(s)
Artificial Intelligence , Laryngeal Diseases , Stroboscopy , Vocal Cords , Voice Quality , Adult , Aged , Humans , Male , Middle Aged , Case-Control Studies , Health , Laryngeal Diseases/classification , Laryngeal Diseases/diagnosis , Laryngeal Diseases/physiopathology , Laryngeal Neoplasms/diagnosis , Neural Networks, Computer , Squamous Cell Carcinoma of Head and Neck , Support Vector Machine , Vocal Cord Paralysis/diagnosis , Vocal Cords/pathology , Vocal Cords/physiopathology , Voice Disorders/classification , Voice Disorders/diagnosis , Voice Disorders/physiopathology
2.
JAMA Otolaryngol Head Neck Surg ; 148(2): 139-144, 2022 02 01.
Article in English | MEDLINE | ID: mdl-34854914

ABSTRACT

Importance: Prevalent schemes that have been used for arranging voice pathologies have shaped theoretical and clinical views and the conceptualization of the pathologies and of the field as a whole. However, these available schemes contain inconsistencies and categorical overlaps. Objective: To develop and evaluate a new approach for arranging voice pathologies, using 2 continuous scales, organicity and tonicity, which were used to construct a 2-dimensional plane. Design, Setting, and Participants: This survey study was conducted among experts in the fields of laryngology and/or voice disorders from 10 countries. The survey was conducted using an online platform from March to May 2021. The data were analyzed in June 2021. Of the 45 experts who were initially approached, 39 (86.7%) completed the survey. Main Outcomes and Measures: The primary outcome measures were group ratings on 2 rating scales: organicity and tonicity. On the organicity scale, 0 represented nonorganic and 10 organic. On the tonicity scale, 0 represented hypotonic and 10 hypertonic. Results: Participants included 16 laryngologists and 23 speech-language pathologists, of whom 27 (69.2%) were women and 12 (30.8%) men with a mean age of 55 years. The Cronbach α was high for organicity and tonicity (0.98 and 0.97, respectively). Interrater agreement (rwg) was moderate to very strong (rwg≥0.50) for most pathologies. The correlation between the 2 scales was moderate and negative (r = -0.38; P = .03). The pathologies were scattered across the full range of both scales and the 4 quadrants of the 2-dimensional plane, suggesting the continuity and bidimensionality of the new arrangement scheme. In addition, a latent profile analysis suggested that the 4-cluster solution is valid and roughly corresponded to the 4 quadrants of the constructed plane. 
Conclusions and Relevance: The findings of this survey study suggest the potential use of a 2-dimensional plane that was based on 2 continuous scales as a new arrangement scheme for voice disorders. The results suggest that this approach provides a valid representation of the field based on 2 basic measures beyond the specific etiology of each laryngeal pathology or condition. This simple and comprehensive organization scheme has the potential to facilitate new insights on the nature of voice pathologies, considering the interpathology similarities and differences.
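The Cronbach α reported above for the organicity and tonicity scales can be illustrated with a minimal sketch. It assumes the standard formula, α = k/(k−1) · (1 − Σ item variances / variance of total scores), and assumes a layout with one row per rated case and one column per item (for example, one per rater). The function name and the toy numbers are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of scores.

    scores: list of rows, one row per rated case, each row holding
    k item scores (assumed layout; not specified in the abstract).
    """
    if len(scores) < 2:
        raise ValueError("need at least two rows")
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two items per row")
    # Sample variance of each item (column) across cases.
    item_vars = [variance(col) for col in zip(*scores)]
    # Sample variance of the per-case total score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings: 3 cases scored on 2 items (0-10 scale).
toy = [[2, 4], [4, 2], [6, 6]]
print(round(cronbach_alpha(toy), 4))
```

Values close to 1, such as the 0.98 and 0.97 reported in the abstract, indicate that the items vary together almost perfectly across cases.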


Subject(s)
Voice Disorders/classification , Voice Disorders/physiopathology , Voice Quality , Female , Humans , Male , Middle Aged , Surveys and Questionnaires
3.
Audiol., Commun. res ; 27: e2602, 2022. graf
Article in Portuguese | LILACS | ID: biblio-1374481

ABSTRACT

Purpose: To identify the terms used by the general population to describe healthy, rough, and breathy vocal quality. Methods: An in-person test was carried out with 50 participants who had no academic or professional ties to speech-language pathology. The task was to listen to three voices and describe them freely. The first voice presented was predominantly breathy; the second, predominantly rough; and the third, vocally healthy. Each voice was presented as a sustained emission of the vowel /Ɛ/ and a count from 1 to 10. Each participant was asked to respond to the prompt "Listen to this voice. What term would you use to name it?" by typing the answer on a line displayed on the PowerPoint screen. Results: For the healthy voice, the most frequent term was "normal" (36%); other terms were "clean", "common", "standard", "clear", "limpid", "firm", "good", "open sound", and "defined". For the rough voice, 25 participants (50%) responded with the term "hoarse", and the others were divided among terms such as "noisy", "wheezy", "smoker's voice", "deep", "elderly", "cavernous", and "abnormal", among other similar terms. For the breathy voice, 24 participants (48%) used the term "tired"; five assigned the adjective "weak"; three responded with the term "out of breath"; there were two responses matching the terms "dragged" and "sick"; and the other participants responded with similar terms: "exhausted", "lazy", "sleepy", "fatigued", and the like. Conclusion: The terms "normal" for a healthy voice, "hoarse" for a rough voice, and "tired" for a breathy voice capture the most common perception of these clinical parameters of vocal quality among individuals unfamiliar with the technical-scientific language of speech-language pathology.


Subject(s)
Humans , Male , Female , Adult , Auditory Perception , Voice Quality , Voice Disorders/classification , Dysphonia , Hoarseness
4.
Aerosp Med Hum Perform ; 91(6): 471-478, 2020 Jun 01.
Article in English | MEDLINE | ID: mdl-32408930

ABSTRACT

BACKGROUND: Although the understanding of hypobaric hypoxia is increasing, it remains a hazard in aviation medicine. This study examined the feasibility of detecting voice markers sensitive to acute hypobaric hypoxia in an early presymptomatic (PRE-SYMP) stage. METHOD: Eight subjects qualified with hypobaric training completed a series of standardized speech tests in a hypobaric chamber at 20,000 ft and 25,000 ft (6096 and 7620 m) of altitude. Voice response patterns were analyzed in terms of fundamental frequency (F0), F0 range, and voice onset time (VOT). We hypothesized a PRE-SYMP compensatory stage in voice reactivity. RESULTS: There was a different dose-response reactivity course at 20,000 ft vs. 25,000 ft, nonlinear to altitude. At 20,000 ft, our hypothesis was confirmed. In comparison to sea level, a PRE-SYMP compensatory stage could be distinguished, characterized by a decreased F0 range, decreased VOT, and increased F0. During a transitional (TRANS) stage, in comparison with sea level, the F0 range reset, VOT decreased, and F0 increased. During a symptomatic (SYMP) stage, F0 increased, F0 range increased, and VOT decreased. At 25,000 ft, in comparison to sea level, voice reactivity showed increased F0 and F0 range and decreased VOT in a PRE-SYMP stage and increased F0 and F0 range in the SYMP stage. DISCUSSION: The compensatory PRE-SYMP stage is suggested to be the expression of ongoing bottom-up and top-down regulatory mechanisms, whereas the 25,000-ft results are interpreted as a combination of tonic and phasic voice reactivity. This tonic component needs to be foreseen in sea level baseline measures. Van Puyvelde M, Neyt X, Vanderlinden W, Van den Bossche M, Bucovaz T, De Winne T, Pattyn N. Voice reactivity as a response to acute hypobaric hypoxia at high altitude. Aerosp Med Hum Perform. 2020; 91(6):471-478.


Subject(s)
Hypoxia , Voice Disorders , Voice/physiology , Adult , Aerospace Medicine , Altitude , Humans , Hypoxia/classification , Hypoxia/diagnosis , Hypoxia/physiopathology , Male , Speech Production Measurement , Voice Disorders/classification , Voice Disorders/diagnosis , Voice Disorders/physiopathology
5.
Codas ; 32(2): e20180141, 2020.
Article in Portuguese, English | MEDLINE | ID: mdl-32049096

ABSTRACT

PURPOSE: To describe the self-reported personal behavior profiles of university professors and to verify the association of these profiles with self-assessment of communicative aspects and vocal symptoms. METHODS: The study was conducted with 334 professors at a public university who responded to an online questionnaire on voice use in teaching. The response variable was the personal behavior profile, classified into four types: pragmatic, analytical, expressive, and affable. The explanatory variables were vocal self-perception, vocal resources, and communicative aspects. Descriptive data analysis was performed, along with Pearson's chi-squared and Fisher's exact tests. RESULTS: University professors identified most with the affable and expressive personal behavior profiles. Overall, professors showed good self-perception of vocal and communicative aspects and reported few vocal symptoms. Profiles differed on some of the variables studied: pragmatic professors reported a fast speech rate and only occasional eye contact; expressive professors showed positive self-perception of their voice and strong voice intensity; professors with the analytical profile self-reported a negative perception of vocal quality, weak voice intensity, poor articulation, and a fast speech rate and, compared with the other profiles, most often reported symptoms of vocal fatigue and difficulty projecting the voice. CONCLUSION: University professors identify predominantly with the affable and expressive profiles. Analysis of self-perceived personal behavior profiles in university professors shows the influence of self-reported personality characteristics on communicative skills in the classroom.


Subject(s)
Self Concept , Self-Assessment , Speech Production Measurement/psychology , Voice Quality/physiology , Cross-Sectional Studies , Faculty , Female , Humans , Male , Middle Aged , Speech Acoustics , Surveys and Questionnaires , Verbal Behavior/physiology , Voice Disorders/classification , Voice Disorders/diagnosis , Voice Disorders/psychology
6.
Article in English | MEDLINE | ID: mdl-32015933

ABSTRACT

Background: The consensus statement by the Task Force on Tremor of the International Parkinson and Movement Disorder Society excludes individuals with "isolated voice tremor" as a clinical variant of essential tremor (ET). This clinical viewpoint presents a rationale for reconsideration of "isolated voice tremor" as a clinical variant of ET. Methods: Evidence from the literature was extracted to characterize the clinical phenotype of "isolated voice tremor," or essential vocal tremor (EVT). Clinical features were extracted from relevant literature available at pubmed.gov using the terms "EVT," "essential voice tremor," "primary voice tremor," and "organic voice tremor." Results: The average age of onset in those with EVT was older than 60 years (range 19-84 years), with 75-93% being female. The typical duration of vocal tremor ranged from 1 to 13 years (average 6 years). The distribution of structures exhibiting tremor included the larynx, soft palate, pharynx, and base of tongue in the majority of patients, with some exhibiting tremor of the head and respiratory musculature. The condition of tremor occurred during speech and quiet respiration in 74% of individuals. Rate of tremor ranged from 4 to 10 Hz. Nearly 70% reported onset of vocal tremor prior to upper limb involvement. Family history of tremor was reported in 38-42% of individuals. Discussion: Those previously classified with EVT demonstrate a similar familial history, rate, tremor classification, and body distribution of ET. EVT is proposed as a clinical variant of ET in the pattern of onset and progression of body distribution from the midline cranial to spinal neural pathways.


Subject(s)
Essential Tremor/physiopathology , Voice Disorders/physiopathology , Age Distribution , Essential Tremor/classification , Essential Tremor/epidemiology , Humans , Sex Distribution , Tremor/classification , Tremor/epidemiology , Tremor/physiopathology , Voice Disorders/classification , Voice Disorders/epidemiology
7.
CoDAS ; 32(2): e20180141, 2020. tab
Article in Portuguese | LILACS | ID: biblio-1055901

ABSTRACT

Purpose: To describe the self-reported personal behavior profiles of university professors and to verify the association of these profiles with self-assessment of communicative aspects and vocal symptoms. Methods: The study was conducted with 334 professors at a public university who responded to an online questionnaire on voice use in teaching. The response variable was the personal behavior profile, classified into four types: pragmatic, analytical, expressive, and affable. The explanatory variables were vocal self-perception, vocal resources, and communicative aspects. Descriptive data analysis was performed, along with Pearson's chi-squared and Fisher's exact tests. Results: University professors identified most with the affable and expressive personal behavior profiles. Overall, professors showed good self-perception of vocal and communicative aspects and reported few vocal symptoms. Profiles differed on some of the variables studied: pragmatic professors reported a fast speech rate and only occasional eye contact; expressive professors showed positive self-perception of their voice and strong voice intensity; professors with the analytical profile self-reported a negative perception of vocal quality, weak voice intensity, poor articulation, and a fast speech rate and, compared with the other profiles, most often reported symptoms of vocal fatigue and difficulty projecting the voice. Conclusion: University professors identify predominantly with the affable and expressive profiles. Analysis of self-perceived personal behavior profiles in university professors shows the influence of self-reported personality characteristics on communicative skills in the classroom.


Subject(s)
Humans , Male , Female , Self-Assessment , Self Concept , Speech Production Measurement/psychology , Speech Acoustics , Verbal Behavior/physiology , Voice Quality/physiology , Voice Disorders/classification , Voice Disorders/diagnosis , Voice Disorders/psychology , Cross-Sectional Studies , Surveys and Questionnaires , Faculty , Middle Aged
8.
Ann Otol Rhinol Laryngol ; 128(10): 921-931, 2019 Oct.
Article in English | MEDLINE | ID: mdl-31084359

ABSTRACT

PURPOSE: Signal typing has been used to categorize healthy and disordered voices; however, human voices are likely comprised of differing proportions of periodic type 1 elements, type 2 elements that are periodic with modulations, aperiodic type 3 elements, and stochastic type 4 elements. A novel diffusive chaos method is presented to detect the distribution of voice types within a signal with the goal of providing an objective and clinically useful tool for evaluating the voice. It was predicted that continuous calculation of the diffusive chaos parameter throughout the voice sample would allow for construction of comprehensive voice type component profiles (VTCP). METHODS: One hundred thirty-five voice samples of sustained /a/ vowels were randomly selected from the Disordered Voice Database Model 4337. All samples were classified according to the voice type paradigm using spectrogram analysis, yielding 34 type 1, 35 type 2, 42 type 3, and 24 type 4 voice samples. All samples were then analyzed using the diffusive chaos method, and VTCPs were generated to show the distribution of the 4 voice type components (VTC). RESULTS: The proportions of VTC1 varied significantly between the majority of the traditional voice types (P < .001). Three of the 4 VTCs of type 3 voices were significantly different from the VTCs of type 4 voices (P < .001). These results were compared to calculations of spectrum convergence ratio, which did not vary significantly between voice types 1 and 2 or 2 and 3. CONCLUSION: The diffusive chaos method demonstrates proficiency in generating comprehensive VTCPs for disordered voices with varying severity. In contrast to acoustic parameters that provide a single measure of disorder, VTCPs can be used to detect subtler changes by observing variations in each VTC over time. This method also provides the advantage of quantifying stochastic noise components that are due to breathiness in the voice.


Subject(s)
Phonation , Speech Production Measurement/methods , Voice Disorders/diagnosis , Voice Quality , Adult , Aged , Aged, 80 and over , Female , Humans , Male , Middle Aged , Sound Spectrography , Voice Disorders/classification , Young Adult
9.
Neurodegener Dis ; 19(5-6): 163-170, 2019.
Article in English | MEDLINE | ID: mdl-32126556

ABSTRACT

BACKGROUND: Amyotrophic lateral sclerosis (ALS) is a fatal progressive motor neuron disease. People with ALS demonstrate various speech problems. SUMMARY: We aim to provide an overview of studies concerning the diagnosis of ALS based on the analysis of voice samples. The main focus is on the feasibility of the use of voice and speech assessment as an effective method to diagnose the disease, either in clinical or pre-clinical conditions, and to monitor the disease progression. Specifically, we aim to examine current knowledge on: (a) voice parameters and the data models that can, most effectively, provide robust results; (b) the feasibility of a semi-automatic or automatic diagnosis and outcomes; and (c) the factors that can improve or restrict the use of such systems in a real-world context. Key Messages: The studies already carried out on the possibility of diagnosis of ALS using the voice signal are still sparse but all point to the importance, feasibility and simplicity of this approach. Most cohorts are small which limits the statistical relevance and makes it difficult to infer broader conclusions. The set of features used, although diverse, is quite circumscribed. ALS is difficult to diagnose early because it may mimic several other neurological diseases. Promising results were found for the automatic detection of ALS from speech samples and this can be a feasible process even in pre-symptomatic stages. Improved guidelines must be set in order to establish a robust decision model.


Subject(s)
Amyotrophic Lateral Sclerosis/classification , Diagnosis, Computer-Assisted , Voice , Amyotrophic Lateral Sclerosis/complications , Amyotrophic Lateral Sclerosis/diagnosis , Diagnosis, Computer-Assisted/methods , Humans , Pattern Recognition, Automated , Speech Recognition Software , Voice Disorders/classification , Voice Disorders/diagnosis , Voice Disorders/etiology
10.
Folia Phoniatr Logop ; 70(3-4): 174-182, 2018.
Article in English | MEDLINE | ID: mdl-30184538

ABSTRACT

BACKGROUND: Studies have used questionnaires of dysphonic symptoms to screen voice disorders. This study investigated whether the differential presentation of demographic and symptomatic features can be applied to computerized classification. METHODS: We recruited 100 patients with glottic neoplasm, 508 with phonotraumatic lesions, and 153 with unilateral vocal palsy. Statistical analyses revealed significantly different distributions of demographic and symptomatic variables. Machine learning algorithms, including decision tree, linear discriminant analysis, K-nearest neighbors, support vector machine, and artificial neural network, were applied to classify voice disorders. RESULTS: The results showed that demographic features were more effective for detecting neoplastic and phonotraumatic lesions, whereas symptoms were useful for detecting vocal palsy. When combining demographic and symptomatic variables, the artificial neural network achieved the highest accuracy of 83 ± 1.58%, whereas the accuracy achieved by other algorithms ranged from 74 to 82.6%. Decision tree analyses revealed that sex, age, smoking status, sudden onset of dysphonia, and 10-item voice handicap index scores were significant characteristics for classification. CONCLUSION: This study demonstrated a significant difference in demographic and symptomatic features between glottic neoplasm, phonotraumatic lesions, and vocal palsy. These features may facilitate automatic classification of voice disorders through machine learning algorithms.


Subject(s)
Neural Networks, Computer , Supervised Machine Learning , Voice Disorders/classification , Adult , Age Factors , Aged , Alcohol Drinking/epidemiology , Algorithms , Demography , Female , Glottis/injuries , Glottis/physiopathology , Humans , Laryngeal Neoplasms/complications , Laryngeal Neoplasms/diagnosis , Laryngeal Neoplasms/physiopathology , Male , Middle Aged , Retrospective Studies , Severity of Illness Index , Sex Factors , Smoking/epidemiology , Symptom Assessment , Vocal Cord Paralysis/complications , Vocal Cord Paralysis/diagnosis , Vocal Cord Paralysis/physiopathology , Voice Disorders/epidemiology , Voice Quality , Wounds and Injuries/diagnosis
11.
JAMA Otolaryngol Head Neck Surg ; 144(8): 657-665, 2018 08 01.
Article in English | MEDLINE | ID: mdl-29931028

ABSTRACT

Importance: A roadblock for research on adductor spasmodic dysphonia (ADSD), abductor SD (ABSD), voice tremor (VT), and muscular tension dysphonia (MTD) is the lack of criteria for selecting patients with these disorders. Objective: To determine the agreement among experts not using standard guidelines to classify patients with ABSD, ADSD, VT, and MTD, and develop expert consensus attributes for classifying patients for research. Design, Setting, and Participants: From 2011 to 2016, a multicenter observational study examined agreement among blinded experts when classifying patients with ADSD, ABSD, VT, or MTD (first study). Subsequently, a 4-stage Delphi method study used reiterative stages of review by an expert panel and 46 community experts to develop consensus on attributes to be used for classifying patients with the 4 disorders (second study). The study used a convenience sample of 178 patients clinically diagnosed with ADSD, ABSD, VT, MTD, vocal fold paresis/paralysis, psychogenic voice disorders, or hypophonia secondary to Parkinson disease. Participants were aged 18 years or older, without laryngeal structural disease or surgery for ADSD, and underwent speech and nasolaryngoscopy video recordings following a standard protocol. Exposures: Speech and nasolaryngoscopy video recordings following a standard protocol. Main Outcomes and Measures: Specialists at 4 sites classified 178 patients into 11 categories. Four international experts independently classified 75 patients using the same categories without guidelines after viewing speech and nasolaryngoscopy video recordings. Each member from the 4 sites also classified 50 patients from other sites after viewing video clips of voice/laryngeal tasks. Interrater κ less than 0.40 indicated poor classification agreement among rater pairs and across recruiting sites.
Consequently, a Delphi panel of 13 experts identified and ranked speech and laryngeal movement attributes for classifying ADSD, ABSD, VT, and MTD, which were reviewed by 46 community specialists. Based on the median attribute rankings, a final attribute list was created for each disorder. Results: When classifying patients without guidelines, raters differed in their classification distributions (likelihood ratio, χ2 = 107.66), had poor interrater agreement, and poor agreement with site categories. For 11 categories, the highest agreement was 34%, with no κ values greater than 0.26. In external rater pairs, the highest κ was 0.23 and the highest agreement was 38.5%. Using 6 categories, the highest percent agreement was 73.3% and the highest κ was 0.40. The Delphi method yielded 18 attributes for classifying disorders from speech and nasolaryngoscopic examinations. Conclusions and Relevance: Specialists without guidelines had poor agreement when classifying patients for research, leading to a Delphi-based development of the Spasmodic Dysphonia Attributes Inventory for classifying patients with ADSD, ABSD, VT, and MTD for research.
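The pairwise agreement figures in this abstract (percent agreement and κ) can be illustrated with a short sketch. It assumes Cohen's κ for two raters, κ = (p_o − p_e) / (1 − p_e), which is a common choice for rater pairs; the abstract does not state which κ variant the authors used. The function name and the labels below are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: share of cases given the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical classifications of four patients by two raters.
a = ["ADSD", "ADSD", "MTD", "MTD"]
b = ["ADSD", "MTD", "MTD", "MTD"]
print(cohen_kappa(a, b))
```

In this toy case the raters agree on 75% of patients, yet κ is only 0.5 because half of that agreement is expected by chance. The same effect explains why the abstract can report 73.3% agreement alongside a κ of 0.40.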


Subject(s)
Voice Disorders/diagnosis , Adolescent , Adult , Aged , Aged, 80 and over , Delphi Technique , Diagnosis, Differential , Dysphonia/diagnosis , Humans , Laryngoscopy , Middle Aged , Observer Variation , Video Recording , Voice Disorders/classification , Voice Disorders/etiology , Young Adult
12.
J Speech Lang Hear Res ; 61(5): 1130-1139, 2018 05 17.
Article in English | MEDLINE | ID: mdl-29800353

ABSTRACT

Purpose: The purpose of this study is to introduce a chaos level test to evaluate linear and nonlinear voice type classification method performances under varying signal chaos conditions without subjective impression. Study Design: Voice signals were constructed with differing degrees of noise to model signal chaos. Within each noise power, 100 Monte Carlo experiments were applied to analyze the output of jitter, shimmer, correlation dimension, and spectrum convergence ratio. The computational output of the 4 classifiers was then plotted against signal chaos level to investigate the performance of these acoustic analysis methods under varying degrees of signal chaos. Method: A diffusive behavior detection-based chaos level test was used to investigate the performances of different voice classification methods. Voice signals were constructed by varying the signal-to-noise ratio to establish differing signal chaos conditions. Results: Chaos level increased sigmoidally with increasing noise power. Jitter and shimmer performed optimally when the chaos level was less than or equal to 0.01, whereas correlation dimension was capable of analyzing signals with chaos levels of less than or equal to 0.0179. Spectrum convergence ratio demonstrated proficiency in analyzing voice signals with all chaos levels investigated in this study. Conclusion: The results of this study corroborate the performance relationships observed in previous studies and, therefore, demonstrate the validity of the validation test method. The presented chaos level validation test could be broadly utilized to evaluate acoustic analysis methods and establish the most appropriate methodology for objective voice analysis in clinical practice.
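For readers unfamiliar with the perturbation measures named above, the following is a minimal sketch of jitter and shimmer. It assumes the common "local" definitions: the mean absolute difference between consecutive cycle periods (jitter) or consecutive peak amplitudes (shimmer), divided by the mean period or amplitude. The abstract does not specify which variants were used, and the function names and sample values are hypothetical.

```python
def relative_perturbation(values):
    """Mean absolute cycle-to-cycle difference divided by the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_value = sum(values) / len(values)
    if mean_value == 0:
        raise ValueError("mean value is zero")
    return (sum(diffs) / len(diffs)) / mean_value


def jitter_local(periods):
    """Local jitter from a sequence of glottal cycle periods."""
    return relative_perturbation(periods)


def shimmer_local(amplitudes):
    """Local shimmer from a sequence of per-cycle peak amplitudes."""
    return relative_perturbation(amplitudes)


# Hypothetical cycle periods (ms) and peak amplitudes (arbitrary units).
periods_ms = [10.0, 11.0, 10.0, 11.0]
amplitudes = [0.50, 0.50, 0.50, 0.50]
print(jitter_local(periods_ms), shimmer_local(amplitudes))
```

Both measures depend on locating individual cycles, which becomes unreliable as a signal grows noisier. That is consistent with the abstract's finding that jitter and shimmer performed well only at low chaos levels.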


Subject(s)
Diagnosis, Computer-Assisted/methods , Pattern Recognition, Automated/methods , Voice Disorders/classification , Voice , Computer Simulation , Humans , Monte Carlo Method , Nonlinear Dynamics , Signal Processing, Computer-Assisted , Voice Disorders/diagnosis
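
Illustrative note for record 12: the sketch below shows, in a simplified way, how perturbation measures such as jitter and shimmer respond to added noise across repeated Monte Carlo trials. The abstract does not state which jitter or shimmer variant was used, so the "local" definition (mean absolute cycle-to-cycle difference divided by the mean), the Gaussian noise model, and all function names here are assumptions, not the authors' method.

```python
import random
from statistics import mean


def local_perturbation(values):
    """Mean absolute difference between consecutive values, divided by the mean value.

    Applied to cycle periods this is a common 'local jitter' definition;
    applied to cycle peak amplitudes it is 'local shimmer'.
    """
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return mean(diffs) / mean(values)


def noisy_cycles(n_cycles, base, noise_sd, rng):
    """Hypothetical signal model: a constant per-cycle value plus Gaussian noise."""
    return [base + rng.gauss(0.0, noise_sd) for _ in range(n_cycles)]


def monte_carlo_perturbation(base, noise_sd, trials=100, n_cycles=200, seed=0):
    """Average the perturbation measure over repeated random trials."""
    rng = random.Random(seed)
    return mean(
        local_perturbation(noisy_cycles(n_cycles, base, noise_sd, rng))
        for _ in range(trials)
    )


if __name__ == "__main__":
    # Periods around 5 ms (a 200 Hz voice), with increasing noise power.
    for sd in (0.0, 1e-5, 5e-5, 1e-4):
        print(sd, monte_carlo_perturbation(base=0.005, noise_sd=sd))
```

In this toy model the measure grows with noise power; it makes no attempt to reproduce the chaos level test or the thresholds reported in the abstract.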
13.
Sultan Qaboos Univ Med J ; 18(3): e350-e354, 2018 Aug.
Article in English | MEDLINE | ID: mdl-30607277

ABSTRACT

OBJECTIVES: This study aimed to assess potential associations between self-reported symptoms of laryngopharyngeal reflux (LPR) and voice disorders among two undiagnosed cohorts in Saudi Arabia. METHODS: This cross-sectional study was conducted from February to April 2017 in Khobar, Saudi Arabia. Validated Arabic versions of the Reflux Symptom Index (RSI) and 10-item Voice Handicap Index (VHI-10) were distributed to 400 teachers at 13 schools and 300 members of the general population attending an ear, nose and throat clinic in Khobar. Scores of >13 and >11 on the RSI and VHI-10 indicated a potential subjective diagnosis of LPR and voice disorders, respectively. RESULTS: A total of 446 individuals took part in the study, including 260 members of the general population (response rate: 86.7%) and 186 teachers (response rate: 46.5%). The mean age was 32.5 years. In total, 62.2% complained of voice and/or reflux problems, with the remaining 37.8% not reporting/unaware of any problems in this regard. Among the teachers, 30.6% and 18.3% had positive RSI and VHI-10 scores, respectively, while 43.1% and 14.6% of the individuals from the general population had positive RSI and VHI-10 scores, respectively. Overall, VHI-10 scores were significantly associated with RSI scores (P <0.001). CONCLUSION: A significant association between RSI and VHI-10 scores suggests that there may be an association between LPR and voice disorders. These tools would therefore be a valuable method of monitoring patients; however, they cannot be used to confirm a diagnosis. Thus, more detailed studies are needed to confirm this association using a larger sample size.


Subject(s)
Faculty/statistics & numerical data , Laryngopharyngeal Reflux/classification , Voice Disorders/classification , Adult , Cohort Studies , Cross-Sectional Studies , Female , Humans , Male , Middle Aged , Risk Factors , Saudi Arabia , Self Report , Severity of Illness Index , Surveys and Questionnaires
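
Illustrative note for record 13: the screening rule and response rates reported in the abstract can be checked with a few lines of arithmetic. The cutoffs (RSI > 13, VHI-10 > 11) and the participant counts come from the abstract; the function names and the dictionary layout are hypothetical.

```python
RSI_CUTOFF = 13    # scores above this suggest possible LPR (per the abstract)
VHI10_CUTOFF = 11  # scores above this suggest a possible voice disorder (per the abstract)


def screen(rsi_score, vhi10_score):
    """Apply the two questionnaire cutoffs; a positive result is not a diagnosis."""
    return {
        "lpr_positive": rsi_score > RSI_CUTOFF,
        "voice_positive": vhi10_score > VHI10_CUTOFF,
    }


def response_rate(responders, invited):
    """Response rate as a percentage, rounded to one decimal place."""
    return round(100.0 * responders / invited, 1)


if __name__ == "__main__":
    print(response_rate(260, 300))  # general population
    print(response_rate(186, 400))  # teachers
    print(screen(rsi_score=14, vhi10_score=11))
```

The counts are consistent with the abstract: 260 + 186 = 446 participants, with response rates of 86.7% and 46.5%.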
14.
Codas ; 29(4): e20160187, 2017 Aug 24.
Article in Portuguese, English | MEDLINE | ID: mdl-28902229

ABSTRACT

OBJECTIVE: To verify the correlation between vocal tract discomfort symptoms and perceived voice handicaps in gospel singers, analyzing possible differences according to gender. METHODS: 100 gospel singers volunteered, 50 male and 50 female. All participants answered two questionnaires: Vocal Tract Discomfort (VTD) scale and the Modern Singing Handicap Index (MSHI) that investigates the vocal handicap perceived by singers, linking the results of both instruments (p<0.05). RESULTS: Women presented more perceived handicaps and also more frequent and higher intensity vocal tract discomfort. Furthermore, the more frequent and intense the vocal tract symptoms, the higher the vocal handicap for singing. CONCLUSION: Female gospel singers present higher frequency and intensity of vocal tract discomfort symptoms, as well as higher voice handicap for singing than male gospel singers. The higher the frequency and intensity of the laryngeal symptoms, the higher the vocal handicap will be.


Subject(s)
Singing , Voice Disorders/diagnosis , Voice Quality , Adolescent , Adult , Aged , Brazil , Cross-Sectional Studies , Diagnostic Self Evaluation , Female , Humans , Male , Middle Aged , Occupational Diseases/diagnosis , Quality of Life , Religion , Sex Factors , Surveys and Questionnaires , Voice Disorders/classification , Young Adult
15.
J Voice ; 31(4): 515.e1-515.e8, 2017 Jul.
Article in English | MEDLINE | ID: mdl-28262502

ABSTRACT

OBJECTIVES: Development of a noninvasive method for separating different vocal fold diseases is an important issue in vocal analysis. Because a pathologic vocal signal varies over time, dynamic pattern modeling tools are expected to help detect defects in the speech production mechanism. MATERIALS AND METHODS: In the present study, the hidden Markov model, which is a state space model, is employed to classify some of the vocal diseases. This research mainly investigates the effect of the processed vocal signal length on that classification task. To this end, signals of 1, 3, and 5 seconds from different disorders are used. RESULTS: The experimental results show that some pathologic conditions in the vocal folds, such as cyst, false vocal cord, and mass, are more evident in continued voice production, and that recognition accuracy improves considerably when dynamic modeling is applied to longer pathologic voice signals.


Subject(s)
Phonation , Voice Disorders/diagnosis , Adult , Aged , Humans , Male , Markov Chains , Middle Aged , Models, Theoretical , Voice Disorders/classification , Voice Disorders/physiopathology
16.
J Voice ; 31(1): 16-23, 2017 Jan.
Article in English | MEDLINE | ID: mdl-26920858

ABSTRACT

OBJECTIVE/HYPOTHESIS: The purpose of this paper is to introduce the rate of divergence as an objective measure to differentiate between the four voice types based on the amount of disorder present in a signal. We hypothesized that rate of divergence would provide an objective measure that can quantify all four voice types. STUDY DESIGN: A total of 150 acoustic voice recordings were randomly selected and analyzed using traditional perturbation, nonlinear, and rate of divergence analysis methods. METHODS: We developed a new parameter, rate of divergence, which uses a modified version of Wolf's algorithm for calculating Lyapunov exponents of a system. The outcome of this calculation is not a Lyapunov exponent, but rather a description of the divergence of two nearby data points for the next three points in the time series, followed in three time-delayed embedding dimensions. This measure was compared to currently existing perturbation and nonlinear dynamic methods of distinguishing between voice signals. RESULTS: There was a direct relationship between voice type and rate of divergence. This calculation is especially effective at differentiating between type 3 and type 4 voices (P < 0.001) and is equally effective at differentiating type 1, type 2, and type 3 signals as currently existing methods. CONCLUSION: The rate of divergence calculation introduced is an objective measure that can be used to distinguish between all four voice types based on the amount of disorder present, leading to quicker and more accurate voice typing as well as an improved understanding of the nonlinear dynamics involved in phonation.


Subject(s)
Acoustics , Phonation , Signal Processing, Computer-Assisted , Speech Production Measurement/methods , Speech-Language Pathology/methods , Voice Disorders/diagnosis , Voice Quality , Adolescent , Adult , Aged , Aged, 80 and over , Algorithms , Child , Female , Humans , Male , Middle Aged , Nonlinear Dynamics , Pattern Recognition, Automated , Predictive Value of Tests , Sound Spectrography , Voice Disorders/classification , Voice Disorders/physiopathology , Young Adult
17.
J Voice ; 31(1): 125.e7-125.e16, 2017 Jan.
Article in English | MEDLINE | ID: mdl-26922093

ABSTRACT

OBJECTIVE: The purposes of this literature review were (1) to identify and assess frameworks for clinical characterization of episodic laryngeal breathing disorders (ELBD) and their subtypes, (2) to integrate concepts from these frameworks into a novel theoretical paradigm, and (3) to provide a preliminary algorithm to classify clinical features of ELBD for future study of its clinical manifestations and underlying pathophysiological mechanisms. STUDY DESIGN: This is a literature review. METHODS: Peer-reviewed literature from 1983 to 2015 pertaining to models for ELBD was searched using PubMed, Ovid, ProQuest, Cochrane Database of Systematic Reviews, and Google Scholar. Theoretical models for ELBD were identified, evaluated, and integrated into a novel comprehensive framework. Consensus across three salient models provided a working definition and inclusionary criteria for ELBD within the new framework. Inconsistencies and discrepancies within the models provided an analytic platform for future research. RESULTS: Comparison among three conceptual models, namely (1) Irritable larynx syndrome, (2) Dichotomous triggers, and (3) Periodic occurrence of laryngeal obstruction, showed that the models uniformly consider ELBD to involve episodic laryngeal obstruction causing dyspnea. The models differed in their description of source of dyspnea, in their inclusion of corollary behaviors, in their inclusion of other laryngeal-based behaviors (eg, cough), and types of triggers. CONCLUSION: The proposed integrated theoretical framework for ELBD provides a preliminary systematic platform for the identification of key clinical feature patterns indicative of ELBD and associated clinical subgroups. This algorithmic paradigm should evolve with better understanding of this spectrum of disorders and its underlying pathophysiological mechanisms.


Subject(s)
Laryngeal Diseases/diagnosis , Larynx/physiopathology , Models, Theoretical , Respiration Disorders/diagnosis , Respiration , Terminology as Topic , Voice Disorders/diagnosis , Algorithms , Consensus , Diagnostic Errors , Humans , Laryngeal Diseases/classification , Laryngeal Diseases/etiology , Laryngeal Diseases/physiopathology , Laryngostenosis/diagnosis , Laryngostenosis/physiopathology , Predictive Value of Tests , Respiration Disorders/classification , Respiration Disorders/etiology , Respiration Disorders/physiopathology , Risk Factors , Vocal Cords/physiopathology , Voice , Voice Disorders/classification , Voice Disorders/etiology , Voice Disorders/physiopathology
18.
J Voice ; 31(1): 3-15, 2017 Jan.
Article in English | MEDLINE | ID: mdl-26992554

ABSTRACT

OBJECTIVES AND BACKGROUND: Automatic voice pathology detection and classification systems effectively contribute to the assessment of voice disorders, which helps clinicians to detect the existence of any voice pathologies and the type of pathology from which patients suffer in the early stages. This work concentrates on developing an accurate and robust feature extraction method for detecting and classifying voice pathologies by investigating different frequency bands using correlation functions. In this paper, we extracted maximum peak values and their corresponding lag values from each frame of a voiced signal by using correlation functions as features to detect and classify pathological samples. These features are investigated in different frequency bands to see the contribution of each band to the detection and classification processes. MATERIAL AND METHODS: Various samples of sustained vowel /a/ of normal and pathological voices were extracted from three different databases: English, German, and Arabic. A support vector machine was used as a classifier. We also performed a t test to investigate the significant differences in the means of normal and pathological samples. RESULTS: The best accuracies achieved in both detection and classification varied depending on the band, the correlation function, and the database. The most contributive bands in both detection and classification were between 1000 and 8000 Hz. In detection, the highest accuracies obtained with cross-correlation were 99.809%, 90.979%, and 91.168% in the Massachusetts Eye and Ear Infirmary database, the Saarbruecken Voice Database, and the Arabic Voice Pathology Database, respectively. In classification, the highest accuracies obtained with cross-correlation were 99.255%, 98.941%, and 95.188% in the three databases, respectively.


Subject(s)
Acoustics , Signal Processing, Computer-Assisted , Speech Acoustics , Speech Production Measurement/methods , Speech-Language Pathology/methods , Voice Disorders/diagnosis , Voice Quality , Databases, Factual , Humans , Pattern Recognition, Automated , Voice Disorders/classification , Voice Disorders/physiopathology
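
Illustrative note for record 18: the abstract describes taking the maximum peak of a correlation function, and the lag at which it occurs, from each frame. A minimal sketch of that idea follows. The abstract does not say which two signals are correlated, how frames are sized, or how the correlation is normalized, so the raw (unnormalized) cross-correlation, the non-overlapping framing, and the function names are all assumptions.

```python
def cross_correlation_peak(x, y, max_lag):
    """Return (peak_value, lag) of the raw cross-correlation of two equal-length frames.

    The value at a given lag is the sum of x[i] * y[i + lag] over valid indices,
    so a positive lag means y is a delayed copy of x.
    """
    n = len(x)
    best_value, best_lag = float("-inf"), 0
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                total += x[i] * y[j]
        if total > best_value:
            best_value, best_lag = total, lag
    return best_value, best_lag


def frame_features(signal_a, signal_b, frame_len, max_lag):
    """Peak value and lag for each non-overlapping frame of two aligned signals."""
    usable = min(len(signal_a), len(signal_b))
    features = []
    for start in range(0, usable - frame_len + 1, frame_len):
        frame_a = signal_a[start:start + frame_len]
        frame_b = signal_b[start:start + frame_len]
        features.append(cross_correlation_peak(frame_a, frame_b, max_lag))
    return features


if __name__ == "__main__":
    x = [0, 0, 1, 0, 0, 0]
    y = [0, 0, 0, 0, 1, 0]  # x delayed by two samples
    print(cross_correlation_peak(x, y, max_lag=3))
```

In a pipeline like the one described, such per-frame pairs would be computed per frequency band and passed to a classifier; that stage is not shown here.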
19.
J Voice ; 31(3): 384.e9-384.e14, 2017 May.
Article in English | MEDLINE | ID: mdl-27743845

ABSTRACT

OBJECTIVES: Speech signal processing techniques have provided several contributions to pathologic voice identification, in which healthy and unhealthy voice samples are evaluated. A less common approach is to identify laryngeal pathologies, for which the use of a noninvasive method for pathologic voice identification is an important step forward for preliminary diagnosis. In this study, a hierarchical classifier and a combination of systems are used to improve the accuracy of a three-class identification system (healthy, physiological larynx pathologies, and neuromuscular larynx pathologies). METHOD: Three main subject classes were considered: subjects with physiological larynx pathologies (vocal fold nodules and edemas: 59 samples), subjects with neuromuscular larynx pathologies (unilateral vocal fold paralysis: 59 samples), and healthy subjects (36 samples). The variables used in this study were a speech task (sustained vowel /a/ or continuous reading speech), features with or without perceptual information, and features with or without direct information about formants evaluated using single classifiers. A hierarchical classification system was designed based on this information. RESULTS: The resulting system combines an analysis of continuous speech by way of the commonly used sustained vowel /a/ to obtain spectral and perceptual speech features. It achieved an accuracy of 84.4%, which represents an improvement of approximately 9% compared with the stand-alone approach. For pathologic voice identification, the accuracy obtained was 98.7%, and the identification accuracy for the two pathology classes was 81.3%. CONCLUSIONS: Hierarchical classification and system combination create significant benefits and introduce a modular approach to the classification of larynx pathologies.


Subject(s)
Diagnosis, Computer-Assisted/methods , Edema/diagnosis , Signal Processing, Computer-Assisted , Speech Production Measurement/methods , Support Vector Machine , Vocal Cord Paralysis/diagnosis , Vocal Cords , Voice Disorders/diagnosis , Case-Control Studies , Databases, Factual , Edema/classification , Edema/pathology , Edema/physiopathology , Female , Humans , Male , Pattern Recognition, Automated , Predictive Value of Tests , Speech Acoustics , Vocal Cord Paralysis/classification , Vocal Cord Paralysis/pathology , Vocal Cord Paralysis/physiopathology , Vocal Cords/pathology , Vocal Cords/physiopathology , Voice Disorders/classification , Voice Disorders/pathology , Voice Disorders/physiopathology , Voice Quality
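
Illustrative note for record 19: one plausible reading of the hierarchical scheme in the abstract is a two-stage decision, first healthy versus pathological, then physiological versus neuromuscular. The sketch below wires two binary decision functions together in that order. The stage order, the function names, and the stand-in threshold rules in the usage example are assumptions; the abstract does not specify the classifiers' internals.

```python
from typing import Callable, Sequence

Features = Sequence[float]


def make_hierarchical_classifier(
    is_pathological: Callable[[Features], bool],
    is_neuromuscular: Callable[[Features], bool],
) -> Callable[[Features], str]:
    """Combine two binary decisions into a three-class label."""

    def classify(features: Features) -> str:
        # Stage 1: separate healthy voices from pathological ones.
        if not is_pathological(features):
            return "healthy"
        # Stage 2: only pathological samples reach the second classifier.
        return "neuromuscular" if is_neuromuscular(features) else "physiological"

    return classify


if __name__ == "__main__":
    # Stand-in rules on two hypothetical features; real stages would be trained models.
    classify = make_hierarchical_classifier(
        is_pathological=lambda f: f[0] > 0.5,
        is_neuromuscular=lambda f: f[1] > 0.5,
    )
    for sample in ([0.1, 0.9], [0.8, 0.2], [0.8, 0.9]):
        print(sample, classify(sample))
```

A design like this lets each stage use its own features and speech task, which matches the modular approach the abstract's conclusions describe.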
20.
J Voice ; 31(1): 113.e9-113.e18, 2017 Jan.
Article in English | MEDLINE | ID: mdl-27105857

ABSTRACT

BACKGROUND AND OBJECTIVE: Automatic voice-pathology detection and classification systems may help clinicians to detect the existence of any voice pathologies and the type of pathology from which patients suffer in the early stages. The main aim of this paper is to investigate Multidimensional Voice Program (MDVP) parameters to automatically detect and classify the voice pathologies in multiple databases, and then to find out which parameters performed well in these two processes. MATERIALS AND METHODS: Samples of the sustained vowel /a/ of normal and pathological voices were extracted from three different databases, which have three voice pathologies in common. The selected databases in this study represent three distinct languages: (1) the Arabic voice pathology database; (2) the Massachusetts Eye and Ear Infirmary database (English database); and (3) the Saarbruecken Voice Database (German database). A computerized speech lab program was used to extract MDVP parameters as features, and an acoustical analysis was performed. The Fisher discrimination ratio was applied to rank the parameters. A t test was performed to highlight any significant differences in the means of the normal and pathological samples. RESULTS: The experimental results demonstrate a clear difference in the performance of the MDVP parameters using these databases. The highly ranked parameters also differed from one database to another. The best accuracies were obtained by using the three highest ranked MDVP parameters arranged according to the Fisher discrimination ratio: these accuracies were 99.68%, 88.21%, and 72.53% for the Saarbruecken Voice Database, the Massachusetts Eye and Ear Infirmary database, and the Arabic voice pathology database, respectively.


Subject(s)
Acoustics , Speech Acoustics , Speech Production Measurement/methods , Voice Disorders/diagnosis , Voice Quality , Area Under Curve , Automation , Databases, Factual , Humans , Pattern Recognition, Automated , Predictive Value of Tests , ROC Curve , Reproducibility of Results , Sound Spectrography , Voice Disorders/classification , Voice Disorders/physiopathology
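
Illustrative note for record 20: the abstract ranks MDVP parameters by the Fisher discrimination ratio and keeps the three highest. A minimal sketch of that ranking step follows, using the common two-class form of the ratio, (mean1 - mean2)^2 / (var1 + var2). The abstract does not give the exact formula variant, so this form, the use of population variance, and the placeholder parameter names are assumptions.

```python
from statistics import mean, pvariance


def fisher_ratio(normal, pathological):
    """Two-class Fisher discrimination ratio for a single parameter.

    Raises ZeroDivisionError if both groups have zero variance.
    """
    separation = (mean(normal) - mean(pathological)) ** 2
    spread = pvariance(normal) + pvariance(pathological)
    return separation / spread


def rank_parameters(normal_by_param, pathological_by_param, top_k=3):
    """Return the top_k parameter names, ordered from highest ratio to lowest."""
    scores = {
        name: fisher_ratio(normal_by_param[name], pathological_by_param[name])
        for name in normal_by_param
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


if __name__ == "__main__":
    # Toy values; 'param_a' etc. are placeholders, not real MDVP parameter names.
    normal = {"param_a": [1, 2, 3], "param_b": [1, 2, 3], "param_c": [1, 2, 3]}
    patho = {"param_a": [4, 5, 6], "param_b": [1.5, 2.5, 3.5], "param_c": [10, 11, 12]}
    print(rank_parameters(normal, patho, top_k=2))
```

Because the ratio depends on each database's class means and variances, the ranking can differ between databases, which is consistent with the differences the abstract reports.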