1.
Front Neurol ; 13: 960012, 2022.
Article in English | MEDLINE | ID: mdl-36081868

ABSTRACT

To support clinical decision-making in audiology, Common Audiological Functional Parameters (CAFPAs) were suggested as an interpretable intermediate representation of audiological information taken from various diagnostic sources within a clinical decision-support system (CDSS). Ten different CAFPAs were proposed to represent specific functional aspects of the human auditory system, namely hearing threshold, supra-threshold deficits, binaural hearing, neural processing, cognitive abilities, and a socio-economic component. CAFPAs were established as a viable basis for deriving audiological findings and treatment recommendations, and it has been demonstrated that model-predicted CAFPAs, with machine learning models trained on expert-labeled patient cases, are sufficiently accurate to be included in a CDSS, although this requires further validation by experts. The present study aimed to validate model-predicted CAFPAs based on previously unlabeled cases from the same data set. Here, we ask to what extent domain experts agree with the model-predicted CAFPAs and whether potential disagreement can be understood in terms of patient characteristics. To this end, an expert survey was designed and administered to two highly experienced audiology specialists. They were asked to evaluate model-predicted CAFPAs and to estimate audiological findings from the audiological information about the patients that was presented to them simultaneously. The results revealed strong relative agreement between the two experts and, importantly, between the experts and the prediction for all CAFPAs, except for the neural processing and binaural hearing-related ones. It turned out, however, that experts tend to score CAFPAs in a larger value range but, on average across patients, with smaller scores than the machine learning models. 
For the hearing threshold-associated CAFPA at frequencies below 0.75 kHz and the cognitive CAFPA, not only the relative agreement but also the absolute agreement between machine and experts was very high. For those CAFPAs with an average difference between the model- and expert-estimated values, patient characteristics were predictive of the disagreement. The findings are discussed in terms of how they can help toward further improvement of model-predicted CAFPAs to be incorporated in a CDSS for audiology.

2.
Biomech Model Mechanobiol ; 14(1): 169-84, 2015 Jan.
Article in English | MEDLINE | ID: mdl-24861998

ABSTRACT

Laryngeal cancer due to, e.g., extensive smoking and/or alcohol consumption can necessitate the excision of the entire larynx. After such a total laryngectomy, the voice-generating structures are lost, and with that the quality of life of the affected patients is drastically reduced. However, the vibrations of the remaining tissue in the so-called pharyngoesophageal (PE) segment can be used as an alternative sound generator. Tissue, scar, and geometric aspects of the PE-segment determine the postoperative substitute voice characteristics, which are highly important for the future life of the patient. So far, PE-dynamics have been simulated by a biomechanical model that is restricted to stationary vibrations, i.e., variations in pitch and amplitude cannot be handled. In order to investigate the dynamic range of PE-vibrations, knowledge about the temporal processes during substitute voice production is of crucial interest. Thus, time-dependent model parameters are suggested in order to quantify non-stationary PE-vibrations and to draw conclusions on the temporal characteristics of tissue stiffness, oscillating mass, pressure, and geometric distributions within the PE-segment. To adapt the numerical model to the PE-vibrations, an automatic, block-based optimization procedure is applied, comprising a combined global and local optimization approach. The suggested optimization procedure is validated with 75 synthetic data sets, simulating non-stationary oscillations of differently shaped PE-segments. The application to four high-speed recordings is shown and discussed. The correlation between model and PE-dynamics is ≥ 97%.


Subject(s)
Esophagus/physiopathology , Esophagus/surgery , Laryngectomy , Larynx/physiopathology , Larynx/surgery , Models, Biological , Computer Simulation , Female , Humans , Male , Middle Aged , Oscillometry/methods , Time Factors , Treatment Outcome , Vibration
3.
J Speech Lang Hear Res ; 57(4): 1148-61, 2014 Aug.
Article in English | MEDLINE | ID: mdl-24686496

ABSTRACT

PURPOSE: The aim of this study was to identify parameters that would differentiate healthy from pathological organic-based vocal fold vibrations to emphasize clinical usefulness of high-speed imaging. METHOD: Fifty-five men (M age = 36 years, SD = 20 years) were examined and separated into 4 groups: 1 healthy (26 individuals) and 3 pathological (10 individuals with contact granuloma, 12 with polyps, and 7 with cysts). Vocal fold vibrations were recorded using a high-speed camera during sustained phonation. Twenty objective glottal area waveform and 24 phonovibrogram parameters representing spatiotemporal characteristics were analyzed. Statistical group comparisons were performed to document spatiotemporal changes for organic lesions that cannot be determined visually. To look for specific pattern profiles within organic lesions, the authors performed linear discriminant analysis. RESULTS: Thirteen parameters showed significant differences between the healthy group and at least 1 pathological group. The differences occurred more in temporal than in spatial parameters. Contact granuloma showed the fewest statistical differences (3 parameters), followed by cysts (9 parameters), and polyps (10 parameters). Linear discriminant analysis achieved accuracy performance of 76% (all groups separated) and 82% (healthy vs. pathological). CONCLUSION: The results suggest that for males, the differences between healthy voices and organic voice disorders may be more pronounced within temporal characteristics that cannot be visually detected without high-speed imaging.


Subject(s)
Image Processing, Computer-Assisted/methods , Laryngeal Diseases/diagnostic imaging , Laryngoscopy/methods , Vocal Cords/diagnostic imaging , Adult , Glottis/diagnostic imaging , Granuloma, Laryngeal/diagnostic imaging , Granuloma, Laryngeal/physiopathology , Humans , Laryngeal Diseases/physiopathology , Male , Middle Aged , Phonation/physiology , Polyps/diagnostic imaging , Spatio-Temporal Analysis , Vibration , Video Recording , Vocal Cords/physiopathology , Voice Disorders/diagnostic imaging , Voice Disorders/physiopathology
4.
J Speech Lang Hear Res ; 57(2): S637-47, 2014 Apr 01.
Article in English | MEDLINE | ID: mdl-24686925

ABSTRACT

PURPOSE: Previous studies have confirmed the influence of dehydration and an altered mucus (e.g., due to pathologies) on phonation. However, the underlying reasons for these influences are not fully understood. This study was a preliminary inquiry into the influences of mucus architecture and concentration on vocal fold oscillation. METHOD: Two excised human larynges were investigated in an in vitro setup. The oscillations of the vocal folds at various airflow volume rates were recorded through the use of high-speed imaging. Engineered mucus containing polymers (interconnected polymers and linear polymers) was applied to the vocal folds. From the high-speed footage, glottal parameters were extracted through the use of objective methods and were compared to a gold standard (physiological saline solution). RESULTS: Variations were found for all applications of mucus. Fundamental frequency dropped and the oscillatory behavior (speed quotient [SQ], closing quotient [CQ]) changed for both larynges. The 2 applied mucus architectures displayed different effects on the larynges. The interconnected polymer displayed clear low-pass filter characteristics not found for the linear polymer. Increase of polymer concentration affected parameters to a certain point. CONCLUSION: The data confirm results found in previous studies. Furthermore, the different effects (comparing architecture and concentration) suggest that, in the future, synthetic mucus can be designed to improve phonation.


Subject(s)
Larynx/physiology , Mucus/physiology , Phonation/physiology , Voice Disorders/physiopathology , Voice Disorders/therapy , Aged, 80 and over , Cadaver , Female , Humans , Male , Middle Aged , Pilot Projects , Polymers/pharmacology , Sodium Chloride/pharmacology , Solutions/pharmacology , Sound Spectrography , Vibration , Voice/physiology
5.
Laryngoscope ; 123(7): 1686-93, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23649746

ABSTRACT

OBJECTIVES/HYPOTHESIS: Quantitative analysis of endoscopic high-speed video recordings of vocal fold vibrations has been growing in importance in recent years. The videos have mainly been analyzed using subjective evaluation, but this is examiner dependent, and the results show inadequate interobserver agreement. The aims of this study were therefore to identify appropriate objective parameters for analyzing high-speed recordings to differentiate healthy voice production from organic disorders. METHODS: A total of 152 females were examined, divided into 77 healthy and 75 with four different pathological conditions: laryngeal epithelial thickening, Reinke edema, vocal fold polyps, and vocal fold cysts. Vocal fold vibrations were recorded with a high-speed camera (4,000 Hz, 256 × 256 pixels) during sustained phonation. Parameters computed from the glottal area waveform (GAW) and from phonovibrogram (PVG) were analyzed. Multiparametric linear discriminant analysis was performed to classify pathological conditions versus the healthy group. RESULTS: Twenty of 44 parameters were identified that are capable of distinguishing between the individual types of pathology. PVG parameters showed better performance than GAW parameters. Parameters representing vibrational periodicity via standard deviation showed better performance than absolute parameters. In addition, linear discriminant analysis achieved reliable differentiation between healthy and pathological vocal fold vibrations: 72% for the five-class problem (all groups separately) and 88% for the two-class problem (healthy vs. all pathologies taken as one class). CONCLUSIONS: The study succeeded in defining objective parameters for analyzing endoscopic high-speed videos and suggesting first parameters for differentiation between healthy dynamics and dynamics of organic pathologies.


Subject(s)
Laryngeal Diseases/diagnosis , Laryngoscopy/instrumentation , Video Recording , Vocal Cords/pathology , Adult , Case-Control Studies , Discriminant Analysis , Female , Humans , Laryngeal Diseases/pathology , Middle Aged , Phonation , Statistics, Nonparametric , Vibration
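
The two-class problem in the abstract above (healthy vs. all pathologies taken as one class) is solved there with linear discriminant analysis. The following is only a minimal sketch of a two-class Fisher discriminant, assuming two features per recording, equal class priors, and a midpoint decision threshold. The feature values are invented for illustration and are not taken from the study, which used many more GAW and PVG parameters.

```python
def mean_vector(rows):
    """Component-wise mean of a list of 2-D feature vectors."""
    n = len(rows)
    return [sum(r[k] for r in rows) / n for k in range(2)]


def fit_lda_2d(class0, class1):
    """Fit a two-class Fisher discriminant on 2-D features.

    Returns (w, threshold): a sample x is assigned to class 1
    when w . x > threshold (equal priors assumed).
    """
    m0, m1 = mean_vector(class0), mean_vector(class1)
    # Pooled within-class scatter matrix (2 x 2).
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows, m in ((class0, m0), (class1, m1)):
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for a in range(2):
                for b in range(2):
                    s[a][b] += d[a] * d[b]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    # Discriminant direction: w = S_w^{-1} (m1 - m0).
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Threshold at the projection of the midpoint between the class means.
    mid = [(m0[0] + m1[0]) / 2, (m0[1] + m1[1]) / 2]
    threshold = w[0] * mid[0] + w[1] * mid[1]
    return w, threshold


def predict(w, threshold, x):
    """Return 1 (second class) or 0 (first class) for a 2-D sample."""
    return 1 if w[0] * x[0] + w[1] * x[1] > threshold else 0


# Hypothetical toy features, e.g. two periodicity measures per recording.
healthy = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.4), (0.8, 2.1)]
pathological = [(3.0, 4.0), (3.5, 4.4), (2.8, 3.7), (3.3, 4.2)]

w, threshold = fit_lda_2d(healthy, pathological)
print(predict(w, threshold, (1.1, 2.0)), predict(w, threshold, (3.2, 4.1)))
```

A real analysis would report cross-validated accuracy rather than predictions on the training data, and the five-class problem would need a multi-class formulation.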
6.
Logoped Phoniatr Vocol ; 38(1): 1-10, 2013 Apr.
Article in English | MEDLINE | ID: mdl-22414332

ABSTRACT

PURPOSE: Digit span and sentence repetition are identified as potential markers for specific language impairment (SLI). We investigated if language learning of bilingual children with suspected language impairment (biSLI) was also influenced and led by memory constraints. METHOD: In a retrospective study, 19 children with SLI and 25 controls (ages 4;9-5;9), as well as 15 biSLI children and 14 controls (ages 5;1-8;9) were compared with regard to their performance on a digit span and sentence repetition task. RESULTS: Both groups with language impairment (SLI/biSLI) showed reduced performance on both tasks. Sentence repetition predicted language comprehension, and the digit span task predicted grammar abilities of the SLI, biSLI, and their controls. CONCLUSION: Sentence repetition and short-term memory provide information on language comprehension and grammar abilities in language-impaired mono- and bilingual children and confirm their function as SLI markers.


Subject(s)
Child Behavior , Child Language , Language Disorders/diagnosis , Multilingualism , Child , Child, Preschool , Comprehension , Female , Humans , Infant , Language , Language Disorders/psychology , Language Tests , Linear Models , Male , Memory, Short-Term , Retrospective Studies , Vocabulary
7.
J Voice ; 26(6): 726-33, 2012 Nov.
Article in English | MEDLINE | ID: mdl-22632795

ABSTRACT

SUMMARY: Acoustic and endoscopic voice assessments are routinely performed to determine the vocal fold vibratory function as part of the voice assessment protocol in clinics. More often than not these data are separately recorded, resulting in information being obtained from two different phonation segments and an increase of time for the voice evaluation process. This study explores the use of acoustic data, simultaneously recorded during high-speed endoscopy (HSE), for the evaluation of vocal fold function. PATIENTS AND METHODS: HSE and acoustic data were recorded from the subjects simultaneously during sustained phonation. The data included voices of 73 healthy subjects, 148 paresis, 210 functional dysphonias, and 119 benign lesions of vocal folds. For this study, only acoustic data were analyzed using Dr. Speech software (Tiger electronics Inc., MA). Twelve parameters were computed; 82% of the acoustic voice recordings could be analyzed. Statistical analysis was performed with SPSS 17.0. RESULTS: Acoustic data was easily recorded simultaneously allowing analyses of the same phonation segment to determine vocal fold function and therefore eliminating the need for another voice recording. The acoustic voice parameters differed between genders in the healthy voice group. Most of the parameters showed significant differences between healthy and pathological groups. CONCLUSION: Simultaneously recorded endoscopic and acoustic data is valuable. Differentiation between healthy and pathological groups was possible using acoustic data only. We suggest that the synchronously recorded acoustic signal is of sufficient quality for objective analysis yielding reduced examination time.


Subject(s)
Acoustics , Dysphonia/diagnosis , Laryngoscopy , Phonation , Speech Production Measurement , Vocal Cords/physiopathology , Voice Quality , Adolescent , Adult , Aged , Biomechanical Phenomena , Case-Control Studies , Discriminant Analysis , Dysphonia/pathology , Dysphonia/physiopathology , Factor Analysis, Statistical , Female , Humans , Linear Models , Male , Middle Aged , Predictive Value of Tests , Signal Processing, Computer-Assisted , Time Factors , Vibration , Vocal Cord Paralysis/diagnosis , Vocal Cord Paralysis/pathology , Vocal Cord Paralysis/physiopathology , Vocal Cords/pathology , Young Adult
9.
J Voice ; 26(4): 416-24, 2012 Jul.
Article in English | MEDLINE | ID: mdl-21940144

ABSTRACT

OBJECTIVES/HYPOTHESIS: Automatic voice evaluation is usually performed on stable sections of sustained vowels, which often cannot capture hoarseness properly. The measures cepstral peak prominence (CPP) and smoothed CPP (CPPS) do not require exact determination of the cycles of fundamental frequency like established perturbation-based measures. They can also be applied to text recordings. In this study, they were compared with perceptual evaluation of voice quality and the German roughness-breathiness-hoarseness (RBH) scheme. STUDY DESIGN: Retrospective data analysis. METHODS: Seventy-three hoarse patients (48.3±16.8 years) uttered the vowel /e/ and read the German version of the text "The North Wind and the Sun". The text recordings were evaluated perceptually by five speech therapists and physicians according to the RBH scale. The criterion "overall quality" was measured on a 4-point scale and a visual analog scale. For the human-machine correlation, the automatic measures of the Praat program (vowels only) and the "cpps" software were compared with the experts' ratings. The experiments were repeated for speakers with jitter ≤5% or shimmer ≤5% (n=47). RESULTS: For the entire group (n=73), the best human-machine results for most of the rating criteria were obtained for text-based CPP and CPPS (up to |ρ|=0.73). For the 47 selected speakers, the correlation was remarkably worse for all measures but still best for text-based CPP and CPPS (|ρ|≤0.50). CONCLUSIONS: Cepstrum analysis should be performed on a text recording. Then, it outperforms all perturbation-based measures, and it can be a meaningful objective support for perceptual analysis.


Subject(s)
Hoarseness , Speech Acoustics , Adult , Aged , Aged, 80 and over , Chronic Disease , Female , Humans , Male , Middle Aged , Retrospective Studies , Speech Perception , Young Adult
10.
J Voice ; 26(3): 390-7, 2012 May.
Article in English | MEDLINE | ID: mdl-21820272

ABSTRACT

OBJECTIVE: One aspect of voice and speech evaluation after laryngeal cancer is acoustic analysis. Perceptual evaluation by expert raters is a standard in the clinical environment for global criteria such as overall quality or intelligibility. So far, automatic approaches evaluate acoustic properties of pathologic voices based on voiced/unvoiced distinction and fundamental frequency analysis of sustained vowels. Because of the high amount of noisy components and the increasing aperiodicity of highly pathologic voices, a fully automatic analysis of fundamental frequency is difficult. We introduce a purely data-driven system for the acoustic analysis of pathologic voices based on recordings of a standard text. METHODS: Short-time segments of the speech signal are analyzed in the spectral domain, and speaker models based on this information are built. These speaker models act as a clustered representation of the acoustic properties of a person's voice and are thus characteristic for speakers with different kinds and degrees of pathologic conditions. The system is evaluated on two different data sets with speakers reading standardized texts. One data set contains 77 speakers after laryngeal cancer treated with partial removal of the larynx. The other data set contains 54 totally laryngectomized patients, equipped with a Provox shunt valve. Each speaker was rated by five expert listeners regarding three different criteria: strain, voice quality, and speech intelligibility. RESULTS/CONCLUSION: We show correlations for each data set with r and ρ≥0.8 between the automatic system and the mean value of the five raters. The interrater correlation of one rater to the mean value of the remaining raters is in the same range. We thus assume that for selected evaluation criteria, the system can serve as a validated objective support for acoustic voice and speech analysis.


Subject(s)
Laryngeal Neoplasms/surgery , Laryngectomy , Models, Statistical , Speech Acoustics , Speech Intelligibility , Speech Production Measurement/methods , Voice Disorders/surgery , Voice Quality , Adult , Aged , Aged, 80 and over , Automation , Germany , Humans , Laryngeal Neoplasms/complications , Laryngeal Neoplasms/physiopathology , Laryngectomy/adverse effects , Larynx, Artificial , Middle Aged , Observer Variation , Predictive Value of Tests , Reading , Regression Analysis , Reproducibility of Results , Signal Processing, Computer-Assisted , Speech, Alaryngeal/instrumentation , Time Factors , Treatment Outcome , Voice Disorders/diagnosis , Voice Disorders/etiology , Voice Disorders/physiopathology
11.
J Acoust Soc Am ; 130(2): 948-64, 2011 Aug.
Article in English | MEDLINE | ID: mdl-21877808

ABSTRACT

With the use of an endoscopic, high-speed camera, vocal fold dynamics may be observed clinically during phonation. However, observation and subjective judgment alone may be insufficient for clinical diagnosis and documentation of improved vocal function, especially when the laryngeal disease lacks any clear morphological presentation. In this study, biomechanical parameters of the vocal folds are computed by adjusting the corresponding parameters of a three-dimensional model until the dynamics of both systems are similar. First, a mathematical optimization method is presented. Next, model parameters (such as pressure, tension and masses) are adjusted to reproduce vocal fold dynamics, and the deduced parameters are physiologically interpreted. Various combinations of global and local optimization techniques are attempted. Evaluation of the optimization procedure is performed using 50 synthetically generated data sets. The results show sufficient reliability, including 0.07 normalized error, 96% correlation, and 91% accuracy. The technique is also demonstrated on data from human hemilarynx experiments, in which a low normalized error (0.16) and high correlation (84%) values were achieved. In the future, this technique may be applied to clinical high-speed images, yielding objective measures with which to document improved vocal function of patients with voice disorders.


Subject(s)
Computer Simulation , Models, Biological , Phonation , Vocal Cords/physiology , Voice , Algorithms , Biomechanical Phenomena , Female , Humans , Laryngoscopy , Male , Pressure , Reproducibility of Results , Vibration , Video Recording , Vocal Cords/anatomy & histology
12.
Logoped Phoniatr Vocol ; 36(4): 175-81, 2011 Dec.
Article in English | MEDLINE | ID: mdl-21875389

ABSTRACT

Objective assessment of intelligibility on the telephone is desirable for voice and speech assessment and rehabilitation. A total of 82 patients after partial laryngectomy read a standardized text which was synchronously recorded by a headset and via telephone. Five experienced raters assessed intelligibility perceptually on a five-point scale. Objective evaluation was performed by support vector regression on the word accuracy (WA) and word correctness (WR) of a speech recognition system, and a set of prosodic features. WA and WR alone exhibited correlations to human evaluation between |r| = 0.57 and |r| = 0.75. The correlation was r = 0.79 for headset and r = 0.86 for telephone recordings when prosodic features and WR were combined. The best feature subset was optimal for both signal qualities. It consists of WR, the average duration of the silent pauses before a word, the standard deviation of the fundamental frequency on the entire sample, the standard deviation of jitter, and the ratio of the durations of the voiced sections and the entire recording.


Subject(s)
Laryngectomy/adverse effects , Signal Processing, Computer-Assisted , Speech Acoustics , Speech Intelligibility , Speech Perception , Speech Recognition Software , Telephone , Voice Quality , Adult , Aged , Aged, 80 and over , Automation , Cluster Analysis , Female , Germany , Humans , Male , Markov Chains , Middle Aged , Sound Spectrography , Speech Production Measurement , Speech, Alaryngeal , Support Vector Machine , Time Factors
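
The abstract above relies on the word accuracy (WA) and word correctness (WR) of a speech recognizer. As a rough sketch, assuming the definitions commonly used in speech recognition (WR = (N - S - D) / N and WA = (N - S - D - I) / N, with N reference words and S, D, I the substitutions, deletions, and insertions of a minimum-edit alignment), the two measures can be computed as follows. The abstract does not state its exact formulas, and the function names here are hypothetical.

```python
def align_counts(reference, hypothesis):
    """Return (substitutions, deletions, insertions) of a minimum-edit
    word alignment between a reference text and a recognizer output."""
    ref, hyp = reference.split(), hypothesis.split()
    # cost[i][j] = (edits, subs, dels, ins) for ref[:i] versus hyp[:j].
    cost = [[None] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    cost[0][0] = (0, 0, 0, 0)
    for i in range(1, len(ref) + 1):
        cost[i][0] = (i, 0, i, 0)          # only deletions
    for j in range(1, len(hyp) + 1):
        cost[0][j] = (j, 0, 0, j)          # only insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            e, s, d, n = cost[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                match = (e, s, d, n)
            else:
                match = (e + 1, s + 1, d, n)
            e, s, d, n = cost[i - 1][j]
            delete = (e + 1, s, d + 1, n)
            e, s, d, n = cost[i][j - 1]
            insert = (e + 1, s, d, n + 1)
            # Tuples compare by total edits first, so this keeps a minimum-edit path.
            cost[i][j] = min(match, delete, insert)
    _, subs, dels, ins = cost[-1][-1]
    return subs, dels, ins


def word_correctness(reference, hypothesis):
    """WR: fraction of reference words recognized correctly (ignores insertions)."""
    n = len(reference.split())
    subs, dels, _ = align_counts(reference, hypothesis)
    return (n - subs - dels) / n


def word_accuracy(reference, hypothesis):
    """WA: like WR, but additionally penalizes inserted words."""
    n = len(reference.split())
    subs, dels, ins = align_counts(reference, hypothesis)
    return (n - subs - dels - ins) / n


ref_text = "the north wind and the sun"
hyp_text = "the north wind the sun shone"   # one word dropped, one word added
print(word_correctness(ref_text, hyp_text), word_accuracy(ref_text, hyp_text))
```

Because WA subtracts insertions, it can become negative for very poor recognition results, whereas WR stays between 0 and 1.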
13.
Open Neurol J ; 5: 37-45, 2011.
Article in English | MEDLINE | ID: mdl-21643536

ABSTRACT

We examined the neural activation to consonant-vowel transitions by cortical auditory evoked potentials (AEPs). The aim was to show whether cortical response patterns to speech stimuli contain components due to one of the temporal features, the voice-onset time (VOT). In seven normal-hearing adults, the cortical responses to four different monosyllabic words were opposed to the cortical responses to noise stimuli with the same temporal envelope as the speech stimuli. Significant hemispheric asymmetries were found for speech but not in noise evoked potentials. The difference signals between the AEPs to speech and corresponding noise stimuli revealed a significant negative component, which correlated with the VOT. The hemispheric asymmetries can be referred to rapid spectral changes. The correlation with the VOT indicates that the significant component in the difference signal reflects the perception of the acoustic change within the consonant-vowel transition. Thus, at the level of automatic processing, the characteristics of speech evoked potentials appear to be determined primarily by temporal aspects of the eliciting stimuli.

14.
IEEE Trans Biomed Eng ; 58(10): 2767-76, 2011 Oct.
Article in English | MEDLINE | ID: mdl-21558056

ABSTRACT

After total larynx excision due to laryngeal cancer, the tracheoesophageal substitute tissue vibrations at the intersection between the pharynx and the esophagus [pharyngoesophageal segment (PE segment)] serve as voice generator. The quality of the substitute voice significantly depends on the vibratory characteristics of the PE segment. For improving voice rehabilitation, the relationship between the PE dynamics and the resulting substitute voice quality is a matter of particular interest. Precondition for a comprehensive analysis of this relationship is an objective quantification of the PE vibrations. For quantification purposes, a method is proposed, which is based on the reproduction of the tissue vibrations by means of a biomechanical model of the PE segment. An optimization procedure for an automatic determination of appropriate model parameters is suggested to adapt the model dynamics to tissue movements extracted from high-speed (HS) videos. The applicability of the optimization procedure is evaluated with ten synthetic data sets. A mean error of 8.2% for the determination of previously defined model parameters was achieved as well as an overall stability of 7.1%. The application of the model to six HS recordings presented a mean correlation of the vibration patterns of 82%.


Subject(s)
Esophagus/physiology , Laryngectomy/rehabilitation , Models, Biological , Pharynx/physiology , Phonation/physiology , Signal Processing, Computer-Assisted , Biomechanical Phenomena/physiology , Esophagus/anatomy & histology , Humans , Larynx, Artificial , Male , Middle Aged , Pharynx/anatomy & histology , Vibration , Video Recording , Voice
15.
Gerontology ; 57(2): 109-14, 2011.
Article in English | MEDLINE | ID: mdl-20424428

ABSTRACT

BACKGROUND: Many studies have referred to the effects of age on voice and the consequences of these changes. However, only little is known about the adverse effects of voice changes on quality of life in the elderly. OBJECTIVE: This study focuses on self-perception of voice in seniors as assessed by the Voice-Related Quality of Life (V-RQOL) questionnaire, on voice quality as measured by the Dysphonia Severity Index (DSI) and on the correlation between these parameters. METHODS: V-RQOL and DSI were measured as previously described in 107 non-treatment-seeking test persons without voice complaints (76 women and 31 men; mean age 78.7 ± 6.8 years, range 66-94 years). RESULTS: The mean V-RQOL value was 94.4 ± 9.8%. The mean value of the DSI in all participants was 1.2 ± 2.4. There was no significant correlation between the V-RQOL and DSI, either in women (p = 0.11), men (p = 0.58) or the whole study group (p = 0.26). CONCLUSION: Both the V-RQOL questionnaire and the DSI may be applied to seniors. As self-perception of voice and voice function do not correlate, both parameters have to be measured for voice assessment.


Subject(s)
Aging/psychology , Dysphonia/diagnosis , Dysphonia/psychology , Quality of Life , Voice Quality , Aged , Aged, 80 and over , Female , Germany , Humans , Male , Mental Status Schedule , Severity of Illness Index , Surveys and Questionnaires
16.
J Voice ; 25(3): 265-8, 2011 May.
Article in English | MEDLINE | ID: mdl-20202787

ABSTRACT

OBJECTIVE: Restrictions of verbal communication and high prevalence of voice disorders in the elderly are suspected to influence the quality of life. For assessment, both voice-specific and unspecific methods are already established and fundamental components of clinical diagnostics, but the question of correlation between voice-related and general health-related quality of life is still open in this subpopulation. METHODS: One hundred and seven socially active persons aged 65+ years were recruited and asked to complete Voice-Related Quality of Life (V-RQOL) and Short Form (SF)-36 questionnaires. RESULTS: There was a mild correlation between V-RQOL score and both of the SF-36 subscores (r(s)=0.28 for the physical subscore and r(s)=0.27 for the mental subscore). CONCLUSION: As correlation of voice- and health-related quality of life in elderly persons is only mild, both voice-specific and unspecific assessment methods are required for clinical diagnostics.


Subject(s)
Aging/psychology , Geriatric Assessment , Quality of Life , Surveys and Questionnaires , Voice Disorders/diagnosis , Voice Quality , Age Factors , Aged , Aged, 80 and over , Analysis of Variance , Cross-Sectional Studies , Female , Germany , Humans , Male , Voice Disorders/physiopathology , Voice Disorders/psychology
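
The r(s) values reported in the abstract above are rank correlation coefficients. The following is a minimal sketch of how a Spearman coefficient can be computed, assuming the standard definition (Pearson correlation of average ranks, with ties sharing their mean rank). The questionnaire scores below are invented for illustration and are not data from the study.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            ranks[k] = mean_rank
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores for six respondents (not from the study).
vrqol_scores = [95.0, 100.0, 87.5, 92.5, 100.0, 80.0]
sf36_physical = [48.1, 52.3, 50.2, 39.7, 45.0, 41.4]
print(round(spearman(vrqol_scores, sf36_physical), 3))
```

A significance test, as reported in the neighbouring abstracts, would be a separate step and is not part of this sketch.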
17.
J Voice ; 25(5): 576-90, 2011 Sep.
Article in English | MEDLINE | ID: mdl-20728308

ABSTRACT

OBJECTIVES: The aim of this study was to identify visual subjective and objective parameters of vocal fold dynamics that are capable of differentiating healthy from pathologic voices in daily clinical practice using endoscopic high-speed digital imaging (HSI). STUDY DESIGN AND METHODS: Four hundred ninety-six datasets containing 80 healthy and 416 pathologic subjects (232 functional dysphonia (FD), 13 bilateral, and 171 unilateral vocal fold nerve paralysis) were analyzed retrospectively. Videos at 4000 Hz (256×256 pixels) were recorded during sustained phonation. Subjective parameters were visually evaluated and complemented by an analysis of objective parameters. Visual subjective parameters were mucosal wave, glottal closure type, glottal closure insufficiency (GI), asymmetries of the vocal folds, and phonovibrogram (PVG) symmetry. After image segmentation, objective parameters were computed: closed quotient, perturbation measures (PMs) of glottal area, and left-right asymmetry values. RESULTS: HSI evaluation made it possible to distinguish healthy from pathologic voices. For visual subjective parameters, GI, symmetrical behavior, and PVG symmetry exhibited statistically significant differences. For 95% of the data, objective parameters could be computed. Among objective parameters, closed quotient, jitter, shimmer, harmonic-to-noise ratio, and signal-to-noise ratio for the glottal area function differentiated normal from pathologic voices with statistical significance. Applying linear discriminant analysis by combining visual subjective and objective parameters, accurate classifications were made for 63.2% of the female and 87.5% of the male group for the three-class problem (healthy, FD, and unilateral vocal fold nerve paralysis). CONCLUSION: PMs currently applied in acoustic analysis can be transferred to clinically beneficial HSI analysis. Combining visual subjective and objective basic parameters succeeds in differentiating pathologic from healthy voices. 
The presented evaluation can easily be included into everyday clinical practice. However, further research is needed to broaden our understanding of the variability within and across healthy and pathologic vocal fold vibrations for diagnosing voice disorders and therapy control.


Subject(s)
Dysphonia/diagnosis , Dysphonia/physiopathology , Vibration , Vocal Cords/physiology , Voice Quality/physiology , Adolescent , Adult , Aged , Databases, Factual , Diagnosis, Differential , Female , Humans , Image Processing, Computer-Assisted , Laryngoscopy , Male , Middle Aged , Retrospective Studies , Vocal Cord Paralysis/diagnosis , Vocal Cord Paralysis/physiopathology , Young Adult
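
The abstract above lists jitter and shimmer among the perturbation measures computed for the glottal area function. As a minimal sketch, assuming the common "local" definition (mean absolute difference between consecutive cycles divided by the mean over all cycles, in percent), both can be computed from per-cycle values. The abstract does not state which variant was used, and the cycle values below are invented.

```python
def local_perturbation_percent(values):
    """Mean absolute cycle-to-cycle difference relative to the mean value, in percent."""
    if len(values) < 2:
        raise ValueError("at least two cycles are required")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * (sum(diffs) / len(diffs)) / (sum(values) / len(values))


def jitter_percent(periods):
    """Local jitter: perturbation of the cycle durations."""
    return local_perturbation_percent(periods)


def shimmer_percent(amplitudes):
    """Local shimmer: perturbation of the per-cycle peak amplitudes."""
    return local_perturbation_percent(amplitudes)


# Hypothetical per-cycle measurements (periods in ms, peak glottal area in pixels).
periods_ms = [5.0, 5.2, 5.0, 5.2]
peak_areas = [410.0, 400.0, 420.0, 405.0]
print(round(jitter_percent(periods_ms), 2), round(shimmer_percent(peak_areas), 2))
```

Extracting the cycle durations and peak values from a segmented glottal area waveform is the harder part in practice and is not covered here.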
18.
IEEE Trans Med Imaging ; 29(12): 1979-91, 2010 Dec.
Article in English | MEDLINE | ID: mdl-21118756

ABSTRACT

The ability to communicate with our voice can be regarded as the concatenation of the two processes "phonation" and "modulation." These take place in the larynx and palatal and oral region, respectively. During phonation the audible primary voice signal is created by mutual reaction of vocal folds with the exhaled air stream of the lungs. The underlying interactions of masses, fluids and acoustics have yet to be identified and understood. One part of the primary signal's acoustical source are vortex induced vibrations, as e.g., created by the Coandaeffect in the air stream. The development of these vorteces is determined by the shape and 3-D movements of the vocal folds in the larynx. Current clinical in vivo research methods for vocal folds do not deliver data of satisfactory quality for fundamental research, e.g., an endoscope is limited to 2-D image information. Based hereupon, a few improved methods have been presented, however delivering only selective 3-D information, either for a single point or a line. This stands in contrast to the 3-D motions of the entire vocal fold surface. More complex imaging methods, such as MRI, do not deliver information in real-time. Thus, it is necessary to develop an easily applicable, more improved examination method, which allows for 3-D data of the vocal folds surfaces to be obtained. We present a method to calibrate a 3-D reconstruction setup including a laser projection system and a high-speed camera. The setup is designed with miniaturization and an in vivo application in mind. The laser projection system generates a divergent grid of 196 laser dots by diffraction gratings. It is calibrated with a planar calibration target through planar homography. 
In general, the setup allows the topology of a surface to be reconstructed at high frame rates (up to 4000 frames per second) and in uncontrollable environments, e.g., as given by the lighting situation (little to no ambient light) and varying texture (e.g., varying degree of reflection) in the human larynx. In particular, this system measures the 3-D vocal fold surface dynamics during phonation. Applied to synthetic data, the calibration is shown to be robust (error approximately 0.5 µm) regarding noise and systematic errors. Experimental data gained with a linear z-stage proved that the system reconstructs the 3-D coordinates of points with an error of approximately 15 µm. The method was applied exemplarily to reconstruct porcine and artificial vocal folds' surfaces during phonation. Local differences such as asymmetry between left and right fold dynamics, as well as global parameters, such as opening and closing speed and maximum displacements, were identified and quantified.
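
The abstract states that the laser projection system is calibrated with a planar target through planar homography, but gives no implementation details. As a generic illustration only, and not the authors' calibration procedure, the following minimal sketch estimates a 3x3 homography between two sets of corresponding planar points using the standard direct linear transform (DLT). The function names `estimate_homography` and `apply_homography` and the example coordinates are hypothetical.

```python
import numpy as np


def estimate_homography(src, dst):
    """Estimate H (3x3) such that dst ~ H @ src, using the DLT.

    src, dst: arrays of shape (N, 2) with N >= 4 point correspondences.
    Illustrative sketch; no coordinate normalization or outlier handling.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear constraints on H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.array(rows)
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # fix the arbitrary scale (assumes H[2, 2] != 0)


def apply_homography(H, pts):
    """Map (N, 2) points through H and dehomogenize."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]


# Hypothetical example: points on a planar target and their observed positions.
H_true = np.array([[1.2, 0.1, 5.0],
                   [-0.05, 0.9, -3.0],
                   [0.001, 0.002, 1.0]])
target_pts = np.array([[0, 0], [10, 0], [10, 10], [0, 10], [5, 5], [2, 7]])
image_pts = apply_homography(H_true, target_pts)
H_est = estimate_homography(target_pts, image_pts)
```

With noisy real measurements, the points are usually normalized before the DLT and the result refined by nonlinear optimization; both steps are omitted here for brevity.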


Subject(s)
Image Processing, Computer-Assisted/methods , Kymography/methods , Phonation/physiology , Signal Processing, Computer-Assisted , Vocal Cords/anatomy & histology , Vocal Cords/physiology , Algorithms , Animals , Calibration , Endoscopy/methods , Humans , Models, Biological , Speech Acoustics , Surface Properties , Swine , Video Recording/methods
19.
J Acoust Soc Am ; 128(5): EL347-53, 2010 Nov.
Article in English | MEDLINE | ID: mdl-21110550

ABSTRACT

In this work a detection algorithm for mucosal wave propagation is presented. By incorporating physiological knowledge of mucosal wave properties and taking the segmented lateral movement of both vocal fold edges as a basis, the spatio-temporal position of the traveling mucosal wave is identified and quantitatively captured. The course of mucosal wave propagation can be successfully detected and analyzed with regard to discriminating different types of mucosal wave activity (in terms of spread velocity and symmetry). The preliminary results obtained for six exemplary laryngeal high-speed recordings are promising and demonstrate the potential of the proposed detection and objective description approach.


Subject(s)
Laryngeal Mucosa/physiology , Models, Biological , Speech Acoustics , Vocal Cords/physiology , Voice/physiology , Algorithms , Humans
20.
Eur Arch Otorhinolaryngol ; 267(8): 1261-71, 2010 Aug.
Article in English | MEDLINE | ID: mdl-20567980

ABSTRACT

Within this study a retrospective analysis of clinical voice perturbation measures, the Dysphonia Severity Index and subjectively perceived hoarseness was performed to determine their clinical value. The study included the data of 580 healthy and 1,700 pathologic voices, which were investigated under the following aspects. The relevant parameters were identified and their interrelation determined. Group differences between healthy and pathologic voices were determined, and it was investigated whether voice quality measures allowed an automatic diagnosis of voice disorders. The analysis revealed significant differences between the clinical groups, which indicate the diagnostic relevance of voice quality measures. However, an individual diagnosis of the underlying voice disorder failed due to a vast spread of the parameter values within the respective groups. Classification accuracies of 75-90% were achieved. The high misclassification rate of up to 25% implied that in voice disorder diagnosis, the individual interpretation of the parameter values has to be done carefully.


Subject(s)
Diagnosis, Computer-Assisted , Dysphonia/diagnosis , Hoarseness/diagnosis , Sound Spectrography , Voice Quality , Adolescent , Adult , Aged , Aged, 80 and over , Child , Dysphonia/classification , Dysphonia/etiology , Female , Hoarseness/classification , Hoarseness/etiology , Humans , Male , Middle Aged , Young Adult