Results 1 - 11 of 11
1.
Logoped Phoniatr Vocol ; 40(1): 24-9, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25019410

ABSTRACT

This article is a compilation of the authors' own research performed during the European COoperation in Science and Technology (COST) action 2103: 'Advanced Voice Function Assessment', an initiative of voice and speech processing teams consisting of physicists, engineers, and clinicians. This manuscript concerns the analysis of largely irregular voicing types, namely substitution voicing (SV) and adductor spasmodic dysphonia (AdSD). A specific perceptual rating scale (IINFVo) was developed, and the Auditory Model Based Pitch Extractor (AMPEX), a piece of software that automatically analyses running speech and generates pitch values in background noise, was applied. The IINFVo perceptual rating scale has been shown to be useful in evaluating SV. The analysis of strongly irregular voices stimulated a modification of the European Laryngological Society's assessment protocol, which was originally designed for the common types of (less severe) dysphonia. Acoustic analysis with AMPEX demonstrates that the most informative features are, for SV, the voicing-related acoustic features and, for AdSD, the perturbation measures. Poor correlations between self-assessment and acoustic and perceptual dimensions in the assessment of highly irregular voices argue for a multidimensional approach.


Subjects
Acoustics , Dysphonia/diagnosis , Speech Acoustics , Speech Production Measurement/methods , Voice Quality , Cooperative Behavior , Diagnostic Self Evaluation , Dysphonia/physiopathology , Humans , Interdisciplinary Communication , Predictive Value of Tests , Severity of Illness Index , Signal Processing, Computer-Assisted , Software , Speech Perception
2.
IEEE Trans Cybern ; 44(12): 2288-301, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25415938

ABSTRACT

Embodied music cognition stresses the role of the human body as mediator for the encoding and decoding of musical expression. In this paper, we set up a low-dimensional functional model that accounts for 70% of the variability in the expressive body movement responses to music. With functional principal component analysis, we modeled individual body movements as a linear combination of a group average and a number of eigenfunctions. The group average and the eigenfunctions are common to all subjects and make up what we call the commonalities. An individual performance is then characterized by a set of scores (the individualities), one score per eigenfunction. The model is based on experimental data that show high levels of coherence/consistency between participants when grouped according to musical education. This shows an ontogenetic effect. Participants without formal musical education focus on the torso for the expression of basic musical structure (tempo). Musically trained participants decode additional structural elements in the music and focus on body parts having more degrees of freedom (such as the hands). Our results confirm earlier studies showing that different body parts move differently along with the music.


Subjects
Auditory Perception/physiology , Models, Biological , Movement/physiology , Music/psychology , Nonverbal Communication/physiology , Nonverbal Communication/psychology , Acoustic Stimulation/methods , Adult , Affect , Computer Simulation , Female , Humans , Image Interpretation, Computer-Assisted/methods , Male , Pattern Recognition, Automated/methods , Whole Body Imaging/methods
3.
Folia Phoniatr Logop ; 66(6): 219-26, 2014.
Article in English | MEDLINE | ID: mdl-25659422

ABSTRACT

OBJECTIVE: Automatic intelligibility assessment using automatic speech recognition is usually language specific. In this study, a language-independent approach is proposed. It uses models that are trained with Flemish speech, and it is applied to assess chronically hoarse German speakers. The research questions are: is it possible to construct suitable acoustic features that generalize to other languages and a speech disorder, and is the generated model for intelligibility also suitable for specific subtypes of that disorder, i.e. functional and organic dysphonia? PATIENTS AND METHODS: 73 German-speaking persons with chronic hoarseness read the text 'Der Nordwind und die Sonne'. Perceptual intelligibility scores were used as ground truth during the training of an automatic model that converts speaker-level acoustic measurements into intelligibility scores. Cross-validation is used to assess model performance. RESULTS: The interrater agreement values for all patients (n = 73) and for the functional and organic dysphonia subgroups (n = 45 and n = 24) are r = 0.82, r = 0.83 and r = 0.75, respectively. The automatic assessment based on phonologically based acoustic models revealed correlations between perceptual and automatic intelligibility ratings of r = 0.79 (all patients), r = 0.78 (functional dysphonia) and r = 0.80 (organic dysphonia). CONCLUSION: The automatic, objective measurement of intelligibility is a valuable instrument in evidence-based clinical practice.
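
The evaluation scheme in this abstract (train a model that maps speaker-level acoustic measurements to perceptual scores, assess it by cross-validation, report a correlation) can be illustrated as follows. This is a sketch under stated assumptions: the abstract does not name the regression model or the cross-validation scheme, so ridge regression and leave-one-out are stand-ins, and the features and scores are synthetic.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return w, y_mean - x_mean @ w

def leave_one_out_predictions(X, y, alpha=1.0):
    """Predict each speaker's score with a model trained on all other speakers."""
    preds = np.empty(len(y))
    for i in range(len(y)):
        train = np.arange(len(y)) != i
        w, b = ridge_fit(X[train], y[train], alpha)
        preds[i] = X[i] @ w + b
    return preds

def pearson_r(a, b):
    """Pearson correlation coefficient between two score vectors."""
    return float(np.corrcoef(a, b)[0, 1])

rng = np.random.default_rng(1)
n_speakers, n_features = 73, 5   # 73 follows the abstract; 5 is arbitrary
X = rng.normal(size=(n_speakers, n_features))        # synthetic acoustic features
true_w = np.array([1.0, -0.5, 0.3, 0.0, 0.8])
y = X @ true_w + 0.5 * rng.normal(size=n_speakers)   # synthetic perceptual scores

predicted = leave_one_out_predictions(X, y)
r = pearson_r(y, predicted)
```

Because every prediction comes from a model that never saw that speaker, `r` estimates how well the mapping generalizes to new speakers rather than how well it fits the training data.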


Subjects
Hoarseness/diagnosis , Hoarseness/psychology , Language , Speech Intelligibility , Speech Recognition Software , Adult , Aged , Aged, 80 and over , Chronic Disease , Dysphonia/diagnosis , Female , Hoarseness/etiology , Humans , Male , Middle Aged , Phonetics , Speech Acoustics , Young Adult
4.
PLoS One ; 8(7): e67932, 2013.
Article in English | MEDLINE | ID: mdl-23874469

ABSTRACT

Inspired by a theory of embodied music cognition, we investigate whether music can entrain the speed of beat-synchronized walking. If human walking is in synchrony with the beat and all musical stimuli have the same duration and the same tempo, then differences in walking speed can only be the result of music-induced differences in stride length, thus reflecting the vigor or physical strength of the movement. Participants walked in an open field in synchrony with the beat of 52 different musical stimuli all having a tempo of 130 beats per minute and a meter of 4 beats. The walking speed was measured as the walked distance during a time interval of 30 seconds. The results reveal that some music is 'activating' in the sense that it increases the speed, and some music is 'relaxing' in the sense that it decreases the speed, compared to the spontaneous walking speed in response to metronome stimuli. Participants are consistent in their observation of qualitative differences between the relaxing and activating musical stimuli. Using regression analysis, it was possible to set up a predictive model using only four sonic features that explain 60% of the variance. The sonic features capture variation in loudness and pitch patterns at periods of three, four and six beats, suggesting that expressive patterns in music are responsible for the effect. The mechanism may be attributed to an attentional shift, a subliminal audio-motor entrainment mechanism, or an arousal effect, but further study is needed to determine which of these applies. Overall, the study supports the hypothesis that recurrent patterns of fluctuation affecting the binary meter strength of the music may entrain the vigor of the movement. The study opens up new perspectives for understanding the relationship between entrainment and expressiveness, with the possibility to develop applications that can be used in domains such as sports and physical rehabilitation.
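
The reasoning in this abstract (fixed tempo and fixed duration, so speed differences reflect step length) reduces to simple arithmetic. The sketch below assumes one step per beat, which the abstract implies but does not state in those words. The distances are hypothetical examples, not data from the study.

```python
TEMPO_BPM = 130      # tempo of all stimuli, from the abstract
INTERVAL_S = 30.0    # measurement interval, from the abstract

def walking_speed(distance_m, interval_s=INTERVAL_S):
    """Average speed in m/s over the measurement interval."""
    return distance_m / interval_s

def step_length(distance_m, tempo_bpm=TEMPO_BPM, interval_s=INTERVAL_S):
    """Average length per step, assuming exactly one step per beat."""
    steps = tempo_bpm * interval_s / 60.0   # 65 steps in 30 s at 130 BPM
    return distance_m / steps

def relative_speed_change(distance_music_m, distance_metronome_m):
    """Positive for 'activating' music, negative for 'relaxing' music."""
    return (distance_music_m - distance_metronome_m) / distance_metronome_m

# Hypothetical distances in metres (not measured values).
metronome_distance = 45.5
music_distance = 48.1
change = relative_speed_change(music_distance, metronome_distance)
```

With the step count fixed at 65, any difference in walked distance between a musical stimulus and the metronome baseline maps directly onto a difference in average step length.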


Subjects
Music/psychology , Walking/psychology , Acceleration , Acoustic Stimulation/methods , Adult , Auditory Perception/physiology , Female , Humans , Male , Middle Aged , Relaxation/psychology , Time Factors , Walking/physiology , Young Adult
5.
IEEE Trans Syst Man Cybern B Cybern ; 41(2): 330-40, 2011 Apr.
Article in English | MEDLINE | ID: mdl-20699214

ABSTRACT

When learning a support vector machine (SVM) from a set of labeled development patterns, the ultimate goal is to get a classifier attaining a low error rate on new patterns. This so-called generalization ability obviously depends on the choices of the learning parameters that control the learning process. Model selection is the method for identifying appropriate values for these parameters. In this paper, a novel model selection method for SVMs with a Gaussian kernel is proposed. Its aim is to find suitable values for the kernel parameter γ and the cost parameter C with a minimum amount of central processing unit time. The determination of the kernel parameter is based on the argument that, for most patterns, the decision function of the SVM should consist of a sufficiently large number of significant contributions. A unique property of the proposed method is that it retrieves the kernel parameter as a simple analytical function of the dimensionality of the feature space and the dispersion of the classes in that space. An experimental evaluation on a test bed of 17 classification problems has shown that the new method favorably competes with two recently published methods: the classification of new patterns is equally good, but the computational effort to identify the learning parameters is substantially lower.
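
The argument about "a sufficiently large number of significant contributions" can be made concrete with the Gaussian kernel itself. The sketch below is not the analytical formula proposed in the paper (the abstract does not give it); it only illustrates, on synthetic data and with a hypothetical significance threshold, how the kernel parameter γ controls how many patterns contribute noticeably at a given point.

```python
import numpy as np

def gaussian_kernel(x, patterns, gamma):
    """K(x, x_i) = exp(-gamma * ||x - x_i||^2) for every row x_i of patterns."""
    sq_dist = np.sum((patterns - x) ** 2, axis=1)
    return np.exp(-gamma * sq_dist)

def count_significant(x, patterns, gamma, threshold=0.1):
    """Number of patterns whose kernel value at x exceeds the threshold.

    The threshold is a hypothetical choice made for this illustration.
    """
    return int(np.sum(gaussian_kernel(x, patterns, gamma) > threshold))

rng = np.random.default_rng(2)
patterns = rng.normal(size=(200, 10))   # synthetic patterns in 10 dimensions
x = patterns[0]

gammas = (0.001, 0.01, 0.1, 1.0)
counts = {gamma: count_significant(x, patterns, gamma) for gamma in gammas}
```

As γ grows, the kernel narrows and fewer patterns contribute, until each pattern effectively sees only itself. This is the behavior that a rule tying γ to the dimensionality and dispersion of the data has to keep in check.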


Subjects
Algorithms , Artificial Intelligence , Decision Support Techniques , Models, Statistical , Models, Theoretical , Pattern Recognition, Automated/methods , Computer Simulation , Normal Distribution
6.
Eur Arch Otorhinolaryngol ; 266(12): 1915-22, 2009 Dec.
Article in English | MEDLINE | ID: mdl-19866529

ABSTRACT

In this article, we investigate whether (1) the IINFVo (Impression, Intelligibility, Noise, Fluency and Voicing) perceptual rating scale and (2) the AMPEX (Auditory Model Based Pitch Extractor) acoustical analysis are suitable for evaluating adductor spasmodic dysphonia (AdSD). Voice recordings of 12 patients were analysed. The inter-rater and intra-rater consistency showed highly significant correlations for the IINFVo rating scale, with the exception of the parameter Noise. AMPEX reliably analyses vowels (correlation between PUVF (percentage of frames with unreliable F0) and voicing: 0.748), running speech (correlation between PVF (percentage of voiced frames) and voicing: 0.699) and syllables. Correlations between IINFVo and AMPEX range from 0.608 to 0.818, except for Noise. This study indicates that IINFVo and AMPEX could be robust and complementary assessment tools for the evaluation of AdSD. Both tools provide valuable information about voice quality, stability of F0 (fundamental frequency) and specific dimensions controlling the transitions between voiced and unvoiced segments.


Subjects
Dysphonia/diagnosis , Speech Perception/physiology , Voice Quality/physiology , Botulinum Toxins, Type A/administration & dosage , Cross-Sectional Studies , Diagnosis, Differential , Dysphonia/drug therapy , Dysphonia/physiopathology , Female , Follow-Up Studies , Humans , Injections , Male , Middle Aged , Neuromuscular Agents/administration & dosage , Pilot Projects , Prognosis , Speech Intelligibility/physiology
7.
IEEE Trans Biomed Eng ; 56(3): 706-17, 2009 Mar.
Article in English | MEDLINE | ID: mdl-19272875

ABSTRACT

Genetic absence epilepsy rats from Strasbourg are a strain of Wistar rats in which all animals exhibit spontaneous occurrences of spike and wave discharges (SWDs) in the EEG. In this paper, we propose a novel method for the detection of SWDs, based on the key observation that SWDs are quasi-periodic signals. The method consists of the following steps: 1) calculation of the spectrogram; 2) estimation of the background spectrum and detection of stimulation artifacts; 3) harmonic analysis with continuity analysis to estimate the fundamental frequency; and 4) classification based on the percentage of power in the harmonics to the total power of the spectrum. We evaluated the performance of the novel detection method and six SWD/seizure detection methods from literature on a large database of labeled EEG data consisting of two datasets running to a total duration of more than 26 days of recording. The method outperforms all tested SWD/seizure detection methods, showing a sensitivity and selectivity of 96% and 97%, respectively, on the first test set, and a sensitivity and selectivity of 94% and 92%, respectively, on the second test set. The detection performance is less satisfactory (as for all other methods) for EEG fragments showing more irregular and less periodic SWDs.
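
Step 4 of the method (classification from the share of spectral power found in the harmonics) can be sketched on a single signal segment. This is a simplified illustration, not the published detector: the fundamental frequency is assumed to be known instead of being estimated by harmonic and continuity analysis, and the sampling rate, 8 Hz fundamental, bandwidth and test signals are all hypothetical.

```python
import numpy as np

def harmonic_power_ratio(segment, fs, f0, n_harmonics=4, half_width_hz=0.5):
    """Fraction of spectral power lying near f0 and its first harmonics."""
    windowed = segment * np.hanning(len(segment))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(segment), d=1.0 / fs)
    in_band = freqs > 1.0                      # ignore DC and very slow drift
    harmonic = np.zeros_like(in_band)          # boolean mask, all False
    for k in range(1, n_harmonics + 1):
        harmonic |= np.abs(freqs - k * f0) <= half_width_hz
    harmonic &= in_band
    return float(power[harmonic].sum() / power[in_band].sum())

fs = 200.0                                     # hypothetical sampling rate (Hz)
t = np.arange(0, 4.0, 1.0 / fs)
rng = np.random.default_rng(3)
# Synthetic quasi-periodic signal: 8 Hz fundamental plus two harmonics and noise.
swd_like = sum(np.sin(2 * np.pi * 8.0 * k * t) / k for k in (1, 2, 3))
swd_like = swd_like + 0.2 * rng.normal(size=t.size)
background = rng.normal(size=t.size)           # broadband, non-periodic

ratio_swd = harmonic_power_ratio(swd_like, fs, f0=8.0)
ratio_background = harmonic_power_ratio(background, fs, f0=8.0)
```

A threshold on this ratio separates the quasi-periodic segment from the broadband one. The abstract's remark that irregular, less periodic SWDs are detected less reliably is consistent with such a measure, since irregularity spreads power away from the harmonics.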


Subjects
Algorithms , Electroencephalography , Epilepsy, Absence/physiopathology , Pattern Recognition, Automated/methods , Signal Processing, Computer-Assisted , Animals , Artifacts , Disease Models, Animal , Male , Rats , Rats, Wistar , Sensitivity and Specificity
8.
Int J Lang Commun Disord ; 44(5): 716-30, 2009.
Article in English | MEDLINE | ID: mdl-18821111

ABSTRACT

BACKGROUND: Currently, clinicians mainly rely on perceptual judgements to assess intelligibility of dysarthric speech. Although often highly reliable, this procedure is subjective with a lot of intrinsic variables. Therefore, certain benefits can be expected from a speech technology-based intelligibility assessment. Previous attempts to develop an automated intelligibility assessment mainly relied on automatic speech recognition (ASR) systems that were trained to recognize the speech of persons without known impairments. In this paper automatic speech alignment (ASA) systems are used instead. In addition, previous attempts only made use of phonemic features (PMF). However, since articulation is an important contributing factor to intelligibility of dysarthric speech and since phonological features (PLF) are shared by multiple phonemes, phonological features may be more appropriate to characterize and identify dysarthric phonemes. AIMS: To investigate the reliability of objective phoneme intelligibility scores obtained by three types of intelligibility models: models using only phonemic features (yielded by an automated speech aligner) (PMF models), models using only phonological features (PLF models), and models using a combination of phonemic and phonological features (PMF + PLF models). METHODS & PROCEDURES: Correlations were calculated between the objective phoneme intelligibility scores of 60 dysarthric speakers and the corresponding perceptual phoneme intelligibility scores obtained by a standardized perceptual phoneme intelligibility assessment. OUTCOMES & RESULTS: The correlations between the objective and perceptual intelligibility scores range from 0.793 for the PMF models, through 0.828 for the PLF models, to 0.943 for the PMF + PLF models. The features selected to obtain such high correlations can be divided into six main subgroups: (1) vowel-related phonemic and phonological features, (2) lateral-related features, (3) silence-related features, (4) fricative-related features, (5) velar-related features and (6) plosive-related features. CONCLUSIONS & IMPLICATIONS: The phoneme intelligibility scores of dysarthric speakers obtained by the three investigated intelligibility model types are reliable. The highest correlation between the perceptual and objective intelligibility scores was found for models combining phonemic and phonological features. The intelligibility scoring system is now ready to be implemented in a clinical tool.


Subjects
Diagnosis, Computer-Assisted/methods , Dysarthria/psychology , Speech Intelligibility , Dysarthria/diagnosis , Dysarthria/etiology , Humans , Models, Psychological , Phonetics , Reproducibility of Results , Speech Production Measurement/methods
9.
Eur Arch Otorhinolaryngol ; 263(5): 435-9, 2006 May.
Article in English | MEDLINE | ID: mdl-16404623

ABSTRACT

In this paper, an experimental study of inter-judge consistency for the different dimensions of a recently proposed scale for the rating of substitution voices is presented. The IINFVo rating scale scores five parameters, namely impression, intelligibility, noise, fluency and voicing. Each parameter is scored between 0 (very good substitution voicing) and 10 (very deviant substitution voicing) on a visual analogue scale. Inter-judge consistencies were measured among semi-professional as well as among professional jury members. The consistencies among semi-professionals, expressed as Pearson correlation coefficients, ranged from moderate to good (0.57-0.68), whereas those among professionals were good to excellent (0.82-0.87) and compared favourably to consistency figures published for traditional perceptual evaluation scales such as the GRBAS scale for laryngeal dysphonia. Since there is a strong correlation between the scores of impression and intelligibility, and since intelligibility is hard to score by non-native listeners, we suggest taking the mean of the two scores as the "impression" of a modified dimensional INFVo rating scale. Our experiments demonstrate that the INFVo rating scale has good potential to become a routine perceptual evaluation method in a multidimensional assessment protocol for substitution voicing.
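
Two operations from this abstract lend themselves to a short sketch: expressing inter-judge consistency as a Pearson correlation, and merging impression and intelligibility into the single "impression" score of the modified INFVo scale. The abstract does not say how pairwise correlations were aggregated, so averaging over all judge pairs is an assumption here, and the ratings are synthetic.

```python
import numpy as np
from itertools import combinations

def mean_pairwise_pearson(ratings):
    """ratings: array (n_judges, n_voices). Mean Pearson r over all judge pairs."""
    corr = np.corrcoef(ratings)
    pairs = combinations(range(ratings.shape[0]), 2)
    return float(np.mean([corr[i, j] for i, j in pairs]))

def merge_impression(impression, intelligibility):
    """Modified INFVo 'impression': mean of impression and intelligibility."""
    impression = np.asarray(impression, dtype=float)
    intelligibility = np.asarray(intelligibility, dtype=float)
    return (impression + intelligibility) / 2.0

rng = np.random.default_rng(4)
true_quality = rng.uniform(0, 10, size=40)    # hypothetical voices, 0-10 scale
# Five hypothetical judges who agree on the underlying quality up to noise.
judges = np.clip(true_quality + rng.normal(scale=1.0, size=(5, 40)), 0, 10)
consistency = mean_pairwise_pearson(judges)
```

Lower rating noise per judge yields a higher `consistency`, which is one plausible reading of the gap the abstract reports between semi-professional and professional juries.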


Subjects
Speech Intelligibility , Voice Disorders/diagnosis , Voice Quality , Humans , Observer Variation , Reproducibility of Results , Speech , Speech, Alaryngeal , Voice
10.
Eur Arch Otorhinolaryngol ; 261(10): 541-7, 2004 Nov.
Article in English | MEDLINE | ID: mdl-14727123

ABSTRACT

This paper describes our first attempts to develop a method for the objective assessment of quality in substitution voices. The objective analysis deals with acoustic parameters characterising short voice and speech samples such as a sequence of isolated vowels, a sequence of VCV and CVCVCV syllables, a short sentence, etc. A database of 113 registrations from 68 patients (53 total laryngectomy patients with tracheo-esophageal speech, 14 total laryngectomy patients with esophageal speech and 5 patients with partial frontolateral laryngectomy) and 6 registrations from healthy control persons was collected. Each registration consisted of seven speech utterances and was subjected to an acoustic analysis as well as to a perceptual evaluation, the latter involving eight parameters such as "overall impression", "tonicity", etc. Since the goal of our work is to find out the best acoustical measurement for supporting perception and making it precise, it seemed logical to strive for a perceptually based acoustic analysis. We therefore performed the analysis by means of a peripheral auditory model with a built-in fundamental frequency (pitch) extractor. From the frame-level outputs (a frame is 10 ms) of the analyser, global objective parameters, such as (1) the percentage of voiced frames, (2) the average voicing evidence, (3) the voicing length distribution and (4) the fundamental frequency jitter, were computed for the different speech utterances. So as to reduce the parameter variability arising from the nature of the speech utterances (e.g., the presence of pauses in the signal, errors caused by the pitch extractor, etc.), the objective parameters were computed using non-standard averaging schemes involving energy weighting and frame selection. A statistical analysis of the objective parameters confirms that the quality of tracheo-esophageal speech is superior to that of esophageal speech, but inferior to that of normal speech and speech with the preservation of one vocal fold. Correlations between the objective parameters and the perceptual parameters are moderate.
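
The step from frame-level outputs to global parameters can be sketched as below. This is an illustration under assumptions, not the authors' averaging scheme: the abstract mentions energy weighting and frame selection without giving formulas, so the energy threshold, the weighting rule and the frame-to-frame jitter definition are hypothetical choices, and the frame values are invented.

```python
import numpy as np

def percentage_voiced(voiced):
    """Percentage of frames flagged as voiced."""
    return float(100.0 * np.mean(voiced))

def energy_weighted_mean(values, energy, select):
    """Average over the selected frames, weighting each frame by its energy."""
    w = energy[select]
    return float(np.sum(values[select] * w) / np.sum(w))

def jitter_percent(f0, voiced):
    """Mean absolute period difference between adjacent voiced frames,
    as a percentage of the mean period over all voiced frames."""
    periods = np.full(f0.shape, np.nan)
    periods[voiced] = 1.0 / f0[voiced]
    both = voiced[:-1] & voiced[1:]            # pairs of adjacent voiced frames
    diffs = np.abs(np.diff(periods))[both]
    return float(100.0 * diffs.mean() / np.nanmean(periods))

# Hypothetical outputs for ten 10 ms frames (not data from the study).
voiced = np.array([1, 1, 1, 0, 0, 1, 1, 1, 1, 0], dtype=bool)
f0 = np.array([100.0, 102, 98, 0, 0, 110, 108, 112, 110, 0])
evidence = np.array([0.9, 0.8, 0.85, 0.1, 0.05, 0.7, 0.9, 0.95, 0.8, 0.2])
energy = np.array([1.0, 1.2, 0.9, 0.05, 0.02, 0.8, 1.1, 1.3, 1.0, 0.1])

pvf = percentage_voiced(voiced)
# Frame selection: drop low-energy frames such as pauses (threshold is arbitrary).
avg_evidence = energy_weighted_mean(evidence, energy, select=energy > 0.5)
jitter = jitter_percent(f0, voiced)
```

Selecting frames by energy keeps pauses from pulling the averages down, which is the kind of utterance-dependent variability the abstract says the non-standard averaging was meant to reduce.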


Subjects
Speech Acoustics , Speech, Esophageal , Voice Disorders/diagnosis , Voice Quality , Follow-Up Studies , Humans , Laryngectomy , Models, Biological , Phonetics , Severity of Illness Index , Speech Intelligibility
11.
Eur Arch Otorhinolaryngol ; 261(8): 423-8, 2004 Sep.
Article in English | MEDLINE | ID: mdl-14685878

ABSTRACT

We tested the Voice Handicap Index (VHI) in 45 patients with substitution voicing (that is, without the use of two vocal folds), the majority of them using tracheo-oesophageal speech. We introduced a corrected VHI score (VHI(corr)) whose values are in the range from 0 to 100 and which can be expressed as a percentage. As such, the VHI(corr) is a handy and transparent tool, and it seems to be suited for representing the handicap caused by the voice disorder when some items are unanswered as experienced in patients with substitution voicing. Interestingly, our data reveal that the voice handicap severity of this particular category of patients is (1) moderate and in the range of "common" dysphonia and (2) not affected by additional radiotherapy. It seems that the E domain is overstated due to the number of problematic items in the P and F domains.
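
A score that ranges from 0 to 100 and tolerates unanswered items can be obtained by normalizing to the maximum attainable on the answered items only. The abstract does not give the formula for VHI(corr), so the definition below is an assumption made for illustration; it relies on the standard VHI item scoring of 0 to 4.

```python
def vhi_corrected(responses, max_item_score=4):
    """Percentage score over answered items only (assumed definition).

    responses: item scores from 0 to max_item_score, with None for
    unanswered items. Not confirmed to match the published VHI(corr).
    """
    answered = [r for r in responses if r is not None]
    if not answered:
        raise ValueError("no items answered")
    return 100.0 * sum(answered) / (max_item_score * len(answered))

# Hypothetical questionnaire: 30 items, 3 of them left unanswered.
example = [2] * 20 + [1] * 7 + [None] * 3
score = vhi_corrected(example)
```

Under this definition a patient who skips items that do not apply, as the abstract reports for substitution voicing, is not scored as if those items had been answered with 0.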


Subjects
Laryngeal Neoplasms/surgery , Laryngectomy , Postoperative Complications , Speech, Alaryngeal , Voice Disorders/diagnosis , Voice Disorders/etiology , Adult , Aged , Aged, 80 and over , Humans , Middle Aged , Severity of Illness Index