Search | VHL Regional Portal (test)

Phonetics exercises using the Alvin experiment-control software.

Hillenbrand, James M; Gayvert, Robert T; Clark, Michael J.

J Speech Lang Hear Res ; 58(2): 171-84, 2015 Apr.

Article in English | MEDLINE | ID: mdl-25480760

ABSTRACT

PURPOSE: Exercises are described that were designed to provide practice in phonetic transcription for students taking an introductory phonetics course. The goal was to allow instructors to offload much of the drill that would otherwise need to be covered in class or handled with paper-and-pencil tasks using text rather than speech as input. METHOD: The exercises were developed using Alvin, a general-purpose software package for experiment design and control. The simplest exercises help students learn sound-symbol associations. For example, a vowel-transcription exercise presents listeners with consonant-vowel-consonant syllables on each trial; students are asked to choose among buttons labeled with phonetic symbols for 12 vowels. Several word-transcription exercises are included in which students hear a word and are asked to enter a phonetic transcription. Immediate feedback is provided for all of the exercises. An explanation of the methods that are used to create exercises is provided. RESULTS: Although no formal evaluation was conducted, comments on course evaluations suggest that most students found the exercises to be useful. CONCLUSIONS: Exercises were developed for use in an introductory phonetics course. The exercises can be used in their current form, they can be modified to suit individual needs, or new exercises can be developed.

Subjects

Phonetics, Software, Speech-Language Pathology/education, Teaching, Humans, Young Adult

Perception of sinewave vowels.

Hillenbrand, James M; Clark, Michael J; Baer, Carter A.

J Acoust Soc Am ; 129(6): 3991-4000, 2011 Jun.

Article in English | MEDLINE | ID: mdl-21682420

ABSTRACT

There is a significant body of research examining the intelligibility of sinusoidal replicas of natural speech. Discussion has followed about what the sinewave speech phenomenon might imply about the mechanisms underlying phonetic recognition. However, most of this work has been conducted using sentence material, making it unclear what the contributions are of listeners' use of linguistic constraints versus lower-level phonetic mechanisms. This study was designed to measure vowel intelligibility using sinusoidal replicas of naturally spoken vowels. The sinusoidal signals were modeled after 300 /hVd/ syllables spoken by men, women, and children. Students enrolled in an introductory phonetics course served as listeners. Recognition rates for the sinusoidal vowels averaged 55%, which is much lower than the ~95% intelligibility of the original signals. Attempts to improve performance using three different training methods met with modest success, with post-training recognition rates rising by ~5-11 percentage points. Follow-up work showed that more extensive training produced further improvements, with performance leveling off at ~73%-74%. Finally, modeling work showed that a fairly simple pattern-matching algorithm trained on naturally spoken vowels classified sinewave vowels with 78.3% accuracy, showing that the sinewave speech phenomenon does not necessarily rule out template matching as a mechanism underlying phonetic recognition.

Subjects

Speech Acoustics, Speech Intelligibility, Speech Perception, Acoustic Stimulation, Adult, Algorithms, Analysis of Variance, Pure-Tone Audiometry, Speech Audiometry, Child, Female, Humans, Male, Physiological Pattern Recognition, Psychological Recognition, Computer-Assisted Signal Processing, Sound Spectrography

The role of f0 and formant frequencies in distinguishing the voices of men and women.

Hillenbrand, James M; Clark, Michael J.

Atten Percept Psychophys ; 71(5): 1150-66, 2009 Jul.

Article in English | MEDLINE | ID: mdl-19525544

ABSTRACT

The purpose of the present study was to determine the contributions of fundamental frequency (f0) and formants in cuing the distinction between men's and women's voices. A source-filter synthesizer was used to create four versions of 25 sentences spoken by men: (1) unmodified synthesis, (2) f0 only shifted up toward values typical of women, (3) formants only shifted up toward values typical of women, and (4) both f0 and formants shifted up. Identical methods were used to generate four corresponding versions of 25 sentences spoken by women, but with downward shifts. Listening tests showed that (1) shifting both f0 and formants was usually effective (~82%) in changing the perceived sex of the utterance, and (2) shifting either f0 or formants alone was usually ineffective in changing the perceived sex. Both f0 and formants are apparently needed to specify speaker sex, though even together these cues are not entirely effective. Results also suggested that f0 is somewhat more important than formants. A second experiment used the same methods, but isolated /hVd/ syllables were used as test signals. Results were broadly similar, with the important exception that, on average, the syllables were more likely to shift perceived talker sex with shifts in f0 and/or formants.

Subjects

Phonetics, Sex Characteristics, Sound Spectrography, Speech Acoustics, Speech Perception, Voice Quality, Female, Humans, Male

Speech perception based on spectral peaks versus spectral shape.

Hillenbrand, James M; Houde, Robert A; Gayvert, Robert T.

J Acoust Soc Am ; 119(6): 4041-54, 2006 Jun.

Article in English | MEDLINE | ID: mdl-16838546

ABSTRACT

This study was designed to measure the relative contributions to speech intelligibility of spectral envelope peaks (including, but not limited to, formants) versus the detailed shape of the spectral envelope. The problem was addressed by asking listeners to identify sentences and nonsense syllables that were generated by two structurally identical source-filter synthesizers, one of which constructs the filter function based on the detailed spectral envelope shape while the other constructs the filter function using a purposely coarse estimate that is based entirely on the distribution of peaks in the envelope. Viewed in the broadest terms, the results showed that nearly as much speech information is conveyed by the peaks-only method as by the detail-preserving method. Just as clearly, however, every test showed some measurable advantage for spectral detail, although the differences were not large in absolute terms.

Subjects

Acoustic Stimulation/methods, Phonetics, Speech Acoustics, Speech Perception/physiology, Adult, Analysis of Variance, Child, Female, Humans, Linguistics, Male, Sound Spectrography, Speech Intelligibility

Open source software for experiment design and control.

Hillenbrand, James M; Gayvert, Robert T.

J Speech Lang Hear Res ; 48(1): 45-60, 2005 Feb.

Article in English | MEDLINE | ID: mdl-15938059

ABSTRACT

The purpose of this paper is to describe a software package that can be used for performing such routine tasks as controlling listening experiments (e.g., simple labeling, discrimination, sentence intelligibility, and magnitude estimation), recording responses and response latencies, analyzing and plotting the results of those experiments, displaying instructions, and making scripted audio-recordings. The software runs under Windows and is controlled by creating text files that allow the experimenter to specify key features of the experiment such as the stimuli that are to be presented, the randomization scheme, interstimulus and intertrial intervals, the format of the output file, and the layout of response alternatives on the screen. Although the software was developed primarily with speech-perception and psychoacoustics research in mind, it has uses in other areas as well, such as written or auditory word recognition, written or auditory sentence processing, and visual perception.

Subjects

Software, Speech Perception, Humans, Psychological Recognition, Visual Perception, Vocabulary

A narrow band pattern-matching model of vowel perception.

Hillenbrand, James M; Houde, Robert A.

J Acoust Soc Am ; 113(2): 1044-55, 2003 Feb.

Article in English | MEDLINE | ID: mdl-12597197

ABSTRACT

The purpose of this paper is to propose and evaluate a new model of vowel perception which assumes that vowel identity is recognized by a template-matching process involving the comparison of narrow band input spectra with a set of smoothed spectral-shape templates that are learned through ordinary exposure to speech. In the present simulation of this process, the input spectra are computed over a sufficiently long window to resolve individual harmonics of voiced speech. Prior to template creation and pattern matching, the narrow band spectra are amplitude equalized by a spectrum-level normalization process, and the information-bearing spectral peaks are enhanced by a "flooring" procedure that zeroes out spectral values below a threshold function consisting of a center-weighted running average of spectral amplitudes. Templates for each vowel category are created simply by averaging the narrow band spectra of like vowels spoken by a panel of talkers. In the present implementation, separate templates are used for men, women, and children. The pattern matching is implemented with a simple city-block distance measure given by the sum of the channel-by-channel differences between the narrow band input spectrum (level-equalized and floored) and each vowel template. Spectral movement is taken into account by computing the distance measure at several points throughout the course of the vowel. The input spectrum is assigned to the vowel template that results in the smallest difference accumulated over the sequence of spectral slices. The model was evaluated using a large database consisting of 12 vowels in /hVd/ context spoken by 45 men, 48 women, and 46 children. The narrow band model classified vowels in this database with a degree of accuracy (91.4%) approaching that of human listeners.

Subjects

Phonetics, Sound Spectrography, Speech Acoustics, Speech Perception, Adult, Child, Female, Fourier Analysis, Humans, Male, Reference Values, Sound Spectrography/statistics & numerical data
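
The template-matching steps named in the abstract above (flooring, template averaging, city-block distance summed over several spectral slices) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the triangular weighting of the running average, the window width, and the unit-mean level normalization are all assumptions made for illustration, since the abstract does not give those details.

```python
def equalize(spectrum):
    """Stand-in level normalization (assumption): scale to unit mean amplitude."""
    mean = sum(spectrum) / len(spectrum)
    return [v / mean for v in spectrum] if mean > 0 else list(spectrum)


def floor_spectrum(spectrum, half_width=2):
    """Zero out values that fall below a center-weighted running average.

    The triangular weights and half_width are illustrative assumptions.
    """
    n = len(spectrum)
    floored = []
    for i in range(n):
        total = 0.0
        weight_sum = 0.0
        for j in range(max(0, i - half_width), min(n, i + half_width + 1)):
            w = half_width + 1 - abs(i - j)  # largest weight at the center
            total += w * spectrum[j]
            weight_sum += w
        threshold = total / weight_sum
        floored.append(spectrum[i] if spectrum[i] >= threshold else 0.0)
    return floored


def make_template(spectra):
    """Average the spectra of like vowels channel by channel."""
    n = len(spectra)
    return [sum(channel) / n for channel in zip(*spectra)]


def city_block(a, b):
    """Sum of absolute channel-by-channel differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def classify(slices, templates):
    """Pick the vowel whose templates give the smallest accumulated distance.

    slices: list of input spectra sampled through the vowel.
    templates: dict mapping a vowel label to one template per slice.
    """
    def total_distance(label):
        return sum(city_block(s, t) for s, t in zip(slices, templates[label]))
    return min(templates, key=total_distance)


# Toy example with 4-channel spectra and two slices per vowel.
toy_templates = {
    "i": [[1, 0, 0, 5], [1, 0, 0, 5]],
    "a": [[0, 5, 5, 0], [0, 5, 5, 0]],
}
toy_input = [[1, 0, 1, 4], [0, 0, 0, 5]]
winner = classify(toy_input, toy_templates)
```

In a fuller pipeline each input slice would pass through `equalize` and `floor_spectrum` before `classify`, and the templates would be built with `make_template` from spectra processed the same way.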

Speech synthesis using damped sinusoids.

Hillenbrand, James M; Houde, Robert A.

J Speech Lang Hear Res ; 45(4): 639-50, 2002 Aug.

Article in English | MEDLINE | ID: mdl-12199395

ABSTRACT

A speech synthesizer was developed that operates by summing exponentially damped sinusoids at frequencies and amplitudes corresponding to peaks derived from the spectrum envelope of the speech signal. The spectrum analysis begins with the calculation of a smoothed Fourier spectrum. A masking threshold is then computed for each frame as the running average of spectral amplitudes over an 800-Hz window. In a rough simulation of lateral suppression, the running average is then subtracted from the smoothed spectrum (with negative spectral values set to zero), producing a masked spectrum. The signal is resynthesized by summing exponentially damped sinusoids at frequencies corresponding to peaks in the masked spectra. If a periodicity measure indicates that a given analysis frame is voiced, the damped sinusoids are pulsed at a rate corresponding to the measured fundamental period. For unvoiced speech, the damped sinusoids are pulsed on and off at random intervals. A perceptual evaluation of speech produced by the damped sinewave synthesizer showed excellent sentence intelligibility, excellent intelligibility for vowels in /hVd/ syllables, and fair intelligibility for consonants in CV nonsense syllables.

Subjects

Communication Aids for Disabled, Alaryngeal Speech, Humans, Phonetics, Sound Spectrography, Speech Intelligibility, Speech Perception
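
The analysis and resynthesis chain described in the abstract above (running-average masking threshold over an 800-Hz window, subtraction with negative values set to zero, peak picking, and pulsed exponentially damped sinusoids for voiced frames) can be sketched roughly as follows. This is a simplified illustration, not the published synthesizer: the function names, the decay constant, the sample rate, and the overlap-add handling of pulses are assumptions, and the unvoiced (random-interval) case is omitted.

```python
import math


def masked_spectrum(spectrum, bin_hz, window_hz=800.0):
    """Subtract a running average (the masking threshold) and clip at zero."""
    half = max(1, int(round(window_hz / bin_hz / 2)))
    n = len(spectrum)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        threshold = sum(spectrum[lo:hi]) / (hi - lo)
        out.append(max(0.0, spectrum[i] - threshold))
    return out


def find_peaks(masked, bin_hz):
    """Return (frequency_hz, amplitude) for each local maximum above zero."""
    peaks = []
    for i in range(1, len(masked) - 1):
        if masked[i] > 0 and masked[i] > masked[i - 1] and masked[i] >= masked[i + 1]:
            peaks.append((i * bin_hz, masked[i]))
    return peaks


def damped_pulse(peaks, n_samples, sample_rate, decay=200.0):
    """One pulse: a sum of exponentially damped sinusoids, one per peak.

    decay (in 1/s) is an assumed value; the abstract does not specify it.
    """
    pulse = []
    for k in range(n_samples):
        t = k / sample_rate
        envelope = math.exp(-decay * t)
        pulse.append(sum(a * envelope * math.sin(2 * math.pi * f * t)
                         for f, a in peaks))
    return pulse


def synthesize_voiced(peaks, f0_hz, n_samples, sample_rate=16000, decay=200.0):
    """Overlap-add one damped pulse at the start of every fundamental period."""
    period = max(1, int(round(sample_rate / f0_hz)))
    pulse = damped_pulse(peaks, n_samples, sample_rate, decay)
    out = [0.0] * n_samples
    for start in range(0, n_samples, period):
        for k in range(n_samples - start):
            out[start + k] += pulse[k]
    return out


# Toy frame: a single spectral peak at 500 Hz on a 100-Hz bin grid.
toy_spectrum = [0, 0, 0, 0, 0, 10, 0, 0, 0, 0, 0]
toy_masked = masked_spectrum(toy_spectrum, bin_hz=100.0)
toy_peaks = find_peaks(toy_masked, bin_hz=100.0)
toy_signal = synthesize_voiced(toy_peaks, f0_hz=100.0, n_samples=320)
```

A real frame would start from a smoothed Fourier spectrum rather than the toy list used here, and the synthesis would be repeated frame by frame with the measured fundamental period.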
