Results 1 - 15 of 15
1.
Hear Res; 142(1-2): 102-12, 2000 Apr.
Article in English | MEDLINE | ID: mdl-10748333

ABSTRACT

The auditory efferent nerve is a feedback pathway that originates in the brainstem and projects to the inner ear. Although the anatomy and physiology of efferents have been described rather thoroughly, their functional roles in auditory perception remain unclear. Here, we report data from six human subjects who had undergone vestibular neurectomy, during which their efferent nerves were also presumably severed. The surgery had alleviated these subjects' vertigo but also resulted in mild to moderate hearing loss. We designed our experiments with a focus on the possible role of efferents in anti-masking. Consistent with previous studies, we found little effect of vestibular neurectomy on pure-tone detection and discrimination in quiet. However, we noted several new findings across the subjects tested. Efferent section increased loudness sensation (one subject), reduced the overshoot effect (five subjects), accentuated the 'midlevel hump' in forward masking (two subjects), and worsened intensity discrimination in noise (four subjects). Speech-in-noise recognition was also poorer in the surgery ear than in the non-surgery ear in three of the four subjects tested, but this finding was confounded by hearing loss. The present results suggest an active role of efferents in auditory perception in noise.


Subject(s)
Auditory Perception , Vestibular Nerve/surgery , Adult , Aged , Auditory Threshold , Discrimination, Psychological , Female , Humans , Loudness Perception , Male , Middle Aged , Noise , Perceptual Masking , Postoperative Period , Speech Perception , Vertigo/surgery
2.
J Acoust Soc Am; 104(1): 505-10, 1998 Jul.
Article in English | MEDLINE | ID: mdl-9670541

ABSTRACT

Recent studies have shown that temporal waveform envelope cues can provide significant information for English speech recognition. This study investigated the use of temporal envelope cues in a tonal language: Mandarin Chinese. The speech was divided into several frequency analysis bands; the amplitude envelope was extracted from each band by half-wave rectification and low-pass filtering and was used to modulate a noise of the same bandwidth as the analysis band. These manipulations preserved temporal and amplitude cues in each frequency band but removed the spectral detail within each band. Chinese vowels, consonants, tones, and sentences were identified by 12 native Chinese-speaking listeners with 1, 2, 3, and 4 noise bands. The results showed that the recognition scores for vowels, consonants, and sentences increased monotonically with the number of bands, a pattern similar to that observed in English speech recognition. In contrast, tones were consistently recognized at about the 80%-correct level, independent of the number of bands. This high level of tone recognition produced a significant difference in open-set sentence recognition between Chinese (11.0%) and English (2.9%) for the one-band condition, where no spectral information was available. The data also revealed that, with primarily temporal cues, the falling-rising tone (tone 3) and the falling tone (tone 4) were more easily recognized than the flat tone (tone 1) and the rising tone (tone 2). This differential pattern in tone recognition resulted in a similar pattern in word recognition: words having tone 3 or tone 4 were more likely to be recognized than words having tone 1 or tone 2. The quantitative role of tones in Chinese speech recognition was further explored using a power-function model, which showed that tones play a significant role in relating phoneme recognition to sentence recognition.
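
A minimal sketch of the noise-vocoding chain described above may help readers unfamiliar with this kind of processing: band-split the speech, extract each band's envelope by half-wave rectification and low-pass filtering, and use that envelope to modulate noise of the same bandwidth. The Butterworth filters, band edges, filter order, and 160-Hz envelope cutoff below are illustrative assumptions, not parameters taken from the paper.

```python
import numpy as np
from scipy.signal import butter, lfilter

def bandpass(x, lo, hi, fs, order=4):
    b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return lfilter(b, a, x)

def lowpass(x, cutoff, fs, order=4):
    b, a = butter(order, cutoff / (fs / 2), btype="low")
    return lfilter(b, a, x)

def noise_vocode(speech, fs, band_edges, env_cutoff=160.0):
    """Replace the spectral detail in each band with band-limited noise,
    keeping only that band's temporal envelope."""
    rng = np.random.default_rng(0)
    out = np.zeros_like(speech)
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        band = bandpass(speech, lo, hi, fs)
        env = lowpass(np.maximum(band, 0.0), env_cutoff, fs)  # half-wave rectify, then smooth
        carrier = bandpass(rng.standard_normal(len(speech)), lo, hi, fs)
        out += env * carrier
    return out

# e.g., a hypothetical 4-band condition spanning 100 Hz to 4 kHz:
# vocoded = noise_vocode(speech, fs, band_edges=[100, 500, 1200, 2400, 4000])
```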


Subject(s)
Cues , Speech Perception/physiology , Speech/physiology , Adult , China , Female , Humans , Male , Models, Biological , Phonetics
3.
J Acoust Soc Am; 95(2): 1085-99, 1994 Feb.
Article in English | MEDLINE | ID: mdl-8132902

ABSTRACT

A large set of sentence materials, chosen for their uniformity in length and representation of natural speech, has been developed for the measurement of sentence speech reception thresholds (sSRTs). The mean-squared level of each digitally recorded sentence was adjusted to equate intelligibility when presented in spectrally matched noise to normal-hearing listeners. These materials were cast into 25 phonemically balanced lists of ten sentences for adaptive measurement of sSRTs. The 95% confidence interval for these measurements is ±2.98 dB for sSRTs in quiet and ±2.41 dB for sSRTs in noise, as defined by the variability of repeated measures with different lists. Average sSRTs in quiet were 23.91 dB(A). Average sSRTs in 72 dB(A) noise were 69.08 dB(A), or a -2.92 dB signal-to-noise ratio. Low-pass filtering increased sSRTs slightly in quiet and in noise as the 4- and 8-kHz octave bands were eliminated. Much larger increases in sSRT occurred when the 2-kHz octave band was eliminated and bandwidth dropped below 2.5 kHz. Reliability was not degraded substantially until bandwidth dropped below 2.5 kHz. The statistical reliability and efficiency of the test suit it to practical applications in which measures of speech intelligibility are required.
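
The adaptive rule is not spelled out in this abstract; as a hedged illustration only, a simple one-up/one-down staircase that converges on the level giving 50% sentence intelligibility might look like the sketch below. The starting level, 2-dB step, and whole-sentence scoring are assumptions, not the paper's procedure.

```python
def adaptive_srt(present_sentence, start_level_db=70.0, step_db=2.0, n_sentences=10):
    """present_sentence(i, level_db) -> True if sentence i was repeated correctly.
    Returns a simple SRT estimate: the mean presentation level over the track."""
    level = start_level_db
    levels = []
    for i in range(n_sentences):
        levels.append(level)
        # lower the level after a correct response, raise it after an error
        level += -step_db if present_sentence(i, level) else step_db
    return sum(levels) / len(levels)
```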


Subject(s)
Noise , Speech Perception , Speech Reception Threshold Test , Acoustic Stimulation , Adolescent , Adult , Auditory Threshold , Female , Humans , Male , Middle Aged , Phonetics , Reproducibility of Results , Speech Acoustics , Speech Intelligibility , Speech Reception Threshold Test/standards
4.
Percept Psychophys; 49(5): 399-411, 1991 May.
Article in English | MEDLINE | ID: mdl-2057306

ABSTRACT

This three-part study demonstrates that perceptual order can influence the integration of acoustic speech cues. In Experiment 1, the subjects labeled the [s] and [sh] in natural FV and VF syllables in which the frication was replaced with synthetic stimuli. Responses to these "hybrid" stimuli were influenced by cues in the vocalic segment as well as by the synthetic frication. However, the influence of the preceding vocalic cues was considerably weaker than that of the following vocalic cues. Experiment 2 examined the acoustic bases for this asymmetry; analyses revealed that FV and VF syllables are similar in the acoustic structures thought to underlie the vocalic context effects. Experiment 3 examined the perceptual bases for the asymmetry. A subset of the hybrid FV and VF stimuli was presented in reverse, such that the acoustic and perceptual bases for the asymmetry were pitted against each other in the listening task. The perceptual bases (i.e., the perceived order of the frication and vocalic cues) proved to be the determining factor. Current auditory processing models, such as backward recognition masking and preperceptual auditory storage, as well as models based on linguistic factors, do not adequately account for the observed asymmetries.


Subject(s)
Attention , Phonetics , Speech Perception , Humans , Sound Spectrography
5.
J Acoust Soc Am; 82(4): 1152-61, 1987 Oct.
Article in English | MEDLINE | ID: mdl-3680774

ABSTRACT

This study investigated the cues for consonant recognition that are available in the time-intensity envelope of speech. Twelve normal-hearing subjects listened to three sets of spectrally identical noise stimuli created by multiplying noise with the speech envelopes of 19 /aCa/ natural-speech nonsense syllables. The speech envelope for each of the three noise conditions was derived using a different low-pass filter cutoff (20, 200, and 2000 Hz). Average consonant identification performance was above chance for all three noise conditions and improved significantly with the increase in envelope bandwidth from 20 to 200 Hz. A SINDSCAL multidimensional scaling analysis of the consonant confusion data identified three speech envelope features that divided the 19 consonants into four envelope feature groups ("envemes"). The enveme groups, in combination with visually distinctive speech feature groupings ("visemes"), can distinguish most of the 19 consonants. These results suggest that near-perfect consonant identification performance could be attained by subjects who receive only enveme and viseme information and no spectral information.
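
The stimulus construction here differs from a multi-band vocoder in that a single wideband noise is multiplied by the speech envelope, with the low-pass cutoff controlling how much envelope detail survives. A short sketch follows, reusing the lowpass() helper from the sketch under entry 2; full-wave rectification is an assumption, since the abstract does not specify the rectifier.

```python
import numpy as np

def envelope_noise(speech, fs, env_cutoff):
    """Multiply wideband noise by the speech envelope extracted with the
    given low-pass cutoff (20, 200, or 2000 Hz in the study)."""
    rng = np.random.default_rng(1)
    env = lowpass(np.abs(speech), env_cutoff, fs)  # rectify, then low-pass
    return env * rng.standard_normal(len(speech))

# three noise conditions differing only in envelope bandwidth:
# stimuli = {fc: envelope_noise(speech, fs, fc) for fc in (20.0, 200.0, 2000.0)}
```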


Subject(s)
Phonetics , Sound Spectrography , Speech Perception , Cochlear Implants/standards , Cues , Hearing Aids/standards , Humans
6.
J Acoust Soc Am; 79(3): 826-37, 1986 Mar.
Article in English | MEDLINE | ID: mdl-3958325

ABSTRACT

The perceptual representation of speech is generally assumed to be discrete rather than continuous, pointing to the need for general discrete analytic models to represent observed perceptual similarities among speech sounds. The INDCLUS (INdividual Differences CLUStering) model and algorithm [J.D. Carroll and P. Arabie, Psychometrika 48, 157-169 (1983)] can provide this generality, representing symmetric three-way similarity data (stimuli × stimuli × conditions) as an additive combination of overlapping, and generally not hierarchical, clusters whose weights (numerical values gauging the importance of the clusters) vary as a function of both the cluster and the condition being considered. INDCLUS was used to obtain a discrete representation of the underlying perceptual structure in the Miller and Nicely consonant confusion data [G.A. Miller and P.E. Nicely, J. Acoust. Soc. Am. 27, 338-352 (1955)]. A 14-cluster solution accounted for 82.9% of the total variance across the 17 listening conditions. The cluster composition and the variations in cluster weights as a function of stimulus degradation were interpreted in terms of the common and unique perceptual attributes of the consonants within each cluster. Low-pass filtering and noise masking selectively degraded unique attributes, especially the cues for place of articulation, while high-pass filtering degraded both unique and common attributes. The clustering results revealed that perceptual similarities among consonants are accurately modeled by additive combinations of their specific and discrete acoustic attributes, whose weights are determined by the nature of the stimulus degradation.
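
As a sketch of the model form only (not the Carroll and Arabie fitting algorithm), the INDCLUS prediction for condition k is an additive combination of overlapping binary clusters, S_k ≈ P diag(w_k) P^T + c_k, where P[i, r] = 1 if stimulus i belongs to cluster r:

```python
import numpy as np

def indclus_predict(P, w_k, c_k):
    """P: (n_stimuli, n_clusters) binary membership matrix; w_k: nonnegative
    cluster weights for condition k; c_k: additive constant. Returns the
    model-predicted (n_stimuli, n_stimuli) similarity matrix."""
    return P @ np.diag(w_k) @ P.T + c_k

# toy example: 4 stimuli, 2 overlapping (non-hierarchical) clusters
P = np.array([[1, 0],
              [1, 1],
              [0, 1],
              [0, 1]])
print(indclus_predict(P, w_k=np.array([0.8, 0.5]), c_k=0.1))
```

The condition-specific weights w_k are what let the same clusters account for all 17 listening conditions: degrading an acoustic cue lowers the weights of the clusters that depend on it.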


Subject(s)
Attention , Phonetics , Speech Perception , Cues , Humans , Perceptual Masking , Psychoacoustics
7.
J Acoust Soc Am; 73(6): 2150-65, 1983 Jun.
Article in English | MEDLINE | ID: mdl-6875101

ABSTRACT

The influence of spectral cues on discrimination peaks in the region of the phonetic voicing boundary was examined. The discriminability of voice onset time (VOT) differences of the same temporal magnitude was assessed using stimuli from labial and velar consonant-vowel VOT continua that differed in the timing of spectral changes associated with the first formant (F1) transition and in the location of the phonetic boundary. Subjects were initially given labeling tests and fixed-standard AX and all-step discrimination tests on both series. Half the subjects then received all-step discrimination training on one series, and half received training on the other series. Finally, all subjects were again given the labeling and discrimination tests on both series. Just noticeable differences (jnds) in VOT were estimated from the all-step functions before and after training. Initial jnds showed that VOT discrimination was most accurate around the voicing boundary on the two continua, where differences in F1 onset frequency accompany variations in VOT. The jnds on both series decreased significantly after training, although these regions of greater sensitivity remained. No evidence was seen of increased sensitivity around ±20 ms VOT, as would be expected if auditory processing constraints were influencing temporal order judgments. Comparisons of post-training jnds within and across series indicated that spectral components of VOT, primarily F1 onset frequency differences, exert a substantial influence on discrimination and, along with other spectral cues provided by source differences at stimulus onset, can account for the discontinuities in discrimination often reported in research with VOT continua. Large phonetic effects were also seen in the initial performance of all subjects: jnds decreased consistently as standards drew nearer the voicing boundary. However, these effects were absent in the final jnds for most subjects. Implications of these findings for the understanding of basic auditory and attentional processes in speech perception are discussed.


Subject(s)
Phonetics , Psychoacoustics , Speech Perception , Adult , Cross-Cultural Comparison , Cues , Differential Threshold , Humans , Learning , Memory , Sound Spectrography , Speech Acoustics
10.
J Acoust Soc Am; 70(4): 966-75, 1981 Oct.
Article in English | MEDLINE | ID: mdl-7288043

ABSTRACT

Four studies investigated the perceptual effects of spectral variations in fricatives produced in different vowel contexts. The alveolar and palatal fricatives, [s, z, ʃ, ʒ], were produced by two talkers in the context of the vowels [a, i, u], generating 12 fricative-vowel combinations. A computer-controlled editing procedure was used to excise fricative segments of 150-ms duration, as measured back from vowel onset. These excised segments were used as test stimuli in the four experiments. In the first experiment, fricative identification was highly accurate, especially for segments produced in the [a] context. The results of the subsequent three vowel identification experiments revealed that the high vowels [i] and [u] were identified 60%-80% of the time in all fricative contexts, with the exception of [i] produced in the context of [ʃ]. In contrast, identification scores for [a] were close to chance in all fricative contexts. Acoustic analyses of the stimuli revealed that the fricative segments with high vowel identification scores exhibited clear evidence of spectral changes associated with the vowels, while those segments with the highest fricative identification scores exhibited spectra most similar to fricatives produced in isolation. These results, in combination with more extensive acoustic analyses [S. D. Soli, J. Acoust. Soc. Am. 70, 976-984 (1981)], are discussed in terms of variations in the articulatory compatibility of the tongue movements required to produce fricative-vowel sequences.


Subject(s)
Speech Acoustics , Speech Perception/physiology , Speech , Humans , Tongue/physiology
13.
J Exp Psychol Hum Percept Perform; 6(4): 622-38, 1980 Nov.
Article in English | MEDLINE | ID: mdl-6449534

ABSTRACT

How do acoustic attributes of the speech signal contribute to the feature-processing interactions that occur in phonetic classification? In a series of five experiments addressed to this question, listeners performed speeded classification tasks that explicitly required a phonetic decision for each response. Stimuli were natural consonant-vowel syllables differing by multiple phonetic features, although classification responses were based on a single target feature. In control tasks, no variations in nontarget features occurred, whereas in orthogonal tasks nonrelevant feature variations occurred but had to be ignored. Comparison of classification times demonstrated that feature information may be processed either separately, as independent cues for each feature, or as a single integral segment that jointly specifies several features. The observed form of processing depended on the acoustic manifestations of feature variation in the signal. Stop-consonant place of articulation and voicing cues, conveyed independently by the pattern and excitation source of the initial formant transitions, may be processed separately. However, information for consonant place of articulation and vowel quality, features that interactively affect the shape of the initial formant transitions, is processed as an integral segment. Articulatory correlates of each type of processing are discussed in terms of the distinction between source features, which vary discretely in speech production, and resonance features, which can change smoothly and continuously. Implications for perceptual models that include initial segmentation of an input utterance into a phonetic feature representation are also considered.


Subject(s)
Phonetics , Speech Acoustics , Speech Perception , Speech , Female , Humans , Male , Reaction Time
14.
J Acoust Soc Am; 66(1): 46-59, 1979 Jul.
Article in English | MEDLINE | ID: mdl-489832

ABSTRACT

The utility of phonetic features versus acoustic properties for describing perceptual relations among speech sounds was evaluated with a multidimensional scaling analysis of Miller and Nicely's [J. Acoust. Soc. Am. 27, 338-352 (1955)] consonant confusion data. The INDSCAL method and program were employed, with the original data log transformed to enhance consistency with the linear INDSCAL model. A four-dimensional solution accounted for 69% of the variance and was best characterized in terms of acoustic properties of the speech signal, viz., the temporal relationship of periodicity and burst onset, the shape of the voiced first formant transition, the shape of the voiced second formant transition, and the amount of initial spectral dispersion, rather than in terms of phonetic features. The amplitude and spectral location of the acoustic energy specifying each perceptual dimension were found to determine that dimension's perceptual effect as the signal was degraded by masking noise and bandpass filtering. Consequently, the perceptual bases of identification confusions between pairs of syllables were characterized in terms of the shared acoustic properties that remained salient in the degraded speech. Implications of these findings for feature-based accounts of perceptual relationships between phonemes are considered.
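
For reference, the INDSCAL model underlying this analysis places all consonants in one shared perceptual space and lets each listening condition stretch that space along its dimensions. A minimal sketch of the distance computation (model form only, not the fitting procedure):

```python
import numpy as np

def indscal_distance(X, w_k):
    """X: (n_stimuli, n_dims) shared stimulus coordinates; w_k: nonnegative
    dimension weights for condition k. Returns the (n_stimuli, n_stimuli)
    weighted Euclidean distances d_ijk = sqrt(sum_r w_kr * (x_ir - x_jr)**2)."""
    diff = X[:, None, :] - X[None, :, :]           # pairwise coordinate differences
    return np.sqrt((w_k * diff**2).sum(axis=-1))   # weighted Euclidean metric
```

A condition that masks, say, the burst-onset cue would show a small weight on that dimension, collapsing the consonants that differ only along it.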


Subject(s)
Phonetics , Psychoacoustics , Speech Perception , Female , Humans
15.
Mem Cognit; 4(6): 673-6, 1976 Nov.
Article in English | MEDLINE | ID: mdl-21286996

ABSTRACT

In an experiment comparing memory for formal and semantic information, the confounding effects of attentional and response biases were controlled using an adaptation of Sachs' (1967, 1974) method. Subjects attempted to recognize semantic or formal changes in test sentences following short passages of connected discourse. Attentional biases were controlled by using a single type of change in an experimental session, and response biases were controlled with methods from signal detection theory. Semantic recognition scores were consistently above formal scores, both within subjects and within passages, indicating that superiority of semantic performance is attributable to differences in memorability rather than to biases favoring semantic performance. However, formal scores were above chance, suggesting that the poor memory for formal information, as reported previously, may have been due to performance factors.
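
The signal-detection correction mentioned above separates sensitivity from response bias; a minimal sketch using the standard d' and criterion formulas (the paper's exact procedure is not given in this abstract):

```python
from scipy.stats import norm

def dprime_and_criterion(hit_rate, fa_rate):
    """Bias-free sensitivity d' and response criterion c from hit and
    false-alarm rates (rates must be strictly between 0 and 1)."""
    z_h, z_f = norm.ppf(hit_rate), norm.ppf(fa_rate)
    d_prime = z_h - z_f              # sensitivity, independent of bias
    criterion = -0.5 * (z_h + z_f)   # response bias (0 = unbiased)
    return d_prime, criterion

# e.g., hypothetical semantic-change scores:
# dprime_and_criterion(0.85, 0.20) -> (about 1.88, about -0.10)
```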
