1.
Front Psychol ; 10: 2395, 2019.
Article in English | MEDLINE | ID: mdl-31780980

ABSTRACT

A subtle property of speech gestures is that they are spatially and temporally extended: phonological contrasts are expressed using spatially extended constrictions of finite duration. This paper shows how this spatiotemporal particulation of the vocal tract, for the purpose of linguistic signaling, comes about. It is argued that local uniform computations among topographically organized microscopic units, each of which either constricts or relaxes an individual point of the vocal tract, yield the global macroscopic spatiotemporal structures we call constrictions, the locus of phonological contrast. The dynamical process is a morphogenetic one, based on the Turing and Hopf patterns of mathematical physics and biology. It is shown that reaction-diffusion equations, introduced here in a tutorial mathematical style, with simultaneous Turing and Hopf patterns predict the spatiotemporal particulation as well as concrete properties of speech gestures, namely the pivoting of constrictions and the well-documented intermediate value of proportional time to peak velocity. The goal of the paper is to contribute to Bernstein's program of understanding motor processes as the emergence of low-degree-of-freedom descriptions from high-degree-of-freedom systems by pointing to specific, predictive dynamics that yield speech gestures from a reaction-diffusion morphogenetic process.
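As a rough illustration of how local reaction-diffusion dynamics can produce spatially extended macroscopic structure, the sketch below simulates a generic one-dimensional Gray-Scott system. It is not the model used in the paper (which couples Turing and Hopf instabilities), and all parameter values are illustrative assumptions.

import numpy as np

# Minimal 1-D Gray-Scott reaction-diffusion sketch (illustration only).
n, steps, dt = 200, 5000, 1.0
Du, Dv, F, k = 0.16, 0.08, 0.035, 0.060      # assumed demonstration parameters

u = np.ones(n)
v = np.zeros(n)
v[n // 2 - 5 : n // 2 + 5] = 0.5             # local perturbation seeds the pattern
u -= 0.5 * v

def laplacian(x):
    # second difference with periodic boundaries
    return np.roll(x, 1) + np.roll(x, -1) - 2.0 * x

for _ in range(steps):
    uvv = u * v * v
    u += dt * (Du * laplacian(u) - uvv + F * (1.0 - u))
    v += dt * (Dv * laplacian(v) + uvv - (F + k) * v)

print(np.round(u[::20], 2))                  # u settles into spatially periodic bands

Local, uniform update rules are all the simulation contains, yet the field organizes into discrete bands, the kind of macroscopic structure the abstract associates with constrictions.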

2.
J Acoust Soc Am ; 144(2): 897, 2018 08.
Article in English | MEDLINE | ID: mdl-30180671

ABSTRACT

In previous research, mutual information (MI) was employed to quantify the physical information shared between consecutive phonological segments, based on electromagnetic articulography data. In this study, MI is extended to quantifying coarticulatory resistance (CR) versus overlap in German using ultrasound imaging. Two measurements are tested as input to MI: (1) the highest point on the tongue body and (2) the first coefficient of the discrete Fourier transform (DFT) of the whole tongue contour. Both measures are used to examine changes in coarticulation between two time points during the syllable span: the consonant midpoint and the vowel onset. Results corroborate previous findings reporting differences in coarticulatory overlap in German and across languages. Further, results suggest that MI computed from the highest point on the tongue body captures distinctions related to both place and manner of articulation, while the first DFT coefficient does not provide any additional information regarding global (whole tongue) as opposed to local (individual articulator) aspects of CR. However, both methods capture temporal distinctions in coarticulatory resistance between the two time points. Results are discussed with respect to the potential of the MI measure to provide a way of unifying coarticulation quantification methods across data collection techniques.
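The MI computation at the core of this and several of the following studies can be sketched as a simple binned estimator; the estimator, the toy data, and the variable names below are assumptions for illustration, not the authors' exact procedure.

import numpy as np

def mutual_information(x, y, bins=8):
    # Binned MI estimate, in bits, between two 1-D measurement series.
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

# Hypothetical data: an articulatory measure (e.g., highest tongue-body position)
# taken in the consonant, paired per token with the same measure taken in the
# following vowel. High MI = much shared (coarticulated) information;
# MI near zero = invariance / coarticulatory resistance.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 500)
y = 0.7 * x + rng.normal(0.0, 0.7, 500)
print(round(mutual_information(x, y), 3))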


Subject(s)
Language , Phonetics , Speech Acoustics , Adult , Female , Fourier Analysis , Humans , Male , Speech Perception , Ultrasonics/methods , Voice
3.
J Acoust Soc Am ; 138(2): 1221-32, 2015 Aug.
Article in English | MEDLINE | ID: mdl-26328735

ABSTRACT

Regression analysis and mutual information have been used to measure the degree of dependence between a consonant and a vowel, and this has been used to identify the invariance of consonant place and to quantify the coarticulatory resistance of consonants [e.g., Fowler (1994). Percept. Psychophys. 55, 597-610]. This paper presents the first application of this approach to measuring the coarticulatory properties of vowels, using regression analysis and mutual information on articulatory data from CV syllables produced by seven Taiwan Mandarin speakers. The results show that, for the tongue body, vowel /i/ shares the most information with the preceding consonant, whereas vowels /a/ and /u/ do not differ significantly from each other in that respect. For the lip articulator, the degree of information sharing for vowels follows the progression /u/ > /i/ > /a/. Based on the CV model of gestural coordination (the C-V in-phase relation) and the present results, this study proposes that landmark statistics for vowels reflect the degree of vowel aggressiveness and that the V-to-C effect is dominant over the C-to-V effect in C-V coarticulation.


Subject(s)
Phonetics , Speech Perception/physiology , Speech/physiology , Adult , Female , Humans , Linear Models , Lip/physiology , Male , Models, Theoretical , Speech Production Measurement , Tongue/physiology
4.
Lab Phonol ; 5(2): 271-288, 2014 May 01.
Article in English | MEDLINE | ID: mdl-25101144

ABSTRACT

The nature of the links between speech production and perception has been the subject of longstanding debate. The present study investigated the articulatory parameter of tongue height and the acoustic F1-F0 difference for the phonological distinction of vowel height in American English front vowels. Multiple repetitions of /i, ɪ, e, ε, æ/ in [(h)Vd] sequences were recorded from seven adult speakers. Articulatory (ultrasound) and acoustic data were collected simultaneously to provide a direct comparison of variability in vowel production in both domains. Results showed idiosyncratic patterns of articulation for contrasting the three front vowel pairs /i-ɪ/, /e-ε/, and /ε-æ/ across subjects, with the degree of variability in vowel articulation comparable to that observed in the acoustics for all seven participants. However, contrary to expectation, some speakers showed reversals of tongue height for /ɪ/-/e/ that were also reflected in the acoustics, with F1 higher for /ɪ/ than for /e/. The data suggest that the phonological distinction of height is conveyed via speaker-specific articulatory-acoustic patterns that do not strictly match feature descriptions. However, the acoustic signal is faithful to the articulatory configuration that generated it, carrying the crucial information for perceptual contrast.

5.
J Acoust Soc Am ; 134(5): 3808-17, 2013 Nov.
Article in English | MEDLINE | ID: mdl-24180790

ABSTRACT

Previous work has shown that velar stops are produced with a forward movement during closure, forming a forward (anterior) loop for a VCV sequence when the preceding vowel is back or mid. Are listeners aware of this aspect of articulatory dynamics? The current study used articulatory synthesis to examine how such kinematic patterns are reflected in the acoustics, and whether those acoustic patterns elicit different goodness ratings. In Experiment I, the size and direction of the loops were modulated in articulatory synthesis. The resulting stimuli were presented to listeners for a naturalness judgment. Results show that listeners rate forward loops as more natural than backward loops, in agreement with typical productions. Acoustic analysis of the synthetic stimuli shows that forward loops exhibit shorter and shallower VC transitions than CV transitions. In Experiment II, three acoustic parameters (F3-F2 distance, transition slope, and transition length) were employed to systematically modulate the magnitude of the VC and CV transitions. Listeners' naturalness ratings were in accord with those of Experiment I. This study reveals that there is sufficient information in the acoustic signature of "velar loops" to affect perceptual preference. Similarity to typical productions, not acoustic distinctiveness, seemed to determine preferences.


Subject(s)
Speech Acoustics , Speech Perception , Tongue/physiology , Voice Quality , Acoustic Stimulation , Audiometry, Speech , Biomechanical Phenomena , Discrimination, Psychological , Female , Humans , Male , Movement , Pattern Recognition, Physiological , Phonetics , Sound Spectrography , Time Factors
6.
Speech Commun ; 55(1): 147-161, 2013 Jan.
Article in English | MEDLINE | ID: mdl-24052685

ABSTRACT

We present and evaluate two statistical methods for estimating kinematic relationships of the speech production system: Artificial Neural Networks and Locally-Weighted Regression. The work is motivated by the need to characterize this motor system, with particular focus on estimating differential aspects of kinematics. Kinematic analysis will facilitate progress in a variety of areas, including the nature of speech production goals, articulatory redundancy and, relatedly, acoustic-to-articulatory inversion. Statistical methods must be used to estimate these relationships from data, since they are infeasible to express in closed form. Statistical models are optimized and evaluated on two sets of synthetic speech data, using a held-out validation procedure. The theoretical and practical advantages of both methods are also discussed. It is shown that both direct and differential kinematics can be estimated with high accuracy, even for complex, nonlinear relationships. Locally-Weighted Regression displays the best overall performance, which may be due to practical advantages in its training procedure. Moreover, accurate estimation can be achieved using only a modest amount of training data, as judged by convergence of performance. The algorithms are also applied to real-time MRI data, and the results are generally consistent with those obtained from synthetic data.
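A minimal sketch of locally-weighted regression for a forward (direct) kinematic map is given below; the toy "articulatory" and "acoustic" variables, the Gaussian kernel, and the bandwidth are assumptions for illustration, not the paper's configuration.

import numpy as np

def lwr_predict(Xq, X, Y, bandwidth=0.5):
    # Locally-weighted linear regression: one weighted least-squares fit per query.
    # X: (n, d) inputs, Y: (n, m) outputs, Xq: (q, d) query points.
    Xa = np.hstack([X, np.ones((len(X), 1))])            # affine design matrix
    preds = []
    for xq in Xq:
        w = np.exp(-np.sum((X - xq) ** 2, axis=1) / (2 * bandwidth ** 2))
        sw = np.sqrt(w)[:, None]
        beta, *_ = np.linalg.lstsq(sw * Xa, sw * Y, rcond=None)
        # beta[:-1] holds the local slopes, an estimate of the differential
        # (Jacobian) relationship at xq; beta[-1] is the local offset.
        preds.append(np.append(xq, 1.0) @ beta)
    return np.array(preds)

# Toy forward map: two "articulator" dimensions -> one "acoustic" dimension.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, (400, 2))
Y = np.sin(3.0 * X[:, :1]) + X[:, 1:] ** 2 + 0.05 * rng.normal(size=(400, 1))
print(lwr_predict(np.array([[0.2, -0.4], [0.8, 0.1]]), X, Y).round(3))

Because each query gets its own local linear fit, the same machinery yields both the direct prediction and a local estimate of the differential kinematics, which is the aspect the paper emphasizes.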

7.
J Acoust Soc Am ; 134(2): 1271-82, 2013 Aug.
Article in English | MEDLINE | ID: mdl-23927125

ABSTRACT

Coarticulation and invariance are two topics at the center of theorizing about speech production and speech perception. In this paper, a quantitative scale is proposed that places coarticulation and invariance at the two ends of the scale. This scale is based on physical information flow in the articulatory signal, and uses Information Theory, especially the concept of mutual information, to quantify these central concepts of speech research. Mutual Information measures the amount of physical information shared across phonological units. In the proposed quantitative scale, coarticulation corresponds to greater and invariance to lesser information sharing. The measurement scale is tested by data from three languages: German, Catalan, and English. The relation between the proposed scale and several existing theories of coarticulation is discussed, and implications for existing theories of speech production and perception are presented.
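For reference, the mutual information on which the proposed scale is built is the standard information-theoretic quantity (the abstract does not specify the estimator, so only the definition is given here):

I(X;Y) = \sum_{x,y} p(x,y)\,\log_2 \frac{p(x,y)}{p(x)\,p(y)} = H(X) - H(X \mid Y)

where X and Y are the articulatory signals associated with two adjacent phonological units; I(X;Y) = 0 places a pair at the invariance end of the scale, and larger values place it toward the coarticulation end.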


Subject(s)
Motor Skills , Phonation , Phonetics , Speech Acoustics , Speech Intelligibility , Speech Perception , Stomatognathic System/innervation , Voice Quality , Biomechanical Phenomena , Electromagnetic Phenomena , Female , Humans , Information Theory , Linear Models , Male , Speech Production Measurement
8.
J Acoust Soc Am ; 133(1): 444-52, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23297916

ABSTRACT

The present study focuses on differences in lingual coarticulation between French children and adults. The specific question pursued is whether 4-5 year old children have already acquired a synergy observed in adults, in which the tongue back helps the tip in the formation of alveolar consonants. Locus equations, estimated from acoustic and ultrasound imaging data, were used to compare the degree of coarticulation between adults and children and to further investigate differences in motor synergy between the front and back parts of the tongue. Results show similar slope and intercept patterns for adults and children in both the acoustic and articulatory domains, with an effect of place of articulation in both groups between alveolar and non-alveolar consonants. These results suggest that 4-5 year old children (1) have learned the motor synergy investigated and (2) have developed a pattern of coarticulatory resistance that depends on the consonant's place of articulation. They also show that acoustic locus equations can be used to gauge the presence of motor synergies in children.
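A locus equation is itself just a per-consonant linear regression of F2 at the CV-transition onset on F2 at the vowel midpoint; a minimal fit on hypothetical numbers (not this study's data) looks like this:

import numpy as np

# Hypothetical F2 values in Hz for one consonant across many vowel contexts.
rng = np.random.default_rng(2)
f2_mid = rng.uniform(900.0, 2500.0, 60)                        # vowel-midpoint F2
f2_onset = 0.55 * f2_mid + 700.0 + rng.normal(0.0, 60.0, 60)   # transition-onset F2
slope, intercept = np.polyfit(f2_mid, f2_onset, 1)
print(f"slope = {slope:.2f}, intercept = {intercept:.0f} Hz")
# A slope near 1 indicates strong vowel influence on the consonant (low
# coarticulatory resistance); a slope near 0 indicates a stable,
# vowel-independent F2 locus (high resistance).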


Subject(s)
Acoustics , Language Development , Motor Activity , Speech Acoustics , Tongue/diagnostic imaging , Tongue/innervation , Voice Quality , Adult , Age Factors , Child, Preschool , Humans , Regression Analysis , Speech Production Measurement , Ultrasonography , Video Recording
9.
Lab Phonol ; 3(1): 195-210, 2012 May.
Article in English | MEDLINE | ID: mdl-24765216

ABSTRACT

Using the framework of Articulatory Phonology, we offer a phonological account of the allophonic variation undergone by the velar fricative phoneme in Navajo, a Southern or Apachean Athabaskan language spoken in Arizona and New Mexico. The Navajo velar fricative strongly coarticulates with the following vowel, varying in both place and manner of articulation. The variation in this velar fricative seems greater than the variation of velars in many well-studied languages. The coronal central fricatives in the inventory, in contrast, are quite phonetically stable. The back fricative of Navajo thus highlights 1) the linguistic use of an extreme form of coarticulation and 2) the mechanism by which languages can control coarticulation. It is argued that the task dynamic model underlying Articulatory Phonology, with the mechanism of gestural blending controlling coarticulation, can account for the multiplicity of linguistically-controlled ways in which velars coarticulate with surrounding vowels without requiring any changes of input specification due to context. The ability of phonological and morphological constraints to restrict the amount of coarticulation argues against strict separation of phonetics and phonology.
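The gestural blending mechanism invoked here is commonly rendered, in task-dynamic implementations, as a weighted average of the targets of temporally overlapping gestures that share a tract variable; the formulation below is a generic sketch, not a formula quoted from the paper:

x_{\text{target}} = \frac{\alpha_1 T_1 + \alpha_2 T_2}{\alpha_1 + \alpha_2}

Here T_1 and T_2 are the competing gestural targets and \alpha_1, \alpha_2 their blending strengths: a gesture with high blending strength resists contextual influence, while one with low strength, like the Navajo velar fricative, yields to the following vowel without any change to its input specification.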

10.
J Acoust Soc Am ; 129(2): 944-54, 2011 Feb.
Article in English | MEDLINE | ID: mdl-21361451

ABSTRACT

Due to its aerodynamic, articulatory, and acoustic complexities, the fricative /s/ is known to require high precision in its control, and to be highly resistant to coarticulation. This study documents in detail how jaw, tongue front, tongue back, lips, and the first spectral moment covary during the production of /s/, to establish how coarticulation affects this segment. Data were obtained from 24 speakers in the Wisconsin x-ray microbeam database producing /s/ in prevocalic and pre-obstruent sequences. Analysis of the data showed that certain aspects of jaw and tongue motion had specific kinematic trajectories, regardless of context, and the first spectral moment trajectory corresponded to these in some aspects. In particular contexts, variability due to jaw motion is compensated for by tongue-tip motion and bracing against the palate, to maintain an invariant articulatory-aerodynamic goal, constriction degree. The change in the first spectral moment, which rises to a peak at the midpoint of the fricative, primarily reflects the motion of the jaw. Implications of the results for theories of speech motor control and acoustic-articulatory relations are discussed.
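The first spectral moment tracked in this study is the spectral centroid of each analysis frame; a minimal computation (the windowing and pre-emphasis choices here are assumptions, not the study's settings) is:

import numpy as np

def first_spectral_moment(frame, sr):
    # Spectral centroid (first moment) of one windowed frame, in Hz,
    # computed from the power spectrum.
    spec = np.abs(np.fft.rfft(frame * np.hanning(len(frame)))) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    return float(np.sum(freqs * spec) / np.sum(spec))

# Hypothetical /s/-like frame: white noise with a crude high-frequency emphasis.
rng = np.random.default_rng(3)
frame = np.diff(rng.normal(size=1024), prepend=0.0)
print(round(first_spectral_moment(frame, 22050), 1), "Hz")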


Subject(s)
Jaw/physiology , Language , Mouth/physiology , Phonetics , Speech Acoustics , Biomechanical Phenomena , Databases as Topic , Female , Friction , Humans , Jaw/diagnostic imaging , Lip/physiology , Male , Mouth/diagnostic imaging , Radiography , Sound Spectrography , Speech Production Measurement , Tongue/physiology , Young Adult
11.
J Acoust Soc Am ; 128(4): 2021-32, 2010 Oct.
Article in English | MEDLINE | ID: mdl-20968373

ABSTRACT

The study investigated the articulatory basis of locus equations, regression lines relating F2 at the start of a consonant-vowel (CV) transition to F2 at the middle of the vowel, with C fixed and V varying. Several studies have shown that consonants of different places of articulation have locus equation slopes that descend from labial to velar to alveolar, and intercept magnitudes that increase in the opposite order. Using formulas from the theory of bivariate regression that express regression slopes and intercepts in terms of the standard deviations and averages of the variables, it is shown that the slope directly encodes a well-established measure of coarticulatory resistance. It is also shown that intercepts are directly related to the degree to which the tongue body assists the formation of the constriction for the consonant. Moreover, it is shown that the linearity of locus equations, and the linear relation between locus equation slopes and intercepts, originates in linearity in articulation between the horizontal position of the tongue dorsum in the consonant and that in the vowel. It is concluded that the slopes and intercepts of acoustic locus equations are measures of articulator synergy.
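The bivariate-regression identities the abstract appeals to are the standard ones. Writing F2_on for F2 at the transition onset and F2_mid for F2 at the vowel midpoint, the fitted slope b and intercept a satisfy:

b = r \, \frac{\sigma_{F2_{\mathrm{on}}}}{\sigma_{F2_{\mathrm{mid}}}}, \qquad a = \overline{F2_{\mathrm{on}}} - b \, \overline{F2_{\mathrm{mid}}}

where r is the correlation between the two F2 measures and the \sigma terms and overbars denote their standard deviations and means; the paper's contribution is showing how these quantities map onto coarticulatory resistance and tongue-body assistance.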


Subject(s)
Models, Statistical , Mouth/physiology , Phonation , Phonetics , Speech Acoustics , Female , Humans , Jaw/physiology , Lip/physiology , Regression Analysis , Tongue/physiology , Vocal Cords/physiology
12.
J Phon ; 38(3): 375-387, 2010 Jul 01.
Article in English | MEDLINE | ID: mdl-20871808

ABSTRACT

The area function of the vocal tract in all of its spatial detail is not directly computable from the speech signal. But is partial, yet phonetically distinctive, information about articulation recoverable from the acoustic signal that arrives at the listener's ear? The answer to this question is important for phonetics, because various theories of speech perception predict different answers. Some theories assume that recovery of articulatory information must be possible, while others assume that it is impossible. However, neither type of theory provides firm evidence showing that distinctive articulatory information is or is not extractable from the acoustic signal. The present study focuses on vowel gestures and examines whether linguistically significant information, such as constriction location, constriction degree, and rounding, is contained in the speech signal, and whether such information is recoverable from formant parameters. Perturbation theory and linear prediction were combined, in a manner similar to that in Mokhtari (1998) [Mokhtari, P. (1998). An acoustic-phonetic and articulatory study of speech-speaker dichotomy. Doctoral dissertation, University of New South Wales], to assess the accuracy of recovery of information about vowel constrictions. Distinctive constriction information estimated from the speech signal for ten American English vowels was compared to the constriction information derived from simultaneously collected X-ray microbeam articulatory data for 39 speakers [Westbury (1994). X-ray microbeam speech production database user's handbook. University of Wisconsin, Madison, WI]. The recovery of distinctive articulatory information relies on a novel technique that uses formant frequencies and amplitudes, and does not depend on a principal components analysis of the articulatory data, as do most other inversion techniques. These results provide evidence that distinctive articulatory information for vowels can be recovered from the acoustic signal.
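The perturbation-theoretic half of the method rests on the classic sensitivity-function relation between small changes in the area function and shifts in formant frequencies; the first-order statement below is a schematic (sign and normalization conventions differ across sources, and the paper's exact formulation may differ):

\frac{\Delta F_n}{F_n} \approx \int_0^L S_n(x)\,\frac{\Delta A(x)}{A(x)}\,dx

where A(x) is the vocal-tract area function, \Delta A(x) a small local perturbation such as a constriction, and S_n(x) the nth formant's sensitivity function, determined by the spatial distribution of acoustic kinetic and potential energy along the tract. Because each formant weights a given constriction differently, formant frequencies and amplitudes carry recoverable information about constriction location and degree, which linear-prediction estimates of the formants make accessible from the signal.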

13.
J Acoust Soc Am ; 127(6): 3717-28, 2010 Jun.
Article in English | MEDLINE | ID: mdl-20550270

ABSTRACT

This study investigated the degree to which the articulatory trajectory of the tongue dorsum in the production of a vowel-vowel sequence is perceptually relevant. Previous research has shown that the tongue dorsum takes a path that leads to a pattern of area function change, termed the pivot pattern. In this study, articulatory synthesis was used to generate paths of tongue motion for the production of the vowel sequence /ai/. These paths differed in their curvature, leading to stimuli that conform to the pivot pattern and stimuli that violate it. Participants gave naturalness ratings and discriminated the stimuli. The acoustic properties were also compared to acoustic measurements made on productions of /ai/ by 34 speakers. The curvature of the tongue path and the curvature of the F1-F2 trajectory correlate highly with the naturalness-rating task results, but not the discrimination results. However, the particular way in which constriction location changes, particularly whether the change is discrete or continuous, and the maximal velocity of F2 through the transition, explain the perceptual patterns evident in both perception tasks, as well as the patterns in the observed acoustic data. Consequences of these results for the links between production and perception and the segmentation problem are discussed.


Subject(s)
Speech Perception , Speech/physiology , Tongue/physiology , Discrimination, Psychological , Female , Humans , Male , Motor Activity/physiology , Mouth/physiology , Phonetics , Psychoacoustics , Speech Acoustics
14.
J Acoust Soc Am ; 127(3): 1507-18, 2010 Mar.
Article in English | MEDLINE | ID: mdl-20329851

ABSTRACT

A structural magnetic resonance imaging study has revealed that pharyngeal articulation varies considerably with voicing during the production of English fricatives. In a study of four speakers of American English, pharyngeal volume was generally found to be greater during the production of sustained voiced fricatives, compared to voiceless equivalents. Though pharyngeal expansion is expected for voiced stops, it is more surprising for voiced fricatives. For three speakers, all four voiced oral fricatives were produced with a larger pharynx than that used during the production of the voiceless fricative at the same place of articulation. For one speaker, pharyngeal volume during the production of voiceless labial fricatives was found to be greater, and sibilant pharyngeal volume varied with vocalic context as well as voicing. Pharyngeal expansion was primarily achieved through forward displacement of the anterior and lateral walls of the upper pharynx, but some displacement of the rear pharyngeal wall was also observed. These results suggest that the production of voiced fricatives involves the complex interaction of articulatory constraints from three separate goals: the formation of the appropriate oral constriction, the control of airflow through the constriction so as to achieve frication, and the maintenance of glottal oscillation by attending to transglottal pressure.


Subject(s)
Magnetic Resonance Imaging , Pharynx/anatomy & histology , Pharynx/physiology , Speech/physiology , Voice/physiology , Adult , Female , Glottis/anatomy & histology , Glottis/physiology , Humans , Larynx/anatomy & histology , Larynx/physiology , Male , Models, Biological , Phonetics , Young Adult
15.
J Phon ; 38(4): 625-639, 2010 Oct 01.
Article in English | MEDLINE | ID: mdl-21218130

ABSTRACT

Russian maintains a contrast between non-palatalized and palatalized trills that has been lost in most Slavic languages. This research investigates the phonetic expression of this contrast in an attempt to understand how the contrast is maintained. One hypothesis is that the contrast is stabilized through resistance to coarticulation between the trill and surrounding vowels and to prosodic positional weakening effects, both factors expected to weaken the contrast. In order to test this hypothesis, we investigate intrasegmental and intersegmental coarticulation and the effect of domain boundaries on Russian trills. Since trills are highly demanding articulatorily and aerodynamically, and since Russian trills are in contrast, there is an expectation that they will be highly resistant to coarticulation and to prosodic influence. This study shows, however, that phonetic variability due to domain boundaries and coarticulation is systematically present in Russian trills. Implications of the relation between prosodic position and lingual coarticulation for the Degree of Articulatory Constraint (DAC) model, Articulatory Phonology, and the literature on prosodic strength are discussed. Based on the quantitative analysis of phonetic variability in Russian trills, we offer a hypothesis as to why the contrast in trills is maintained in Russian but lost in other Slavic languages. Specifically, phonological strategies used by several Slavic languages to deal with the instability of Proto-Slavic palatalized trills are present phonetically in Russian. These phonetic tendencies structure the variability of Russian trills, and could be the source of contrast stabilization.

16.
Clin Linguist Phon ; 19(6-7): 555-65, 2005.
Article in English | MEDLINE | ID: mdl-16206483

ABSTRACT

The goal of this paper is to provide a tutorial introduction to the topic of edge detection of the tongue from ultrasound scans for researchers in speech science and phonetics. The method introduced here is Active Contours (also called snakes), a method for searching for an edge, assuming that it is a smooth curve in the image data. The advantage of this approach is that it is robust to the noisy speckle that clouds edges. This method has been implemented in several software packages currently used for detecting the edge of the tongue in ultrasound images. The tutorial concludes with an overview of the scale-space and Kalman filter approaches, state-of-the-art developments in image processing that will likely influence work on tongue edge detection in the coming years.
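As a concrete, hedged illustration of the snake idea, the sketch below applies scikit-image's generic active-contour routine to a synthetic frame; it is not any of the tongue-tracking packages the tutorial surveys, and the parameter values are assumptions.

import numpy as np
from skimage import filters, segmentation

# Synthetic "ultrasound-like" frame: noise plus one bright, tongue-shaped ridge.
rng = np.random.default_rng(4)
frame = rng.normal(0.0, 0.1, (200, 300))
cols = np.arange(300)
ridge_rows = (110 - 0.0005 * (cols - 150) ** 2).astype(int)
frame[ridge_rows, cols] += 1.0
smoothed = filters.gaussian(frame, sigma=3)      # snakes need a smooth gradient field

# Initial open curve of (row, col) points placed near the expected edge.
init = np.column_stack([np.full(100, 106.0), np.linspace(20, 280, 100)])
snake = segmentation.active_contour(
    smoothed, init,
    alpha=0.015, beta=10.0, w_edge=1.0, gamma=0.001,
    boundary_condition="fixed",                  # open curve, endpoints pinned
)
print(snake.shape)                               # (100, 2) fitted points along the edge

The alpha and beta terms penalize stretching and bending of the curve, which is exactly the smoothness assumption the tutorial describes as making snakes robust to speckle noise.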


Subject(s)
Tongue/anatomy & histology , Tongue/diagnostic imaging , Humans , Image Enhancement , Image Processing, Computer-Assisted/methods , Ultrasonography
17.
J Speech Lang Hear Res ; 48(3): 543-53, 2005 Jun.
Article in English | MEDLINE | ID: mdl-16197271

ABSTRACT

The tongue is critical in the production of speech, yet its nature has made it difficult to measure. Not only does its ability to attain complex shapes make it difficult to track, but it is also largely hidden from view during speech. The present article describes a new combination of optical tracking and ultrasound imaging that allows for a noninvasive, real-time view of most of the tongue surface during running speech. The optical system (Optotrak) tracks the location of external structures in 3-dimensional space using infrared emitting diodes (IREDs). By tracking 3 or more IREDs on the head and a similar number on an ultrasound transducer, the transduced image of the tongue can be corrected for the motion of both the head and the transducer and thus be represented relative to the hard structures of the vocal tract. If structural magnetic resonance images of the speaker are available, they may allow the estimation of the location of the rear pharyngeal wall as well. This new technique is contrasted with other currently available options for imaging the tongue. It promises to provide high-quality, relatively low-cost imaging of most of the tongue surface during fairly unconstrained speech.
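The head- and probe-motion correction described here reduces to estimating a rigid-body transform from three or more tracked markers and re-expressing image points in a head-fixed frame. The sketch below uses a generic least-squares (Kabsch) alignment with hypothetical coordinates; it is not the authors' software.

import numpy as np

def rigid_transform(P, Q):
    # Least-squares rotation R and translation t with R @ p + t ~= q (Kabsch),
    # from n >= 3 non-collinear matched marker positions P, Q of shape (n, 3).
    Pc, Qc = P - P.mean(axis=0), Q - Q.mean(axis=0)
    U, _, Vt = np.linalg.svd(Pc.T @ Qc)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])   # no reflections
    R = Vt.T @ D @ U.T
    t = Q.mean(axis=0) - R @ P.mean(axis=0)
    return R, t

# Hypothetical head markers in a reference frame (P) and in the current frame (Q).
P = np.array([[0.0, 0, 0], [10, 0, 0], [0, 10, 0], [0, 0, 10]])
R_true = np.array([[0.0, -1, 0], [1, 0, 0], [0, 0, 1]])           # 90 deg about z
Q = P @ R_true.T + np.array([5.0, -2.0, 1.0])
R, t = rigid_transform(P, Q)

# Map points measured in the current frame (e.g., tongue-edge points from the
# ultrasound image, already in tracker space) back into the head-fixed frame.
current_pts = np.array([[3.0, 4.0, 0.0]])
head_frame_pts = (current_pts - t) @ R           # equivalent to R.T @ (x - t)
print(np.round(head_frame_pts, 3))

The same transform, estimated from the probe markers, places the ultrasound image plane in tracker space before the head correction is applied.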


Subject(s)
Movement/physiology , Speech/physiology , Tongue/diagnostic imaging , Humans , Imaging, Three-Dimensional , Infrared Rays , Magnetic Resonance Imaging , Palate/diagnostic imaging , Ultrasonography , Videotape Recording
18.
Clin Linguist Phon ; 18(6-8): 507-21, 2004.
Article in English | MEDLINE | ID: mdl-15573487

ABSTRACT

The tongue is a deformable object, and moves by compressing or expanding local functional segments. For any single phoneme, these functional tongue segments may move in similar or opposite directions, and may reach their target maxima synchronously or not. This paper discusses the independence of five proposed segments in the production of speech. Three studies used ultrasound and tagged cine-MRI to explore the independence of the tongue segments. High correlations between tongue segments would suggest passive biomechanical constraints, and low correlations would suggest active independent control. Both physiological and higher-level linguistic constraints were seen in the correlation patterns. Physiological constraints were supported by high correlations between adjacent segments (positive) and between distant segments (negative). Linguistic constraints were supported by segmental correlations that changed with the phonemic content of the task.


Subject(s)
Muscle Contraction/physiology , Speech/physiology , Tongue/physiology , Adult , Humans , Imaging, Three-Dimensional , Magnetic Resonance Imaging , Male , Models, Biological , Tongue/diagnostic imaging , Tongue/innervation , Ultrasonography
19.
Lang Speech ; 47(Pt 2): 155-74, 2004.
Article in English | MEDLINE | ID: mdl-15581190

ABSTRACT

The ability of speakers to exaggerate speech sounds ("hyperarticulation") has led to the theory that the targets themselves must be hyperarticulated. Johnson, Flemming, and Wright (1993) found that perceptual "best exemplar" choices for vowels were more extreme than listeners' own productions. Our first experiment, using their procedure, only partially replicated their results. In perception, low vowels showed a higher F1, consistent with hyperspace. Front vowels also showed more frontness in F2, but back vowels were less extreme ("hypoarticulated") on F2. Our second experiment used identification and rating of each stimulus, yielding similar results of a smaller magnitude. Our results indicate that the perceptual space is calibrated to a particular (synthetic) vowel space, which is not related straightforwardly to the speakers' own spaces. The original hyperspace hypothesis can be attributed to the methodology, which led to extreme judgments, and to the fronting of back vowels in California English. The present results indicate that no such hypothesis is needed. Vowel targets are measurable from an individual's productions, and the individual's perception of other speakers (even synthetic ones) is based on information about the vocal tract and dialect of the speaker.


Subject(s)
Phonation , Speech Perception , Adult , Female , Humans , Male , Speech Discrimination Tests , Verbal Behavior