Pesquisa | Portal Regional da BVS

1.

Variation in compensatory strategies as a function of target constriction degree in post-glossectomy speech.

Hagedorn, Christina; Lu, Yijing; Toutios, Asterios; Sinha, Uttam; Goldstein, Louis; Narayanan, Shrikanth.

JASA Express Lett ; 2(4): 045205, 2022 Apr.

Artigo em Inglês | MEDLINE | ID: mdl-35495774

RESUMO

Individuals who have undergone treatment for oral cancer oftentimes exhibit compensatory behavior in consonant production. This pilot study investigates whether compensatory mechanisms utilized in the production of speech sounds with a given target constriction location vary systematically depending on target manner of articulation. The data reveal that compensatory strategies used to produce target alveolar segments vary systematically as a function of target manner of articulation in subtle yet meaningful ways. When target constriction degree at a particular constriction location cannot be preserved, individuals may leverage their ability to finely modulate constriction degree at multiple constriction locations along the vocal tract.

2.

A multispeaker dataset of raw and reconstructed speech production real-time MRI video and 3D volumetric images.

Lim, Yongwan; Toutios, Asterios; Bliesener, Yannick; Tian, Ye; Lingala, Sajan Goud; Vaz, Colin; Sorensen, Tanner; Oh, Miran; Harper, Sarah; Chen, Weiyi; Lee, Yoonjeong; Töger, Johannes; Monteserin, Mairym Lloréns; Smith, Caitlin; Godinez, Bianca; Goldstein, Louis; Byrd, Dani; Nayak, Krishna S; Narayanan, Shrikanth S.

Sci Data ; 8(1): 187, 2021 07 20.

Artigo em Inglês | MEDLINE | ID: mdl-34285240

RESUMO

Real-time magnetic resonance imaging (RT-MRI) of human speech production is enabling significant advances in speech science, linguistics, bio-inspired speech technology development, and clinical applications. Easy access to RT-MRI is however limited, and comprehensive datasets with broad access are needed to catalyze research across numerous domains. The imaging of the rapidly moving articulators and dynamic airway shaping during speech demands high spatio-temporal resolution and robust reconstruction methods. Further, while reconstructed images have been published, to-date there is no open dataset providing raw multi-coil RT-MRI data from an optimized speech production experimental setup. Such datasets could enable new and improved methods for dynamic image reconstruction, artifact correction, feature extraction, and direct extraction of linguistically-relevant biomarkers. The present dataset offers a unique corpus of 2D sagittal-view RT-MRI videos along with synchronized audio for 75 participants performing linguistically motivated speech tasks, alongside the corresponding public domain raw RT-MRI data. The dataset also includes 3D volumetric vocal tract MRI during sustained speech sounds and high-resolution static anatomical T2-weighted upper airway MRI for each participant.

Assuntos

Laringe/fisiologia , Imageamento por Ressonância Magnética/métodos , Fala , Adolescente , Adulto , Sistemas Computacionais , Feminino , Humanos , Masculino , Pessoa de Meia-Idade , Fatores de Tempo , Gravação em Vídeo , Adulto Jovem

3.

How an aglossic speaker produces an alveolar-like percept without a functional tongue tip.

Toutios, Asterios; Xu, Melissa; Byrd, Dani; Goldstein, Louis; Narayanan, Shrikanth.

J Acoust Soc Am ; 147(6): EL460, 2020 06.

Artigo em Inglês | MEDLINE | ID: mdl-32611190

RESUMO

It has been previously observed [McMicken, Salles, Berg, Vento-Wilson, Rogers, Toutios, and Narayanan. (2017). J. Commun. Disorders, Deaf Stud. Hear. Aids 5(2), 1-6] using real-time magnetic resonance imaging that a speaker with severe congenital tongue hypoplasia (aglossia) had developed a compensatory articulatory strategy where she, in the absence of a functional tongue tip, produced a plosive consonant perceptually similar to /d/ using a bilabial constriction. The present paper provides an updated account of this strategy. It is suggested that the previously observed compensatory bilabial closing that occurs during this speaker's /d/ production is consistent with vocal tract shaping resulting from hyoid raising created with mylohyoid action, which may also be involved in typical /d/ production. Simulating this strategy in a dynamic articulatory synthesis experiment leads to the generation of /d/-like formant transitions.

Assuntos

Língua , Voz , Feminino , Humanos , Fonética , Fala , Língua/diagnóstico por imagem

4.

Vocal tract shaping of emotional speech.

Kim, Jangwon; Toutios, Asterios; Lee, Sungbok; Narayanan, Shrikanth S.

Comput Speech Lang ; 642020 Nov.

Artigo em Inglês | MEDLINE | ID: mdl-32523241

RESUMO

Emotional speech production has been previously studied using fleshpoint tracking data in speaker-specific experiment setups. The present study introduces a real-time magnetic resonance imaging database of emotional speech production from 10 speakers and presents articulatory analysis results of speech emotional expression using the database. Midsagittal vocal tract parameters (midsagittal distances and the vocal tract length) were parameterized based on a two-dimensional grid-line system, using image segmentation software. The principal feature analysis technique was applied to the grid-line system in order to find the major movement locations. Results reveal both speaker-dependent and speaker-independent variation patterns. For example, sad speech, a low arousal emotion, tends to show smaller opening for low vowels in the front cavity than the high arousal emotions more consistently than the other regions of the vocal tract. Happiness shows significantly shorter vocal tract length than anger and sadness in most speakers. Further details of speaker-dependent and speaker-independent speech articulation variation in emotional expression and their implications are described.

5.

Task-dependence of articulator synergies.

Sorensen, Tanner; Toutios, Asterios; Goldstein, Louis; Narayanan, Shrikanth.

J Acoust Soc Am ; 145(3): 1504, 2019 03.

Artigo em Inglês | MEDLINE | ID: mdl-31067947

RESUMO

In speech production, the motor system organizes articulators such as the jaw, tongue, and lips into synergies whose function is to produce speech sounds by forming constrictions at the phonetic places of articulation. The present study tests whether synergies for different constriction tasks differ in terms of inter-articulator coordination. The test is conducted on utterances [ÉpÉ], [ÉtÉ], [ÉiÉ], and [ÉkÉ] with a real-time magnetic resonance imaging biomarker that is computed using a statistical model of the forward kinematics of the vocal tract. The present study is the first to estimate the forward kinematics of the vocal tract from speech production data. Using the imaging biomarker, the study finds that the jaw contributes least to the velar stop for [k], more to pharyngeal approximation for [É], still more to palatal approximation for [i], and most to the coronal stop for [t]. Additionally, the jaw contributes more to the coronal stop for [t] than to the bilabial stop for [p]. Finally, the study investigates how this pattern of results varies by participant. The study identifies differences in inter-articulator coordination by constriction task, which support the claim that inter-articulator coordination differs depending on the active articulator synergy.

Assuntos

Fala , Voz/fisiologia , Adulto , Fenômenos Biomecânicos , Feminino , Humanos , Arcada Osseodentária/diagnóstico por imagem , Arcada Osseodentária/fisiologia , Laringe/diagnóstico por imagem , Laringe/fisiologia , Imageamento por Ressonância Magnética , Masculino , Faringe/diagnóstico por imagem , Faringe/fisiologia , Fonética , Desempenho Psicomotor

6.

A modular architecture for articulatory synthesis from gestural specification.

Alexander, Rachel; Sorensen, Tanner; Toutios, Asterios; Narayanan, Shrikanth.

J Acoust Soc Am ; 146(6): 4458, 2019 12.

Artigo em Inglês | MEDLINE | ID: mdl-31893678

RESUMO

This paper proposes a modular architecture for articulatory synthesis from a gestural specification comprising relatively simple models for the vocal tract, the glottis, aero-acoustics, and articulatory control. The vocal tract module combines a midsagittal statistical analysis articulatory model, derived by factor analysis of air-tissue boundaries in real-time magnetic resonance imaging data, with an αß model for converting midsagittal section to area function specifications. The aero-acoustics and glottis models were based on a software implementation of classic work by Maeda. The articulatory control module uses dynamical systems, which implement articulatory gestures, to animate the statistical articulatory model, inspired by the task dynamics model. Results on synthesizing vowel-consonant-vowel sequences with plosive consonants, using models that were built on data from, and simulate the behavior of, two different speakers are presented.

Assuntos

Glote/fisiologia , Fonética , Acústica da Fala , Fala/fisiologia , Acústica , Gestos , Humanos

7.

Test-retest repeatability of human speech biomarkers from static and real-time dynamic magnetic resonance imaging.

Töger, Johannes; Sorensen, Tanner; Somandepalli, Krishna; Toutios, Asterios; Lingala, Sajan Goud; Narayanan, Shrikanth; Nayak, Krishna.

J Acoust Soc Am ; 141(5): 3323, 2017 05.

Artigo em Inglês | MEDLINE | ID: mdl-28599561

RESUMO

Static anatomical and real-time dynamic magnetic resonance imaging (RT-MRI) of the upper airway is a valuable method for studying speech production in research and clinical settings. The test-retest repeatability of quantitative imaging biomarkers is an important parameter, since it limits the effect sizes and intragroup differences that can be studied. Therefore, this study aims to present a framework for determining the test-retest repeatability of quantitative speech biomarkers from static MRI and RT-MRI, and apply the framework to healthy volunteers. Subjects (n = 8, 4 females, 4 males) are imaged in two scans on the same day, including static images and dynamic RT-MRI of speech tasks. The inter-study agreement is quantified using intraclass correlation coefficient (ICC) and mean within-subject standard deviation (σe). Inter-study agreement is strong to very strong for static measures (ICC: min/median/max 0.71/0.89/0.98, σe: 0.90/2.20/6.72 mm), poor to strong for dynamic RT-MRI measures of articulator motion range (ICC: 0.26/0.75/0.90, σe: 1.6/2.5/3.6 mm), and poor to very strong for velocities (ICC: 0.21/0.56/0.93, σe: 2.2/4.4/16.7 cm/s). In conclusion, this study characterizes repeatability of static and dynamic MRI-derived speech biomarkers using state-of-the-art imaging. The introduced framework can be used to guide future development of speech biomarkers. Test-retest MRI data are provided free for research use.

Assuntos

Laringe/diagnóstico por imagem , Imageamento por Ressonância Magnética , Boca/diagnóstico por imagem , Faringe/diagnóstico por imagem , Fala , Acústica , Adulto , Pontos de Referência Anatômicos , Fenômenos Biomecânicos , Feminino , Humanos , Laringe/fisiologia , Masculino , Boca/fisiologia , Faringe/fisiologia , Valor Preditivo dos Testes , Reprodutibilidade dos Testes , Medida da Produção da Fala , Fatores de Tempo , Adulto Jovem

8.

Feasibility of through-time spiral generalized autocalibrating partial parallel acquisition for low latency accelerated real-time MRI of speech.

Lingala, Sajan Goud; Zhu, Yinghua; Lim, Yongwan; Toutios, Asterios; Ji, Yunhua; Lo, Wei-Ching; Seiberlich, Nicole; Narayanan, Shrikanth; Nayak, Krishna S.

Magn Reson Med ; 78(6): 2275-2282, 2017 Dec.

Artigo em Inglês | MEDLINE | ID: mdl-28185301

RESUMO

PURPOSE: To evaluate the feasibility of through-time spiral generalized autocalibrating partial parallel acquisition (GRAPPA) for low-latency accelerated real-time MRI of speech. METHODS: Through-time spiral GRAPPA (spiral GRAPPA), a fast linear reconstruction method, is applied to spiral (k-t) data acquired from an eight-channel custom upper-airway coil. Fully sampled data were retrospectively down-sampled to evaluate spiral GRAPPA at undersampling factors R = 2 to 6. Pseudo-golden-angle spiral acquisitions were used for prospective studies. Three subjects were imaged while performing a range of speech tasks that involved rapid articulator movements, including fluent speech and beat-boxing. Spiral GRAPPA was compared with view sharing, and a parallel imaging and compressed sensing (PI-CS) method. RESULTS: Spiral GRAPPA captured spatiotemporal dynamics of vocal tract articulators at undersampling factors ≤4. Spiral GRAPPA at 18 ms/frame and 2.4 mm2 /pixel outperformed view sharing in depicting rapidly moving articulators. Spiral GRAPPA and PI-CS provided equivalent temporal fidelity. Reconstruction latency per frame was 14 ms for view sharing and 116 ms for spiral GRAPPA, using a single processor. Spiral GRAPPA kept up with the MRI data rate of 18ms/frame with eight processors. PI-CS required 17 minutes to reconstruct 5 seconds of dynamic data. CONCLUSION: Spiral GRAPPA enabled 4-fold accelerated real-time MRI of speech with a low reconstruction latency. This approach is applicable to wide range of speech RT-MRI experiments that benefit from real-time feedback while visualizing rapid articulator movement. Magn Reson Med 78:2275-2282, 2017. © 2017 International Society for Magnetic Resonance in Medicine.

Assuntos

Laringe/diagnóstico por imagem , Imageamento por Ressonância Magnética , Fala , Algoritmos , Artefatos , Calibragem , Epiglote/diagnóstico por imagem , Humanos , Aumento da Imagem , Processamento de Imagem Assistida por Computador , Modelos Estatísticos , Faringe/diagnóstico por imagem , Estudos Prospectivos , Reprodutibilidade dos Testes , Estudos Retrospectivos , Sensibilidade e Especificidade , Processamento de Sinais Assistido por Computador , Software

9.

A fast and flexible MRI system for the study of dynamic vocal tract shaping.

Lingala, Sajan Goud; Zhu, Yinghua; Kim, Yoon-Chul; Toutios, Asterios; Narayanan, Shrikanth; Nayak, Krishna S.

Magn Reson Med ; 77(1): 112-125, 2017 01.

Artigo em Inglês | MEDLINE | ID: mdl-26778178

RESUMO

PURPOSE: The aim of this work was to develop and evaluate an MRI-based system for study of dynamic vocal tract shaping during speech production, which provides high spatial and temporal resolution. METHODS: The proposed system utilizes (a) custom eight-channel upper airway coils that have high sensitivity to upper airway regions of interest, (b) two-dimensional golden angle spiral gradient echo acquisition, (c) on-the-fly view-sharing reconstruction, and (d) off-line temporal finite difference constrained reconstruction. The system also provides simultaneous noise-cancelled and temporally aligned audio. The system is evaluated in 3 healthy volunteers, and 1 tongue cancer patient, with a broad range of speech tasks. RESULTS: We report spatiotemporal resolutions of 2.4 × 2.4 mm2 every 12 ms for single-slice imaging, and 2.4 × 2.4 mm2 every 36 ms for three-slice imaging, which reflects roughly 7-fold acceleration over Nyquist sampling. This system demonstrates improved temporal fidelity in capturing rapid vocal tract shaping for tasks, such as producing consonant clusters in speech, and beat-boxing sounds. Novel acoustic-articulatory analysis was also demonstrated. CONCLUSION: A synergistic combination of custom coils, spiral acquisitions, and constrained reconstruction enables visualization of rapid speech with high spatiotemporal resolution in multiple planes. Magn Reson Med 77:112-125, 2017. © 2016 Wiley Periodicals, Inc.

Assuntos

Processamento de Imagem Assistida por Computador/métodos , Imageamento por Ressonância Magnética/métodos , Razão Sinal-Ruído , Espectrografia do Som/métodos , Fala/fisiologia , Prega Vocal/diagnóstico por imagem , Adulto , Algoritmos , Feminino , Humanos , Masculino , Processamento de Sinais Assistido por Computador , Neoplasias da Língua/diagnóstico por imagem

10.

Advances in real-time magnetic resonance imaging of the vocal tract for speech science and technology research.

Toutios, Asterios; Narayanan, Shrikanth S.

APSIPA Trans Signal Inf Process ; 52016.

Artigo em Inglês | MEDLINE | ID: mdl-27833745

RESUMO

Real-time magnetic resonance imaging (rtMRI) of the moving vocal tract during running speech production is an important emerging tool for speech production research providing dynamic information of a speaker's upper airway from the entire mid-sagittal plane or any other scan plane of interest. There have been several advances in the development of speech rtMRI and corresponding analysis tools, and their application to domains such as phonetics and phonological theory, articulatory modeling, and speaker characterization. An important recent development has been the open release of a database that includes speech rtMRI data from five male and five female speakers of American English each producing 460 phonetically balanced sentences. The purpose of the present paper is to give an overview and outlook of the advances in rtMRI as a tool for speech research and technology development.

11.

A kinematic study of critical and non-critical articulators in emotional speech production.

Kim, Jangwon; Toutios, Asterios; Lee, Sungbok; Narayanan, Shrikanth S.

J Acoust Soc Am ; 137(3): 1411-29, 2015 Mar.

Artigo em Inglês | MEDLINE | ID: mdl-25786953

RESUMO

This study explores one aspect of the articulatory mechanism that underlies emotional speech production, namely, the behavior of linguistically critical and non-critical articulators in the encoding of emotional information. The hypothesis is that the possible larger kinematic variability in the behavior of non-critical articulators enables revealing underlying emotional expression goal more explicitly than that of the critical articulators; the critical articulators are strictly controlled in service of achieving linguistic goals and exhibit smaller kinematic variability. This hypothesis is examined by kinematic analysis of the movements of critical and non-critical speech articulators gathered using eletromagnetic articulography during spoken expressions of five categorical emotions. Analysis results at the level of consonant-vowel-consonant segments reveal that critical articulators for the consonants show more (less) peripheral articulations during production of the consonant-vowel-consonant syllables for high (low) arousal emotions, while non-critical articulators show less sensitive emotional variation of articulatory position to the linguistic gestures. Analysis results at the individual phonetic targets show that overall, between- and within-emotion variability in articulatory positions is larger for non-critical cases than for critical cases. Finally, the results of simulation experiments suggest that the postural variation of non-critical articulators depending on emotion is significantly associated with the controls of critical articulators.

12.

Real-time magnetic resonance imaging and electromagnetic articulography database for speech production research (TC).

Narayanan, Shrikanth; Toutios, Asterios; Ramanarayanan, Vikram; Lammert, Adam; Kim, Jangwon; Lee, Sungbok; Nayak, Krishna; Kim, Yoon-Chul; Zhu, Yinghua; Goldstein, Louis; Byrd, Dani; Bresch, Erik; Ghosh, Prasanta; Katsamanis, Athanasios; Proctor, Michael.

J Acoust Soc Am ; 136(3): 1307, 2014 Sep.

Artigo em Inglês | MEDLINE | ID: mdl-25190403

RESUMO

USC-TIMIT is an extensive database of multimodal speech production data, developed to complement existing resources available to the speech research community and with the intention of being continuously refined and augmented. The database currently includes real-time magnetic resonance imaging data from five male and five female speakers of American English. Electromagnetic articulography data have also been presently collected from four of these speakers. The two modalities were recorded in two independent sessions while the subjects produced the same 460 sentence corpus used previously in the MOCHA-TIMIT database. In both cases the audio signal was recorded and synchronized with the articulatory data. The database and companion software are freely available to the research community.

Assuntos

Acústica , Pesquisa Biomédica , Bases de Dados Factuais , Fenômenos Eletromagnéticos , Imageamento por Ressonância Magnética , Faringe/fisiologia , Acústica da Fala , Medida da Produção da Fala , Qualidade da Voz , Acústica/instrumentação , Adulto , Fenômenos Biomecânicos , Feminino , Humanos , Masculino , Pessoa de Meia-Idade , Faringe/anatomia & histologia , Processamento de Sinais Assistido por Computador , Software , Medida da Produção da Fala/instrumentação , Fatores de Tempo , Transdutores

13.

Estimating the control parameters of an articulatory model from electromagnetic articulograph data.

Toutios, Asterios; Ouni, Slim; Laprie, Yves.

J Acoust Soc Am ; 129(5): 3245-57, 2011 May.

Artigo em Inglês | MEDLINE | ID: mdl-21568426

RESUMO

Finding the control parameters of an articulatory model that result in given acoustics is an important problem in speech research. However, one should also be able to derive the same parameters from measured articulatory data. In this paper, a method to estimate the control parameters of the the model by Maeda from electromagnetic articulography (EMA) data, which allows the derivation of full sagittal vocal tract slices from sparse flesh-point information, is presented. First, the articulatory grid system involved in the model's definition is adapted to the speaker involved in the experiment, and EMA data are registered to it automatically. Then, articulatory variables that correspond to measurements defined by Maeda on the grid are extracted. An initial solution for the articulatory control parameters is found by a least-squares method, under constraints ensuring vocal tract shape naturalness. Dynamic smoothness of the parameter trajectories is then imposed by a variational regularization method. Generated vocal tract slices for vowels are compared with slices appearing in magnetic resonance images of the same speaker or found in the literature. Formants synthesized on the basis of these generated slices are adequately close to those tracked in real speech recorded concurrently with EMA.

Assuntos

Incisivo/fisiologia , Laringe/fisiologia , Lábio/fisiologia , Imageamento por Ressonância Magnética/métodos , Modelos Biológicos , Fonação/fisiologia , Língua/fisiologia , Antropometria , Orelha , Feminino , Humanos , Nariz , Palato/fisiologia , Faringe/fisiologia

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA