Results 1 - 8 of 8
1.
J Acoust Soc Am ; 153(3): 1867, 2023 03.
Article in English | MEDLINE | ID: mdl-37002095

ABSTRACT

In speech production, the anatomical morphology forms the substrate on which the speakers build their articulatory strategy to reach specific articulatory-acoustic goals. The aim of this study is to characterize morphological inter-speaker variability by building a shape model of the full vocal tract including hard and soft structures. Static magnetic resonance imaging data from 41 speakers articulating altogether 1947 phonemes were considered, and the midsagittal articulator contours were manually outlined. A phoneme-independent average-articulation representative of morphology was calculated as the speaker mean articulation. A principal component analysis-driven shape model was derived from average-articulations, leading to five morphological components, which explained 87% of the variance. Almost three-quarters of the variance was related to independent variations of the horizontal oral and vertical pharyngeal lengths, the latter capturing male-female differences. The three additional components captured shape variations related to head tilt and palate shape. Plane wave propagation acoustic simulations were run to characterize morphological components. A lengthening of 1 cm of the vocal tract in the vertical or horizontal directions led to a decrease in formant values of 7%-8%. Further analyses are required to analyze three-dimensional variability and to understand the morphological-acoustic relationships per phoneme. Average-articulations and model code are publicly available (https://github.com/tonioser/VTMorphologicalModel).
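The link between vocal tract length and formant values can be illustrated with a textbook simplification: a uniform tube closed at the glottis and open at the lips, whose resonances are F_n = (2n - 1) c / (4 L). This is only a sketch and is not the plane wave propagation simulation used in the study; the speed of sound, the tube lengths and the function names below are assumptions chosen for illustration.

```python
# Uniform-tube approximation (closed at the glottis, open at the lips).
# Assumption: this quarter-wavelength formula stands in for the study's
# plane wave propagation simulations, which model the real area function.

SPEED_OF_SOUND_CM_S = 35000.0  # assumed speed of sound in warm, humid air


def tube_formants(length_cm, n_formants=3):
    """Resonances F_n = (2n - 1) * c / (4 * L) of a uniform tube, in Hz."""
    return [(2 * n - 1) * SPEED_OF_SOUND_CM_S / (4.0 * length_cm)
            for n in range(1, n_formants + 1)]


def relative_formant_drop(length_cm, extra_cm=1.0):
    """Fractional decrease of every formant when the tube is lengthened."""
    return 1.0 - length_cm / (length_cm + extra_cm)


if __name__ == "__main__":
    for length in (13.0, 15.0, 17.0):  # hypothetical vocal tract lengths
        f1 = tube_formants(length)[0]
        drop = relative_formant_drop(length)
        print(f"L = {length:.0f} cm: F1 = {f1:.0f} Hz, drop = {100 * drop:.1f}%")
```

In this simplified model a 1 cm lengthening lowers every formant by 1/(L + 1), which is about 5.6% for a 17 cm tube and about 7.1% for a 13 cm tube. That is the same order of magnitude as the 7%-8% reported in the abstract, but the sketch does not reproduce the study's result.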


Subject(s)
Speech Acoustics , Voice , Male , Female , Humans , Phonetics , Speech , Acoustics
2.
Sensors (Basel) ; 22(8), 2022 Apr 10.
Article in English | MEDLINE | ID: mdl-35458885

ABSTRACT

Cough is a very common symptom and the most frequent reason for seeking medical advice. Optimized care inevitably depends on suitable recording of this symptom and on its automatic processing. This study provides an updated, exhaustive quantitative review of the field of cough sound acquisition, automatic detection in longer audio sequences and automatic classification of the nature of the cough or of the disease. Related studies were analyzed, and metrics were extracted and processed to create a quantitative characterization of the state of the art and of current trends. A list of objective criteria was established to select a subset of the most complete detection studies with a view to deployment in clinical practice. One hundred and forty-four studies were short-listed, and a picture of the state-of-the-art technology is drawn. The trend shows an increasing number of classification studies, an increase in dataset size, in part from crowdsourcing, a rapid increase in COVID-19 studies, the prevalence of smartphones and wearable sensors for acquisition, and a rapid expansion of deep learning. Finally, a subset of 12 detection studies is identified as the most complete. An unequaled quantitative overview is presented. The field shows remarkable dynamism, boosted by research on COVID-19 diagnosis, and a perfect adaptation to mobile health.


Subject(s)
COVID-19 , Crowdsourcing , COVID-19/diagnosis , COVID-19 Testing , Cough/diagnosis , Humans , Sound
3.
Sci Rep ; 10(1): 1468, 2020 01 30.
Article in English | MEDLINE | ID: mdl-32001739

ABSTRACT

The various speech sounds of a language are obtained by varying the shape and position of the articulators surrounding the vocal tract. Analyzing their variations is crucial for understanding speech production, diagnosing speech disorders and planning therapy. Identifying key anatomical landmarks of these structures on medical images is a prerequisite for any quantitative analysis, and the rising amount of data generated in the field calls for an automatic solution. The challenge lies in the high inter- and intra-speaker variability, the mutual interaction between the articulators and the moderate quality of the images. This study addresses this issue for the first time and tackles it by means of Deep Learning. It proposes a dedicated network architecture named Flat-net, whose performance is evaluated and compared with eleven state-of-the-art methods from the literature. The dataset contains midsagittal anatomical Magnetic Resonance Images for 9 speakers sustaining 62 articulations with 21 annotated anatomical landmarks per image. Results show that the Flat-net approach outperforms the former methods, leading to an overall Root Mean Square Error of 3.6 pixels/0.36 cm obtained in a leave-one-out procedure over the speakers. The implementation code is also shared publicly on GitHub.
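The evaluation described above can be sketched as follows. The abstract does not give the exact error definition, so the root mean square of Euclidean landmark distances used here is one common choice and should be read as an assumption; the function names are hypothetical. The pixel size is the one implied by the reported figures (3.6 pixels for 0.36 cm).

```python
import math

# Pixel size implied by the abstract: 3.6 pixels correspond to 0.36 cm.
CM_PER_PIXEL = 0.36 / 3.6


def landmark_rmse(predicted, reference):
    """RMSE of Euclidean distances between paired (x, y) landmarks, in pixels."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need two non-empty landmark lists of equal length")
    squared = [math.dist(p, r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def leave_one_speaker_out(speakers):
    """Yield (training speakers, held-out speaker) pairs, one per speaker."""
    for held_out in speakers:
        yield [s for s in speakers if s != held_out], held_out
```

With 9 speakers, `leave_one_speaker_out` yields 9 folds; a model would be trained on 8 speakers per fold, and `landmark_rmse(...) * CM_PER_PIXEL` would give the error on the held-out speaker in centimetres.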


Subject(s)
Anatomic Landmarks/diagnostic imaging , Magnetic Resonance Imaging , Speech , Anatomic Landmarks/anatomy & histology , Automation , Deep Learning , Epiglottis/anatomy & histology , Epiglottis/diagnostic imaging , Female , Glottis/anatomy & histology , Glottis/diagnostic imaging , Humans , Lip/anatomy & histology , Lip/diagnostic imaging , Male , Mouth/anatomy & histology , Mouth/diagnostic imaging , Nasopharynx/anatomy & histology , Nasopharynx/diagnostic imaging , Nose/anatomy & histology , Nose/diagnostic imaging , Tongue/anatomy & histology , Tongue/diagnostic imaging , Vocal Cords/anatomy & histology , Vocal Cords/diagnostic imaging , Voice
4.
J Acoust Soc Am ; 145(4): 2149, 2019 04.
Article in English | MEDLINE | ID: mdl-31046321

ABSTRACT

Speech communication relies on articulatory and acoustic codes shared between speakers and listeners despite inter-individual differences in morphology and idiosyncratic articulatory strategies. This study addresses the long-standing problem of characterizing and modelling speaker-independent articulatory strategies and inter-speaker articulatory variability. It explores a multi-speaker modelling approach based on two levels: statistically-based linear articulatory models, which capture the speaker-specific articulatory variability on the one hand, are in turn controlled by a speaker model, which captures the inter-speaker variability on the other hand. A low-dimensionality speaker model is obtained by taking advantage of the inter-speaker correlations between morphology and strategy. To validate this approach, contours of the vocal tract articulators were manually segmented on midsagittal MRI data recorded from 11 French speakers uttering 62 vowels and consonants. Using these contours, multi-speaker models with 14 articulatory components and two morphology and strategy components led to overall variance explanations of 66%-69% and root-mean-square errors of 0.36-0.38 cm obtained in a leave-one-out procedure over the speakers. Results suggest that inter-speaker variability is more related to the morphology than to the idiosyncratic strategies and illustrate the adaptation of the articulatory components to the morphology.


Subject(s)
Glottis/diagnostic imaging , Speech Acoustics , Speech Intelligibility , Adult , Biological Variation, Population , Female , Glottis/physiology , Humans , Magnetic Resonance Imaging , Male , Middle Aged , Models, Theoretical , Phonetics , Voice
5.
Comput Methods Biomech Biomed Engin ; 17(7): 768-86, 2014 May.
Article in English | MEDLINE | ID: mdl-22967113

ABSTRACT

In the context of patient-specific 3D bone reconstruction, enhancing the surface with cortical thickness (COT) opens a large field of applications for research and medicine. This functionality calls for database analysis for better knowledge of COT. Our study provides a new approach to reconstruct 3D internal and external cortical surfaces from computed tomography (CT) scans and analyses COT distribution and variability on a set of asymptomatic femurs. The reconstruction method relies on a short (∼5 min) initialisation phase based on 3D reconstruction from biplanar CT-based virtual X-rays and an automatic optimisation phase based on intensity-based cortical structure detection in the CT volume, the COT being the distance between internal and external cortical surfaces. Surfaces and COT show root mean square reconstruction errors below 1 and 1.3 mm, respectively. Descriptions of the COT distributions by anatomical regions are provided and principal component analysis has been applied. The first mode, 16-50% of the variance, corresponds to the variation of the mean COT around its averaged shape; the second mode, 9-28%, corresponds to a fine variation of its shape. A femur COT model can, therefore, be described as the averaged COT distribution in which the first parameter adjusts its mean value and a second parameter adjusts its shape.
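The definition of COT as the distance between the internal and external cortical surfaces can be illustrated with a minimal sketch. It assumes each surface is available as a list of 3D points in millimetres and approximates the surface-to-surface distance by the distance to the nearest internal point; the study's actual computation is not described in the abstract, and the function name is hypothetical.

```python
import math


def cortical_thickness(external_points, internal_points):
    """For each external surface point, distance to the closest internal point.

    Assumption: both surfaces are given as lists of (x, y, z) points in mm, and
    nearest-point distance approximates the surface-to-surface distance.
    """
    if not internal_points:
        raise ValueError("internal surface has no points")
    return [min(math.dist(e, i) for i in internal_points)
            for e in external_points]
```

This brute-force search compares every external point with every internal point, which is acceptable for small point sets; a spatial index would be the usual choice for full-resolution meshes.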


Subject(s)
Femur/diagnostic imaging , Imaging, Three-Dimensional/methods , Tomography, X-Ray Computed/methods , Aged , Aged, 80 and over , Female , Humans , Male , Middle Aged , Principal Component Analysis
6.
Med Eng Phys ; 34(10): 1433-40, 2012 Dec.
Article in English | MEDLINE | ID: mdl-22349135

ABSTRACT

Three-dimensional (3D) reconstruction of the skeleton from biplanar X-rays relies on scarce information digitalised by an operator on both frontal and lateral radiographs. In clinical routine, non-skilled operators have difficulty discriminating the medial from the lateral femur condyle on the lateral view. Our study proposes an algorithm able to automatically detect a possible inversion of the two condyles by the operator at an early stage of the reconstruction process. It relies on the computation of two 3D femur surfaces, one directly from the operator digitalisation and the other from the same digitalisation with medial and lateral condyles automatically swapped. Pairs of virtual biplanar X-rays are computed for both reconstructions and the closest pair to the original X-rays is selected on the basis of similarity measures, pointing to the correct 3D surface. The algorithm shows a success rate higher than 85% for both asymptomatic and pathological femurs whatever the initial condyle digitalisation of the operator, automatically bringing non-skilled operators acting in clinical routine to the level of skilled operators. This study moreover validates the proof-of-concept of automatic shape adjustments of a 3D surface on the basis of similarity measures in the process of 3D reconstruction from biplanar X-rays.
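The selection step described above can be sketched as follows. The abstract does not name the similarity measures, so normalized cross-correlation is used here purely as an assumed stand-in; images are represented as flattened lists of pixel intensities, and all function names and labels are hypothetical.

```python
import math


def normalized_cross_correlation(image_a, image_b):
    """Similarity in [-1, 1] between two equally sized, flattened images."""
    if len(image_a) != len(image_b) or not image_a:
        raise ValueError("images must be non-empty and the same size")
    mean_a = sum(image_a) / len(image_a)
    mean_b = sum(image_b) / len(image_b)
    da = [a - mean_a for a in image_a]
    db = [b - mean_b for b in image_b]
    norm = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if norm == 0.0:
        return 0.0  # a constant image carries no structure to correlate
    return sum(x * y for x, y in zip(da, db)) / norm


def pick_reconstruction(original_pair, candidates):
    """Return the label of the candidate whose virtual X-ray pair is closest.

    original_pair: (frontal, lateral) flattened X-ray images.
    candidates: dict mapping a label to its (frontal, lateral) virtual X-rays.
    """
    def score(pair):
        # Sum the similarity of the frontal and lateral views.
        return sum(normalized_cross_correlation(v, o)
                   for v, o in zip(pair, original_pair))
    return max(candidates, key=lambda name: score(candidates[name]))
```

In use, `candidates` would hold two entries, one for the reconstruction from the operator's digitalisation and one for the reconstruction with the condyles swapped, and the returned label indicates which 3D surface to keep.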


Subject(s)
Femur/diagnostic imaging , Imaging, Three-Dimensional/methods , Radiography/methods , Adult , Algorithms , Biomechanical Phenomena , Female , Humans , Male , Middle Aged , Young Adult
7.
Philos Trans R Soc Lond B Biol Sci ; 367(1585): 88-102, 2012 Jan 12.
Article in English | MEDLINE | ID: mdl-22106429

ABSTRACT

Scientists seek to use fossil and archaeological evidence to constrain models of the coevolution of human language and tool use. We focus on Neanderthals, for whom indirect evidence from tool use and ancient DNA appears consistent with an adaptation to complex vocal-auditory communication. We summarize existing arguments that the articulatory apparatus for speech had not yet come under intense positive selection pressure in Neanderthals, and we outline some recent evidence and analyses that challenge such arguments. We then provide new anatomical results from our own attempt to reconstruct vocal tract (VT) morphology in Neanderthals, and document our simulations of the acoustic and articulatory potential of this reconstructed Neanderthal VT. Our purpose in this paper is not to polarize debate about whether or not Neanderthals were human-like in all relevant respects, but to contribute to the development of methods that can be used to make further incremental advances in our understanding of the evolution of speech based on fossil and archaeological evidence.


Subject(s)
Fossils , Larynx/anatomy & histology , Neanderthals/physiology , Speech Acoustics , Animals , Biological Evolution , Computer Simulation , Humans , Hyoid Bone/diagnostic imaging , Language , Larynx/physiology , Neanderthals/anatomy & histology , Regression Analysis , Selection, Genetic , Skull/diagnostic imaging , Speech , Tomography, X-Ray Computed
8.
J Acoust Soc Am ; 123(4): 2335-55, 2008 Apr.
Article in English | MEDLINE | ID: mdl-18397037

ABSTRACT

An original three-dimensional (3D) linear articulatory model of the velum and nasopharyngeal wall has been developed from magnetic resonance imaging (MRI) and computed tomography images of a French subject sustaining a set of 46 articulations, covering his articulatory repertoire. The velum and nasopharyngeal wall are represented by generic surface triangular meshes fitted to the 3D contours extracted from MRI for each articulation. Two degrees of freedom were uncovered by principal component analysis: first, VL accounts for 83% of the velum variance, corresponding to an oblique vertical movement seemingly related to the levator veli palatini muscle; second, VS explains another 6% of the velum variance, controlling a mostly horizontal movement possibly related to the sphincter action of the superior pharyngeal constrictor. The nasopharyngeal wall is also controlled by VL for 47% of its variance. Electromagnetic articulographic data recorded on the velum fitted these parameters exactly, and may serve to recover dynamic velum 3D shapes. The main oral and nasopharyngeal area functions controlled by the articulatory model, complemented by the area functions derived from the complex geometry of each nasal passage extracted from coronal MRIs, were fed to an acoustic model and gave promising results about the influence of velum movements on the spectral characteristics of nasals.


Subject(s)
Magnetic Resonance Imaging , Nasopharynx/anatomy & histology , Nasopharynx/diagnostic imaging , Palate, Soft/anatomy & histology , Palate, Soft/diagnostic imaging , Speech/physiology , Tomography, X-Ray Computed , Humans , Imaging, Three-Dimensional , Speech Production Measurement