Results 1 - 10 of 10
1.
Cogn Sci ; 48(4): e13439, 2024 04.
Article in English | MEDLINE | ID: mdl-38605452

ABSTRACT

Languages show substantial variability between their speakers, but it is currently unclear how the structure of the communicative network contributes to the patterning of this variability. While previous studies have highlighted the role of network structure in language change, the specific aspects of network structure that shape language variability remain largely unknown. To address this gap, we developed a Bayesian agent-based model of language evolution, contrasting between two distinct scenarios: language change and language emergence. By isolating the relative effects of specific global network metrics across thousands of simulations, we show that global characteristics of network structure play a critical role in shaping interindividual variation in language, while intraindividual variation is relatively unaffected. We effectively challenge the long-held belief that size and density are the main network structural factors influencing language variation, and show that path length and clustering coefficient are the main factors driving interindividual variation. In particular, we show that variation is more likely to occur in populations where individuals are not well-connected to each other. Additionally, variation is more likely to emerge in populations that are structured in small communities. Our study provides potentially important insights into the theoretical mechanisms underlying language variation.


Subject(s)
Communication , Language , Humans , Bayes Theorem
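
The two network metrics this abstract singles out, path length and clustering coefficient, can be computed on a toy graph as a rough illustration. This is a minimal sketch using only the standard library; the function names and the two example networks are hypothetical and are not taken from the study's model.

```python
from collections import deque


def average_path_length(adj):
    """Mean shortest-path length over ordered pairs of distinct, reachable nodes."""
    total, pairs = 0, 0
    for source in adj:
        # Breadth-first search gives shortest hop counts from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adj[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def average_clustering(adj):
    """Mean local clustering coefficient; nodes with < 2 neighbours count as 0."""
    coefficients = []
    for node, neighbours in adj.items():
        nbs = list(neighbours)
        k = len(nbs)
        if k < 2:
            coefficients.append(0.0)
            continue
        # Count edges that directly connect two neighbours of `node`.
        links = sum(
            1
            for i in range(k)
            for j in range(i + 1, k)
            if nbs[j] in adj[nbs[i]]
        )
        coefficients.append(2 * links / (k * (k - 1)))
    return sum(coefficients) / len(coefficients)


# Hypothetical toy networks (undirected, stored as adjacency sets).
ring = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
clique = {i: {j for j in range(4) if j != i} for i in range(4)}

print(average_path_length(ring), average_clustering(ring))
print(average_path_length(clique), average_clustering(clique))
```

In the ring, agents are only loosely connected (longer paths, no clustering), whereas the clique is the fully connected extreme.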
2.
J Acoust Soc Am ; 153(4): 2285, 2023 04 01.
Article in English | MEDLINE | ID: mdl-37092935

ABSTRACT

Acoustic variation is central to the study of speaker characterization. In this respect, specific phonemic classes such as vowels have been particularly studied, compared to fricatives. Fricatives exhibit important aperiodic energy, which can extend over a high-frequency range beyond that conventionally considered in phonetic analyses, often limited up to 12 kHz. We adopt here an extended frequency range up to 20.05 kHz to study a corpus of 15 812 fricatives produced by 59 speakers in Russian, a language offering a rich inventory of fricatives. We extracted two sets of parameters: the first is composed of 11 parameters derived from the frequency spectrum and duration (acoustic set) while the second is composed of 13 mel frequency cepstral coefficients (MFCCs). As a first step, we implemented machine learning methods to evaluate the potential of each set to predict gender and speaker identity. We show that gender can be predicted with a good performance by the acoustic set and even more so by MFCCs (accuracy of 0.72 and 0.88, respectively). MFCCs also predict individuals to some extent (accuracy = 0.64) unlike the acoustic set. In a second step, we provide a detailed analysis of the observed intra- and inter-speaker acoustic variation.


Subject(s)
Phonetics , Speech Acoustics , Humans , Acoustics , Language , Russia
3.
PLoS Comput Biol ; 19(4): e1010325, 2023 04.
Article in English | MEDLINE | ID: mdl-37053268

ABSTRACT

Despite the accumulation of data and studies, deciphering animal vocal communication remains challenging. In most cases, researchers must deal with the sparse recordings composing Small, Unbalanced, Noisy, but Genuine (SUNG) datasets. SUNG datasets are characterized by a limited number of recordings, most often noisy, and unbalanced in number between the individuals or categories of vocalizations. SUNG datasets therefore offer a valuable but inevitably distorted vision of communication systems. Adopting the best practices in their analysis is essential to effectively extract the available information and draw reliable conclusions. Here we show that the most recent advances in machine learning applied to a SUNG dataset succeed in unraveling the complex vocal repertoire of the bonobo, and we propose a workflow that can be effective with other animal species. We implement acoustic parameterization in three feature spaces and run a Supervised Uniform Manifold Approximation and Projection (S-UMAP) to evaluate how call types and individual signatures cluster in the bonobo acoustic space. We then implement three classification algorithms (Support Vector Machine, xgboost, neural networks) and their combination to explore the structure and variability of bonobo calls, as well as the robustness of the individual signature they encode. We underscore how classification performance is affected by the feature set and identify the most informative features. In addition, we highlight the need to address data leakage in the evaluation of classification performance to avoid misleading interpretations. Our results lead to identifying several practical approaches that are generalizable to any other animal communication system. 
To improve the reliability and replicability of vocal communication studies with SUNG datasets, we thus recommend: i) comparing several acoustic parameterizations; ii) visualizing the dataset with supervised UMAP to examine the species acoustic space; iii) adopting Support Vector Machines as the baseline classification approach; iv) explicitly evaluating data leakage and possibly implementing a mitigation strategy.


Subject(s)
Algorithms , Pan paniscus , Animals , Workflow , Reproducibility of Results , Neural Networks, Computer
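
Recommendation iv) above concerns data leakage. One common way to limit it is to keep related recordings on the same side of the train/test split. The sketch below assumes that leakage arises from calls sharing a recording session; that grouping key, the `group_split` helper, and the toy records are all hypothetical, and the abstract does not say which mitigation the authors used.

```python
import random
from collections import defaultdict


def group_split(samples, group_of, test_fraction=0.25, seed=0):
    """Split samples so that no group appears in both train and test."""
    groups = defaultdict(list)
    for sample in samples:
        groups[group_of(sample)].append(sample)
    keys = sorted(groups)
    random.Random(seed).shuffle(keys)
    # Hold out whole groups, at least one.
    n_test = max(1, round(len(keys) * test_fraction))
    test_keys, train_keys = keys[:n_test], keys[n_test:]
    train = [s for k in train_keys for s in groups[k]]
    test = [s for k in test_keys for s in groups[k]]
    return train, test


# Hypothetical records: (call_id, recording_session).
calls = [(i, f"session_{i // 3}") for i in range(12)]
train, test = group_split(calls, group_of=lambda call: call[1])

# Sessions present on both sides; empty when the split is group-aware.
shared = {c[1] for c in train} & {c[1] for c in test}
print(len(train), len(test), shared)
```

A plain random split over individual calls would typically place calls from the same session on both sides, which can inflate the measured accuracy.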
4.
Cognition ; 232: 105345, 2023 03.
Article in English | MEDLINE | ID: mdl-36462227

ABSTRACT

Humans are experts at processing speech, but how this feat is accomplished remains a major question in cognitive neuroscience. Capitalizing on the concept of channel capacity, we developed a unified measurement framework to investigate the respective influence of seven acoustic and linguistic features on speech comprehension, encompassing acoustic, sub-lexical, lexical and supra-lexical levels of description. We show that comprehension is independently impacted by all these features, but to varying degrees and with a clear dominance of the syllabic rate. Comparing comprehension of French words and sentences further reveals that when supra-lexical contextual information is present, the impact of all other features is dramatically reduced. Finally, we estimated the channel capacity associated with each linguistic feature and compared them with their generic distribution in natural speech. Our data reveal that while acoustic modulation, syllabic and phonemic rates unfold respectively at 5, 5, and 12 Hz in natural speech, they are associated with independent processing bottlenecks whose channel capacities are 15, 15 and 35 Hz, respectively, as suggested by neurophysiological theories. They moreover point towards supra-lexical contextual information as the feature limiting the flow of natural speech. Overall, this study reveals how multilevel linguistic features constrain speech comprehension.


Subject(s)
Speech Perception , Speech , Humans , Speech/physiology , Comprehension/physiology , Speech Perception/physiology , Linguistics , Language
5.
J Acoust Soc Am ; 150(3): 1806, 2021 09.
Article in English | MEDLINE | ID: mdl-34598630

ABSTRACT

This paper shows that machine learning techniques are very successful at classifying the Russian voiceless non-palatalized fricatives [f], [s], and [ʃ] using a small set of acoustic cues. From a data sample of 6320 tokens of read sentences produced by 40 participants, temporal and spectral measurements are extracted from the full sound, the noise duration, and the middle 30 ms windows. Furthermore, 13 mel-frequency cepstral coefficients (MFCCs) are computed from the middle 30 ms window. Classifiers based on single decision trees, random forests, support vector machines, and neural networks are trained and tested to distinguish between these three fricatives. The results demonstrate that, first, the three acoustic cue extraction techniques are similar in terms of classification accuracy (93% and 99%) but that the spectral measurements extracted from the full frication noise duration result in slightly better accuracy. Second, the center of gravity and the spectral spread are sufficient for the classification of [f], [s], and [ʃ] irrespective of contextual and speaker variation. Third, MFCCs show a marginally higher predictive power over spectral cues (<2%). This suggests that both sets of measures provide sufficient information for the classification of these fricatives and their choice depends on the particular research question or application.


Subject(s)
Cues , Speech Acoustics , Acoustics , Humans , Russia , Support Vector Machine
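
The two cues this abstract reports as sufficient, the center of gravity and the spectral spread, are the first two spectral moments. As a minimal sketch, they can be computed from a spectrum as below. Weighting by power is an assumption (some tools weight by amplitude instead), and the three-bin spectrum is made up for illustration.

```python
import math


def spectral_moments(freqs_hz, power):
    """Return (center of gravity, spread) of a spectrum, weighted by power."""
    total = sum(power)
    # First moment: power-weighted mean frequency.
    cog = sum(f * p for f, p in zip(freqs_hz, power)) / total
    # Second central moment: power-weighted variance around the mean.
    variance = sum((f - cog) ** 2 * p for f, p in zip(freqs_hz, power)) / total
    return cog, math.sqrt(variance)


# Hypothetical three-bin spectrum, symmetric around 2000 Hz.
freqs = [1000.0, 2000.0, 3000.0]
power = [1.0, 2.0, 1.0]
cog, spread = spectral_moments(freqs, power)
print(cog, spread)
```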
6.
Front Psychol ; 12: 626118, 2021.
Article in English | MEDLINE | ID: mdl-34234707

ABSTRACT

Treating the speech communities as homogeneous entities is not an accurate representation of reality, as it misses some of the complexities of linguistic interactions. Inter-individual variation and multiple types of biases are ubiquitous in speech communities, regardless of their size. This variation is often neglected due to the assumption that "majority rules," and that the emerging language of the community will override any such biases by forcing the individuals to overcome their own biases, or risk having their use of language being treated as "idiosyncratic" or outright "pathological." In this paper, we use computer simulations of Bayesian linguistic agents embedded in communicative networks to investigate how biased individuals, representing a minority of the population, interact with the unbiased majority, how a shared language emerges, and the dynamics of these biases across time. We tested different network sizes (from very small to very large) and types (random, scale-free, and small-world), along with different strengths and types of bias (modeled through the Bayesian prior distribution of the agents and the mechanism used for generating utterances: either sampling from the posterior distribution ["sampler"] or picking the value with the maximum probability ["MAP"]). The results show that, while the biased agents, even when being in the minority, do adapt their language by going against their a priori preferences, they are far from being swamped by the majority, and instead the emergent shared language of the whole community is influenced by their bias.
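
The contrast between "sampler" and "MAP" agents can be illustrated with a Beta-Bernoulli learner choosing between two linguistic variants. This is only a sketch under stated assumptions: the `BetaAgent` class, the binary variant, the prior values, and the two-agent exchange are hypothetical and do not reproduce the paper's model or network setup.

```python
import random


class BetaAgent:
    """Agent with a Beta(alpha, beta) belief about how often variant 1 is used."""

    def __init__(self, alpha, beta, strategy, rng):
        if strategy not in ("sampler", "map"):
            raise ValueError("strategy must be 'sampler' or 'map'")
        self.alpha, self.beta = alpha, beta
        self.strategy = strategy
        self.rng = rng

    def observe(self, variant):
        """Bayesian update after hearing one utterance (variant 0 or 1)."""
        if variant == 1:
            self.alpha += 1
        else:
            self.beta += 1

    def belief(self):
        """Probability of variant 1 that the agent uses for production."""
        if self.strategy == "sampler":
            # Draw a value from the current Beta distribution.
            return self.rng.betavariate(self.alpha, self.beta)
        # MAP: mode of the Beta distribution, defined for alpha, beta > 1.
        return (self.alpha - 1) / (self.alpha + self.beta - 2)

    def speak(self):
        return 1 if self.rng.random() < self.belief() else 0


rng = random.Random(1)
biased = BetaAgent(8.0, 2.0, "map", rng)        # prior favours variant 1
unbiased = BetaAgent(2.0, 2.0, "sampler", rng)  # symmetric prior

# The two agents take turns speaking and listening.
for _ in range(200):
    unbiased.observe(biased.speak())
    biased.observe(unbiased.speak())

print(biased.alpha, biased.beta, unbiased.alpha, unbiased.beta)
```

The prior strength (here the pseudo-counts 8 and 2) plays the role of the bias. A stronger prior needs more contrary evidence before the agent's productions shift.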

7.
Sci Rep ; 10(1): 15431, 2020 09 22.
Article in English | MEDLINE | ID: mdl-32963261

ABSTRACT

Body postures are essential in animal behavioural repertoires and their communicative role has been assessed in a wide array of taxa and contexts. Some body postures function as amplifiers, a class of signals that increase the detection likelihood of other signals. While foraging on the ground, bonobos (Pan paniscus) can adopt different crouching postures exposing more or less of their genital area. To our knowledge, their potential functional role in the sociosexual life of bonobos has not been assessed yet. Here we show, by analysing more than 2,400 foraging events in 21 captive bonobos, that mature females adopt a rear-exposing posture (forelimb-crouch) and do so significantly more often when their anogenital region is swollen than during the non-swollen phase. In contrast, mature males almost completely avoid this posture. Moreover, this strong difference results from a diverging ontogeny between males and females since immature males and females adopt the forelimb-crouch at similar frequencies. Our findings suggest that the forelimb-crouch posture may play a communicative role of amplification by enhancing the visibility of female sexual swellings, a conspicuous signal that is very attractive for both males and females. Given the high social relevance of this sexual signal, our study emphasizes that postural signalling in primates probably deserves more attention, even outside of reproductive contexts.


Subject(s)
Behavior, Animal/physiology , Pan paniscus/physiology , Pan troglodytes/physiology , Posture/physiology , Animals , Communication , Female , Male , Social Behavior
8.
Sci Adv ; 5(9): eaaw2594, 2019 09.
Article in English | MEDLINE | ID: mdl-32047854

ABSTRACT

Language is universal, but it has few indisputably universal characteristics, with cross-linguistic variation being the norm. For example, languages differ greatly in the number of syllables they allow, resulting in large variation in the Shannon information per syllable. Nevertheless, all natural languages allow their speakers to efficiently encode and transmit information. We show here, using quantitative methods on a large cross-linguistic corpus of 17 languages, that the coupling between language-level (information per syllable) and speaker-level (speech rate) properties results in languages encoding similar information rates (~39 bits/s) despite wide differences in each property individually: Languages are more similar in information rates than in Shannon information or speech rate. These findings highlight the intimate feedback loops between languages' structural properties and their speakers' neurocognition and biology under communicative pressures. Thus, language is the product of a multiscale communicative niche construction process at the intersection of biology, environment, and culture.


Subject(s)
Communication , Language , Heterogeneous-Nuclear Ribonucleoproteins , Humans , Linguistics , Speech
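
The coupling described in this abstract can be read as a simple product: information rate (bits/s) is information per syllable (bits) times speech rate (syllables/s). The sketch below assumes that reading. The two "languages" and their values are invented for illustration and are not data from the study; they only show how different densities and rates can yield a similar information rate.

```python
def information_rate(bits_per_syllable, syllables_per_second):
    """Information rate in bits per second, assuming a simple product."""
    return bits_per_syllable * syllables_per_second


# Hypothetical values, not taken from the paper's corpus.
dense_slow = information_rate(bits_per_syllable=6.5, syllables_per_second=6.0)
light_fast = information_rate(bits_per_syllable=5.0, syllables_per_second=7.8)
print(dense_slow, light_fast)
```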
9.
Front Psychol ; 5: 182, 2014.
Article in English | MEDLINE | ID: mdl-24723896

ABSTRACT

Writing words in real life involves setting objectives, imagining a recipient, translating ideas into linguistic forms, managing grapho-motor gestures, etc. Understanding writing requires observation of the processes as they occur in real time. Analysis of pauses is one of the preferred methods for accessing the dynamics of writing and is based on the idea that pauses are behavioral correlates of cognitive processes. However, there is a need to clarify what we are observing when studying pause phenomena, as we will argue in the first section. This taken into account, the study of pause phenomena can be considered following two approaches. A first approach, driven by temporality, would define a threshold and observe where pauses, i.e., scriptural inactivity, occur. A second approach, linguistically driven, would define structural units and look for scriptural inactivity at the boundaries of these units or within these units. Taking a temporally driven approach, we present two methods which aim at the automatic identification of scriptural inactivity which is most likely not attributable to grapho-motor management in texts written by children and adolescents using digitizing tablets in association with Eye and Pen (©) (Chesnet and Alamargot, 2005). The first method is purely statistical and is based on the idea that the distribution of pauses exhibits different Gaussian components, each of them corresponding to a different type of pause. After having reviewed the limits of this statistical method, we present a second method based on writing dynamics which attempts to identify breaking points in the writing dynamics rather than relying only on pause duration. This second method needs to be refined to overcome the fact that calculation is impossible when there is insufficient data, which is often the case when working with young scriptors.
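
The temporally driven approach mentioned in this abstract (define a threshold, then locate scriptural inactivity) can be sketched as follows. The `find_pauses` helper, the 1-second threshold, and the timestamps are hypothetical. The sketch does not implement either of the paper's two methods, which go beyond a fixed threshold.

```python
def find_pauses(event_times_s, threshold_s):
    """Return (start, end, duration) for each gap between pen events >= threshold."""
    pauses = []
    for previous, current in zip(event_times_s, event_times_s[1:]):
        gap = current - previous
        if gap >= threshold_s:
            pauses.append((previous, current, gap))
    return pauses


# Hypothetical pen-event timestamps in seconds.
times = [0.0, 0.1, 0.2, 1.5, 1.6, 4.0]
pauses = find_pauses(times, threshold_s=1.0)
for start, end, duration in pauses:
    print(f"pause from {start:.1f}s to {end:.1f}s ({duration:.1f}s)")
```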

10.
J Speech Lang Hear Res ; 52(4): 827-38, 2009 Aug.
Article in English | MEDLINE | ID: mdl-18971288

ABSTRACT

PURPOSE: This study investigates the ability to understand degraded speech signals and explores the correlation between this capacity and the functional characteristics of the peripheral auditory system. METHOD: The authors evaluated the capability of 50 normal-hearing native French speakers to restore time-reversed speech. The task required them to transcribe two-syllable items containing temporal reversions of variable sizes, ranging from no reversion to complete reversion, increasing by half-syllable steps. In parallel, the functionality of each participant's auditory efferent system was evaluated using contralateral suppression of click-evoked otoacoustic emissions. RESULTS: Perceptual accuracy for time-reversed speech diminished when the size of the applied temporal distortion increased. A lexical benefit was evident, and an important interindividual variability in performance was observed. Functional exploration of the auditory system revealed that speech restoration performances correlated with the suppression strength of the participant's auditory efferent system. CONCLUSIONS: These results suggest a clear relation between the functional asymmetry of the auditory efferent pathway (the right-side activity is greater than the left-side activity in right-handed participants) and the comprehension of acoustically distorted speech in normal-hearing participants. Further experiments are needed to better specify how the functionality of the medial olivocochlear bundle can cause phonological activation to be more efficient.


Subject(s)
Comprehension/physiology , Speech Perception/physiology , Speech , Acoustic Stimulation , Adolescent , Adult , Analysis of Variance , Female , Functional Laterality , Humans , Male , Otoacoustic Emissions, Spontaneous , Phonetics , Psychoacoustics , Time Factors , Vocabulary , Young Adult