Search | BVS Regional Portal (test)

Comparative Analysis of Majority Language Influence on North Sámi Prosody Using WaveNet-Based Modeling.

Hiovain, Katri; Suni, Antti; Kakouros, Sofoklis; Simko, Juraj.

Lang Speech ; 65(4): 859-888, 2022 Dec.

Article in English | MEDLINE | ID: mdl-33375882

ABSTRACT

Finnmark North Sámi is a variety of the North Sámi language, an indigenous, endangered minority language spoken in the northernmost parts of Norway and Finland. The speakers of this language are bilingual, and regularly speak the majority language (Finnish or Norwegian) as well as their own North Sámi variety. In this paper we investigate possible influences of these majority languages on prosodic characteristics of Finnmark North Sámi, and associate them with prosodic patterns prevalent in the majority languages. We present a novel methodology that: (a) automatically finds the portions of speech (words) where the prosodic differences based on majority languages are most robustly manifested; and (b) analyzes the nature of these differences in terms of intonational patterns. For the first step, we trained convolutional WaveNet speech synthesis models on North Sámi speech material, modified to contain purely prosodic information, and used conditioning embeddings to find words with the greatest differences between the varieties. The subsequent exploratory analysis suggests that the differences in intonational patterns between the two Finnmark North Sámi varieties are not manifested uniformly across word types (based on part-of-speech category). Instead, we argue that the differences reflect phrase-level prosodic characteristics of the majority languages.

Subjects

Language, Speech Perception, Humans, Speech, Norway

Cross-linguistic Influences on Sentence Accent Detection in Background Noise.

Scharenborg, Odette; Kakouros, Sofoklis; Post, Brechtje; Meunier, Fanny.

Lang Speech ; 63(1): 3-30, 2020 Mar.

Article in English | MEDLINE | ID: mdl-30606083

ABSTRACT

This paper investigates whether sentence accent detection in a non-native language is dependent on (relative) similarity between prosodic cues to accent between the non-native and the native language, and whether cross-linguistic differences in the use of local and more widely distributed (i.e., non-local) cues to sentence accent detection lead to differential effects of the presence of background noise on sentence accent detection in a non-native language. We compared Dutch, Finnish, and French non-native listeners of English, whose cueing and use of prosodic prominence is gradually further removed from English, and compared their results on a phoneme monitoring task in different levels of noise and a quiet condition to those of native listeners. Overall phoneme detection performance was high for the native and the non-native listeners, but deteriorated to the same extent in the presence of background noise. Crucially, relative similarity between the prosodic cues to sentence accent of one's native language compared to that of a non-native language does not determine the ability to perceive and use sentence accent for speech perception in that non-native language. Moreover, proficiency in the non-native language is not a straightforward predictor of sentence accent perception performance, although high proficiency in a non-native language can seemingly overcome certain differences at the prosodic level between the native and non-native language. Instead, performance is determined by the extent to which listeners rely on local cues (English and Dutch) versus cues that are more distributed (Finnish and French), as more distributed cues survive the presence of background noise better.

Subjects

Multilingualism, Perceptual Masking, Phonetics, Recognition (Psychology), Speech Perception, Acoustic Stimulation, Adult, Cues, Female, Humans, Language, Male, Noise, Speech Acoustics, Speech Intelligibility, Young Adult

Is infant-directed speech interesting because it is surprising? - Linking properties of IDS to statistical learning and attention at the prosodic level.

Räsänen, Okko; Kakouros, Sofoklis; Soderstrom, Melanie.

Cognition ; 178: 193-206, 2018 Sep.

Article in English | MEDLINE | ID: mdl-29885600

ABSTRACT

The exaggerated intonation and special rhythmic properties of infant-directed speech (IDS) have been hypothesized to attract infants' attention to the speech stream. However, there has been little work actually connecting the properties of IDS to models of attentional processing or perceptual learning. A number of such attention models suggest that surprising or novel perceptual inputs attract attention, where novelty can be operationalized as the statistical (un)predictability of the stimulus in the given context. Since prosodic patterns such as F0 contours are accessible to young infants who are also known to be adept statistical learners, the present paper investigates a hypothesis that F0 contours in IDS are less predictable than those in adult-directed speech (ADS), given previous exposure to both speaking styles, thereby potentially tapping into basic attentional mechanisms of the listeners in a manner similar to how relative probabilities of other linguistic patterns are known to modulate attentional processing in infants and adults. Computational modeling analyses with naturalistic IDS and ADS speech from matched speakers and contexts show that IDS intonation has lower overall temporal predictability even when the F0 contours of both speaking styles are normalized to have equal means and variances. A closer analysis reveals that there is a tendency of IDS intonation to be less predictable at the end of short utterances, whereas ADS exhibits more stable average predictability patterns across the full extent of the utterances. The difference between IDS and ADS persists even when the proportion of IDS and ADS exposure is varied substantially, simulating different relative amounts of IDS heard in different family and cultural environments. Exposure to IDS is also found to be more efficient for predicting ADS intonation contours in new utterances than exposure to an equal amount of ADS speech. This indicates that the more variable prosodic contours of IDS also generalize to ADS, and may therefore enhance prosodic learning in infancy. Overall, the study suggests that one reason behind infant preference for IDS could be its higher information value at the prosodic level, as measured by the amount of surprisal in the F0 contours. This provides the first formal link between the properties of IDS and the models of attentional processing and statistical learning in the brain. However, this finding does not rule out the possibility that other differences between IDS and ADS also play a role.
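The abstract notes that the F0 contours of both speaking styles were normalized to equal means and variances before predictability was compared. The following is a minimal sketch of one common way to do that (z-score normalization), not the authors' implementation. The function name `z_normalize`, the toy F0 values, and the choice to normalize each style's pooled values are all assumptions made for illustration.

```python
from statistics import fmean, pstdev

def z_normalize(f0_values):
    """Rescale a list of F0 values (Hz) to zero mean and unit variance."""
    mu = fmean(f0_values)
    sigma = pstdev(f0_values)  # population standard deviation
    if sigma == 0:
        raise ValueError("a constant contour cannot be variance-normalized")
    return [(v - mu) / sigma for v in f0_values]

# Hypothetical toy contours: IDS with a wide pitch range, ADS with a narrow one.
ids_f0 = [180.0, 320.0, 250.0, 410.0]
ads_f0 = [110.0, 130.0, 120.0, 140.0]

ids_norm = z_normalize(ids_f0)
ads_norm = z_normalize(ads_f0)

# After normalization both styles share the same mean (0) and variance (1),
# so any remaining difference in predictability is not due to pitch range alone.
print(ids_norm)
print(ads_norm)
```

With both styles on a common scale, a predictability measure applied to the two sets of contours compares their shapes rather than their overall pitch level or range.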

Subjects

Attention, Learning, Speech Perception, Speech, Acoustic Stimulation, Humans, Infant, Language Development, Phonetics, Speech Acoustics

Making predictable unpredictable with style - Behavioral and electrophysiological evidence for the critical role of prosodic expectations in the perception of prominence in speech.

Kakouros, Sofoklis; Salminen, Nelli; Räsänen, Okko.

Neuropsychologia ; 109: 181-199, 2018 Jan 31.

Article in English | MEDLINE | ID: mdl-29247667

ABSTRACT

Perceptual prominence of linguistic units such as words has previously been connected to the concepts of predictability and attentional orientation. One hypothesis is that low-probability prosodic or lexical content is perceived as prominent due to the surprisal and high information value associated with the stimulus. However, the existing behavioral studies have used stimulus manipulations that follow or violate typical linguistic patterns present in the listeners' native language, i.e., assuming that the listeners have already established a model for acceptable prosodic patterns in the language. In the present study, we investigated whether prosodic expectations and the resulting subjective impression of prominence are affected by brief statistical adaptation to suprasegmental acoustic features in speech, also in the case where the prosodic patterns do not necessarily follow language-typical marking for prominence. We first exposed listeners to five minutes of speech with uneven distributions of falling and rising fundamental frequency (F0) trajectories on sentence-final words, and then tested their judgments of prominence on a set of new utterances. The results show that the probability of the F0 trajectory affects the perception of prominence, a less frequent F0 trajectory making a word more prominent independently of the absolute direction of F0 change. In the second part of the study, we conducted EEG measurements on a set of new subjects listening to similar utterances with predominantly rising or falling F0 on sentence-final words. Analysis of the resulting event-related potentials (ERP) reveals a significant difference in N200 and N400 ERP-component amplitudes between standard and deviant prosody, again independently of the F0 direction and the underlying lexical content. Since N400 has earlier been associated with semantic processing of stimuli, this suggests that listeners implicitly track probabilities at the suprasegmental level and that predictability of a prosodic pattern during a word has an impact on the semantic processing of the word. Overall, the study suggests that prosodic markers for prominence are at least partially driven by the statistical structure of recently perceived speech, and therefore prominence perception could be based on statistical learning mechanisms similar to those observed in early word learning, but in this case operating at the level of suprasegmental acoustic features.

Subjects

Anticipation (Psychological)/physiology, Brain/physiology, Psycholinguistics, Semantics, Speech Acoustics, Speech Perception/physiology, Adult, Electroencephalography, Evoked Potentials, Female, Humans, Judgment/physiology, Male, Psychological Models, Probability Learning

Perception of Sentence Stress in Speech Correlates With the Temporal Unpredictability of Prosodic Features.

Kakouros, Sofoklis; Räsänen, Okko.

Cogn Sci ; 40(7): 1739-1774, 2016 Sep.

Article in English | MEDLINE | ID: mdl-26481111

ABSTRACT

Numerous studies have examined the acoustic correlates of sentential stress and its underlying linguistic functionality. However, the mechanism that connects stress cues to the listener's attentional processing has remained unclear. Also, the learnability versus innateness of stress perception has not been widely discussed. In this work, we introduce a novel perspective to the study of sentential stress and put forward the hypothesis that perceived sentence stress in speech is related to the unpredictability of prosodic features, thereby capturing the attention of the listener. As predictability is based on the statistical structure of the speech input, the hypothesis also suggests that stress perception is a result of general statistical learning mechanisms. To study this idea, computational simulations are performed where temporal prosodic trajectories are modeled with an n-gram model. Probabilities of the feature trajectories are subsequently evaluated on a set of novel utterances and compared to human perception of stress. The results show that the low-probability regions of F0 and energy trajectories are strongly correlated with stress perception, giving support to the idea that attention and unpredictability of sensory stimulus are mutually connected.
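The abstract describes modeling temporal prosodic trajectories with an n-gram model and scoring new utterances by probability. As a rough illustration only, the sketch below quantizes a feature track into discrete bins, counts n-grams, and reports per-frame surprisal (negative log probability). The function names, bin settings, add-alpha smoothing, and toy sequences are assumptions and are not taken from the paper.

```python
import math
from collections import Counter

def quantize(values, n_bins, lo, hi):
    """Map continuous feature values (e.g. F0 in Hz) to integer bin indices."""
    width = (hi - lo) / n_bins
    return [min(n_bins - 1, max(0, int((v - lo) / width))) for v in values]

def train_ngram(sequences, n):
    """Count n-grams and their (n-1)-symbol contexts over the training sequences."""
    ngrams, contexts = Counter(), Counter()
    for seq in sequences:
        for i in range(n - 1, len(seq)):
            ctx = tuple(seq[i - n + 1:i])
            ngrams[ctx + (seq[i],)] += 1
            contexts[ctx] += 1
    return ngrams, contexts

def surprisal(seq, ngrams, contexts, n, n_bins, alpha=1.0):
    """Per-frame surprisal in bits with add-alpha smoothing.

    The first n-1 frames have no full context and are skipped.
    """
    scores = []
    for i in range(n - 1, len(seq)):
        ctx = tuple(seq[i - n + 1:i])
        p = (ngrams[ctx + (seq[i],)] + alpha) / (contexts[ctx] + alpha * n_bins)
        scores.append(-math.log2(p))
    return scores

# Toy example: train a bigram model on a repeating rising pattern of bin indices.
train = [[0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]]
ngrams, contexts = train_ngram(train, n=2)

# The final step (2 -> 0) never occurs in training, so it is the least predictable.
test_seq = [0, 1, 2, 0]
scores = surprisal(test_seq, ngrams, contexts, n=2, n_bins=4)
print(scores)
```

In this toy setup the frame with the unseen transition receives the highest surprisal, which mirrors the abstract's idea that low-probability regions of a prosodic trajectory are candidates for perceived stress.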

Subjects

Attention/physiology, Learning/physiology, Speech Perception/physiology, Adult, Female, Humans, Language, Male, Middle Aged, Theoretical Models, Young Adult
