Search | VHL Regional Portal (test)

The case for aural perceptual speaker identification.

Hollien, Harry; Didla, Grace; Harnsberger, James D; Hollien, Keith A.

Forensic Sci Int ; 269: 8-20, 2016 Dec.

Article in English | MEDLINE | ID: mdl-27855301

ABSTRACT

Once forensic speaker identification (SI) was recognized as an entity, it was predicted that valid computer-based identification systems would quickly become a reality. This has not happened and the review to follow will provide some of the reasons why. Notable among them are (1) the sharp underestimation of its complexity and (2) its confounding with speaker verification (SV). Consideration of these (and related) issues will be followed by a brief history of how the need for SI developed and some of the responses to the problem. Since much of the SI development preceded the structuring of appropriate standards, the recommended stop-gap response described here is based on somewhat uncoordinated, but extensive, research. The product of that effort will be reviewed and organized into a platform which supports SI procedures consistent with the forensic model. Also discussed are the standards which have been established, their impact on SI development and its present limitations. How the cited approach interacts both with progress in verification and the developing SI machine-based identification systems also will be considered. Finally, a few suggestions will be made that should assist in upgrading the effectiveness of aural perceptual speaker identification (AP SI).

Subjects

Forensic Sciences/methods, Phonetics, Speech Acoustics, Speech Intelligibility, Speech Perception, Acoustics, Expert Testimony, Humans, Voice Quality

Evaluating language environment analysis system performance for Chinese: a pilot study in Shanghai.

Gilkerson, Jill; Zhang, Yiwen; Xu, Dongxin; Richards, Jeffrey A; Xu, Xiaojuan; Jiang, Fan; Harnsberger, James; Topping, Keith.

J Speech Lang Hear Res ; 58(2): 445-52, 2015 Apr.

Article in English | MEDLINE | ID: mdl-25614978

ABSTRACT

PURPOSE: The purpose of this study was to evaluate performance of the Language Environment Analysis (LENA) automated language-analysis system for the Chinese Shanghai dialect and Mandarin (SDM) languages. METHOD: Volunteer parents of 22 children aged 3-23 months were recruited in Shanghai. Families provided daylong in-home audio recordings using LENA. A native speaker listened to 15 min of randomly selected audio samples per family to label speaker regions and provide Chinese character and SDM word counts for adult speakers. LENA segment labeling and counts were compared with rater-based values. RESULTS: LENA demonstrated good sensitivity in identifying adult and child; this sensitivity was comparable to that of American English validation samples. Precision was strong for adults but less so for children. LENA adult word count correlated strongly with both Chinese characters and SDM word counts. LENA conversational turn counts correlated similarly with rater-based counts after the exclusion of three unusual samples. Performance related to some degree to child age. CONCLUSIONS: LENA adult word count and conversational turn provided reasonably accurate estimates for SDM over the age range tested. Theoretical and practical considerations regarding LENA performance in non-English languages are discussed. Despite the pilot nature and other limitations of the study, results are promising for broader cross-linguistic applications.

Subjects

Environment, Language, Speech Production Measurement/methods, Verbal Behavior, Verbal Learning, Adult, China, Female, Humans, Infant, Language Tests, Male, Parents, Pilot Projects, Speech Perception

Issues in forensic voice.

Hollien, Harry; Huntley Bahr, Ruth; Harnsberger, James D.

J Voice ; 28(2): 170-84, 2014 Mar.

Article in English | MEDLINE | ID: mdl-24176301

ABSTRACT

The following article provides a general review of an area that can be referred to as Forensic Voice. Its goals will be outlined and that discussion will be followed by a description of its major elements. Considered are (1) the processing and analysis of spoken utterances, (2) distorted speech, (3) enhancement of speech intelligibility (re: surveillance and other recordings), (4) transcripts, (5) authentication of recordings, (6) speaker identification, and (7) the detection of deception, intoxication, and emotions in speech. Stress in speech and the psychological stress evaluation systems (that some individuals attempt to use as lie detectors) also will be considered. Points of entry will be suggested for individuals with the kinds of backgrounds possessed by professionals already working in the voice area.

Subjects

Acoustics, Forensic Sciences/methods, Speech Acoustics, Speech Intelligibility, Speech Perception, Speech Production Measurement/methods, Voice Quality, Humans, Phonetics, Psychoacoustics, Computer-Assisted Signal Processing, Sound Spectrography

Noise and tremor in the perception of vocal aging in males.

Harnsberger, James D; Brown, William S; Shrivastav, Rahul; Rothman, Howard.

J Voice ; 24(5): 523-30, 2010 Sep.

Article in English | MEDLINE | ID: mdl-19815378

ABSTRACT

OBJECTIVE/HYPOTHESIS: To specify a set of acoustic cues for vocal aging and to establish their perceptual relevance. STUDY DESIGN: Perceptual testing. METHODS: To identify the acoustic and perceptual correlates of the aging voice, voice quality [in conjunction with speaking rate and fundamental frequency (F0)] was systematically manipulated using resynthesis to determine its effect on perceived age. Ten young male voices were resynthesized using two levels of noise (random modulation of F0 contour) and two levels of tremor (constant modulation of F0 contour with a low-amplitude wave) under a speaking-rate manipulation (an increase in speaking rate that is common to older male voices). These materials were submitted to 40 naive listeners in an age-estimation task. Two sets of comparison materials were also included for evaluation: unmanipulated samples from a 150-voice database of young, middle-aged, and older voices and disordered voice samples representing natural manifestations of the voice qualities of interest. RESULTS: Speaking rate, highest degree of tremor, and highest degree of noise all shifted, in an additive manner, the mean perceived age of the young male voices by a maximum of 12 years on average; individual voices were observed being shifted by a generation. Fundamental frequency manipulations had no significant effect on perceived age. CONCLUSIONS: Voice quality (both tremor and noise) and speaking rate are all perceptually relevant cues of age in male voices.

Subjects

Aging, Cues (Psychology), Speech Acoustics, Speech Perception, Vocal Cords/physiopathology, Voice Disorders/physiopathology, Voice Quality, Acoustic Stimulation, Adolescent, Adult, Age Factors, Aged, Aged 80 and over, Female, Florida, Humans, Male, Middle Aged, Phonetics, Time Factors, Voice Disorders/etiology, Voice Disorders/psychology, Young Adult

Perceiving the effects of ethanol intoxication on voice.

Hollien, Harry; Harnsberger, James D; Martin, Camilo A; Hill, Rebecca; Alderman, G Allan.

J Voice ; 23(5): 552-9, 2009 Sep.

Article in English | MEDLINE | ID: mdl-19535221

ABSTRACT

Many conditions operate to degrade the quality of the human voice. Alcohol intoxication is one of them. In this project, the objectives were to examine the ability of human listeners to accurately estimate both the presence and severity of intoxication from two types of speech samples. A review of available data suggests that, although listeners can often identify individuals who are intoxicated simply by hearing samples of their voice, they are less efficient at accurately determining the severity of this condition. A number of aural-perceptual studies were carried out to test these relationships. Populations of speakers, selected based on rigorous criteria, provided orally read and extemporaneous utterances when sober and at three highly controlled levels of intoxication. Listener groups of university students and professionals attempted to identify both the existence and specific level of intoxication present. It was found that these individuals were proficient in recognizing the presence of, and increases in, intoxication but were less accurate in gauging the specific levels. Several subordinate relationships were also investigated. In this regard, statistically significant differences were not found between male and female listeners or between professionals and lay listeners; however, they were found for different classes of speech. That is, it was shown that text difficulty correlated with severity of effect.

Subjects

Alcoholic Intoxication, Speech Perception, Voice Quality, Adolescent, Adult, Analysis of Variance, Educational Status, Ethanol/administration & dosage, Female, Humans, Male, Psychoacoustics, Reading, Sex Characteristics, Speech, Young Adult

Stress and deception in speech: evaluating layered voice analysis.

Harnsberger, James D; Hollien, Harry; Martin, Camilo A; Hollien, Kevin A.

J Forensic Sci ; 54(3): 642-50, 2009 May.

Article in English | MEDLINE | ID: mdl-19432740

ABSTRACT

This study was designed to evaluate commonly used voice stress analyzers, in this case the layered voice analysis (LVA) system. The research protocol involved the use of a speech database containing materials recorded while highly controlled deception and stress levels were systematically varied. Subjects were 24 males and 24 females (age range 18-63 years) drawn from a diverse population. All held strong views about some issue; they were required to make intense contradictory statements while believing that they would be heard/seen by peers. The LVA system was then evaluated by means of a double-blind study using two types of examiners: a pair of scientists trained and certified by the manufacturer in the proper use of the system and two highly experienced LVA instructors provided by this same firm. The results showed that the "true positive" (or hit) rates for all examiners averaged near chance (42-56%) for all conditions, types of materials (e.g., stress vs. unstressed, truth vs. deception), and examiners (scientists vs. manufacturers). Most importantly, the false positive rate was very high, ranging from 40% to 65%. Sensitivity statistics confirmed that the LVA system operated at about chance levels in the detection of truth, deception, and the presence of high and low vocal stress states.

Subjects

Deception, Forensic Medicine/instrumentation, Lie Detection, Psychological Stress, Voice, Adolescent, Adult, Double-Blind Method, False Negative Reactions, False Positive Reactions, Female, Humans, Male, Middle Aged, Computer-Assisted Signal Processing, Speech

Evaluation of the NITV CVSA.

Hollien, Harry; Harnsberger, James D; Martin, Camilo A; Hollien, Kevin A.

J Forensic Sci ; 53(1): 183-93, 2008 Jan.

Article in English | MEDLINE | ID: mdl-18279255

ABSTRACT

The purpose of this study was to evaluate a commonly used voice stress analyzer, the National Institute of Truth Verification's (NITV) Computer Voice Stress Analyzer (CVSA), using a speech database containing materials recorded (i) in the laboratory, while highly controlled deceptive and shock-induced stress levels were systematically varied, and (ii) during a field procedure. Subjects were 24 males and 24 females (age range 18-63 years) drawn from a representative population. All held strong views on an issue and were required to make sharply derogatory statements about it. The CVSA system was then evaluated in a double-blind study using three sets of examiners: (i) two UF scientists trained/certified by NITV in CVSA operation, (ii) three experienced NITV operators provided by the manufacturer and (iii) five experimental phoneticians. The results showed that the "true positive" (or hit) rates for all examiners ranged from chance to somewhat higher levels (c. 50-65%) for all conditions and types of materials (e.g., stress vs. unstressed, truth vs. deception). However, the false-positive rate was just as high, and often higher. Sensitivity statistics demonstrated that the CVSA system operated at about chance level.

Subjects

Forensic Medicine/instrumentation, Lie Detection, Psychological Stress/diagnosis, Voice, Adolescent, Adult, Double-Blind Method, False Negative Reactions, False Positive Reactions, Female, Humans, Male, Middle Aged, Phonetics, Professional Competence, Computer-Assisted Signal Processing

A new method for eliciting three speaking styles in the laboratory.

Harnsberger, James D; Wright, Richard; Pisoni, David B.

Speech Commun ; 50(4): 323-336, 2008 Apr 01.

Article in English | MEDLINE | ID: mdl-19562041

ABSTRACT

In this study, a method was developed to elicit three different speaking styles, reduced, citation, and hyperarticulated, using controlled sentence materials in a laboratory setting. In the first set of experiments, the reduced style was elicited by having twelve talkers read a sentence while carrying out a distractor task that involved recalling from short-term memory an individually-calibrated number of digits. The citation style corresponded to read speech in the laboratory. The hyperarticulated style was elicited by prompting talkers (twice) to reread the sentences more carefully. The results of perceptual tests with naïve listeners and an acoustic analysis showed that six of the twelve talkers produced a reduced style of speech for the test sentences in the distractor task relative to the same sentences in the citation style condition. In addition, all talkers consistently produced sentences in the citation and hyperarticulated styles. In the second set of experiments, the reduced style was elicited by increasing the number of digits in the distractor task by one (a heavier cognitive load). The procedures for eliciting citation and hyperarticulated sentences remained unchanged. Ten talkers were recorded in the second experiment. The results showed that six out of ten talkers differentiated all three styles as predicted (70% of all sentences recorded). In addition, all talkers consistently produced sentences in the citation and hyperarticulated styles. Overall, the results demonstrate that it is possible to elicit controlled sentence stimulus materials varying in speaking style in a laboratory setting, although the method requires further refinement to elicit these styles more consistently from individual participants.

Speaking rate and fundamental frequency as speech cues to perceived age.

Harnsberger, James D; Shrivastav, Rahul; Brown, W S; Rothman, Howard; Hollien, Harry.

J Voice ; 22(1): 58-69, 2008 Jan.

Article in English | MEDLINE | ID: mdl-16968663

ABSTRACT

This study aimed to specify a set of acoustic cues fundamental to vocal aging and to establish their perceptual relevance, using acoustic analysis and perceptual testing. Three experiments were conducted to identify the perceptual correlates of the aging voice. The first experiment analyzed important voice parameters that signal a person's age for 16 older males and 14 younger males. In the second and third experiments, these acoustic patterns were systematically shifted through resynthesis to see if perceived age would be significantly influenced. In the second experiment, the older and younger male voices were resynthesized by manipulating speaking rate and fundamental frequency to shift the perceived age of the groups toward each other. In the third experiment, older and middle-aged male voices were resynthesized in a similar manner. In both perceptual studies, an age estimation task with naive listeners was used. The results of the first experiment showed that, in older speakers, sentence, word, and diphthong durations were all significantly longer and mean fundamental frequency was significantly higher than for the younger group. In the second experiment, only the manipulation of speaking rate resulted in a significant shift in perceived age, and it did so only for the older subjects. In the third experiment, a significant shift in age estimates was observed for the middle-aged, but not the older, voices when speaking rate was manipulated. The results of both perception tests suggest that speaking rate, but possibly not fundamental frequency, is a perceptually relevant cue to age in voice.

Subjects

Perception, Speech Perception, Speech, Verbal Behavior, Voice Quality, Age Factors, Cues (Psychology), Humans, Speech Acoustics
