Pesquisa | Portal Regional da BVS (teste)

1.

Neural basis of language familiarity effects on voice recognition: An fNIRS study.

Meng, Yuan; Liang, Chunyan; Chen, Wenjing; Liu, Zhaoning; Yang, Chaoqing; Hu, Jiehui; Gao, Zhao; Gao, Shan.

Cortex ; 176: 1-10, 2024 Jul.

Artigo em Inglês | MEDLINE | ID: mdl-38723449

RESUMO

Recognizing talkers' identity via speech is an important social skill in interpersonal interaction. Behavioral evidence has shown that listeners can identify better the voices of their native language than those of a non-native language, which is known as the language familiarity effect (LFE). However, its underlying neural mechanisms remain unclear. This study therefore investigated how the LFE occurs at the neural level by employing functional near-infrared spectroscopy (fNIRS). Late unbalanced bilinguals were first asked to learn to associate strangers' voices with their identities and then tested for recognizing the talkers' identities based on their voices speaking a language either highly familiar (i.e., native language Chinese), or moderately familiar (i.e., second language English), or completely unfamiliar (i.e., Ewe) to participants. Participants identified talkers the most accurately in Chinese and the least accurately in Ewe. Talker identification was quicker in Chinese than in English and Ewe but reaction time did not differ between the two non-native languages. At the neural level, recognizing voices speaking Chinese relative to English/Ewe produced less activity in the inferior frontal gyrus, precentral/postcentral gyrus, supramarginal gyrus, and superior temporal sulcus/gyrus while no difference was found between English and Ewe, indicating facilitation of voice identification by the automatic phonological encoding in the native language. These findings shed new light on the interrelations between language ability and voice recognition, revealing that the brain activation pattern of the LFE depends on the automaticity of language processing.

Assuntos

Idioma , Reconhecimento Psicológico , Espectroscopia de Luz Próxima ao Infravermelho , Percepção da Fala , Voz , Humanos , Espectroscopia de Luz Próxima ao Infravermelho/métodos , Feminino , Masculino , Reconhecimento Psicológico/fisiologia , Adulto Jovem , Voz/fisiologia , Percepção da Fala/fisiologia , Adulto , Multilinguismo , Mapeamento Encefálico , Tempo de Reação/fisiologia , Encéfalo/fisiologia , Encéfalo/diagnóstico por imagem

2.

Using voice recognition to measure trust during interactions with automated vehicles.

Deng, Miaomiao; Chen, Jiaqi; Wu, Yue; Ma, Shu; Li, Hongting; Yang, Zhen; Shen, Yi.

Appl Ergon ; 116: 104184, 2024 Apr.

Artigo em Inglês | MEDLINE | ID: mdl-38048717

RESUMO

Trust in an automated vehicle system (AVs) can impact the experience and safety of drivers and passengers. This work investigates the effects of speech to measure drivers' trust in the AVs. Seventy-five participants were randomly assigned to high-trust (the AVs with 100% correctness, 0 crash, and 4 system messages with visual-auditory TORs) and low-trust group (the AVs with a correctness of 60%, a crash rate of 40%, 2 system messages with visual-only TORs). Voice interaction tasks were used to collect speech information during the driving process. The results revealed that our settings successfully induced trust and distrust states. The corresponding extracted speech feature data of the two trust groups were used for back-propagation neural network training and evaluated for its ability to accurately predict the trust classification. The highest classification accuracy of trust was 90.80%. This study proposes a method for accurately measuring trust in automated vehicles using voice recognition.

Assuntos

Condução de Veículo , Veículos Autônomos , Humanos , Automação , Reconhecimento de Voz , Confiança , Acidentes de Trânsito

3.

Reconhecimento de emoções pela voz e expressão facial por estudantes de medicina / Recognition of emotions by voice and facial expression by medical students

Zambeli, João Gabriel Antunes; Lira, Antonio Alexandre de Medeiros; Cassol, Mauriceia.

Audiol., Commun. res ; 29: e2889, 2024. tab

Artigo em Português | LILACS-Express | LILACS | ID: biblio-1557153

RESUMO

RESUMO Objetivo avaliar a capacidade de estudantes de medicina para reconhecer emoções pela voz e expressão facial, por meio de avaliações de percepção emocional da entonação vocal e das expressões faciais. Métodos estudo com delineamento transversal observacional. Para avaliação do reconhecimento de emoções pelas expressões faciais, utilizou-se um teste composto por 20 vídeos de microexpressões faciais e, para avaliação do reconhecimento emocional pela voz, utilizou-se o Protocolo de Reconhecimento de Emoções Prosódicas Básicas, baseado no banco de dados de Burkhardt. Para análise estatística, foram utilizados os testes de Friedman, Shapiro-Wilk, teste t de Student ou Mann-Whitney e o coeficiente de correlação de Pearson ou Spearman. Resultados o estudo foi composto por 38 alunos, com média de idade de 20,8 (±2,5). O reconhecimento de emoções pela voz foi significativamente superior, comparado com os resultados do teste de reconhecimento de emoções pelas expressões faciais. Houve correlação positiva entre a idade e a habilidade de reconhecer emoções pelas expressões faciais. O gênero masculino apresentou taxa significativa de acertos, superior ao gênero feminino na habilidade de reconhecer emoções pela expressão facial. As emoções com maior média de acertos pela expressão facial foram surpresa, alegria e desprezo, enquanto, por meio da voz, as emoções foram raiva, medo e tristeza. Conclusão a capacidade de reconhecimento de emoções por estudantes de medicina foi maior na avaliação de percepção emocional por meio da voz.

ABSTRACT Purpose To evaluate the ability of medical students to recognize emotions through voice and facial expression through assessments of emotional perception of vocal intonation and functional expressions. Methods Observational cross-sectional study. To evaluate the recognition of emotions by facial expressions, a test composed of 20 videos of facial microexpressions was used, and to evaluate the emotional recognition by voice, the protocol of prosodic impressions of basic emotions, based on the Burkhardt database, was used. For statistical analysis, the Friedman, Shapiro-Wilk, Student t, Mann-Whitney and Pearson or Spearman correlation coefficient tests were used. Results The study consisted of 38 students, with an average age of 20.8 (±2.5). The recognition of emotions through the voice was significantly superior to the one through facial expressions. There was a positive correlation between age and the ability to recognize emotions through facial expressions. Males had a significantly higher hit rate than females in the ability to recognize emotions through facial expression. The emotions with the highest average success rates through facial expression were surprise, joy and contempt, while, through the voice, the emotions were anger, fear and sadness. Conclusion The ability to recognize emotions by medical students was greater when assessing emotional perception through the voice.

4.

Healthcare's New Horizon With ChatGPT's Voice and Vision Capabilities: A Leap Beyond Text.

Temsah, Reem; Altamimi, Ibraheem; Alhasan, Khalid; Temsah, Mohamad-Hani; Jamal, Amr.

Cureus ; 15(10): e47469, 2023 Oct.

Artigo em Inglês | MEDLINE | ID: mdl-37873042

RESUMO

The integration of artificial intelligence (AI) in healthcare is responsible for a paradigm shift in medicine. OpenAI's recent augmentation of their Generative Pre-trained Transformer (ChatGPT) large language model (LLM) with voice and image recognition capabilities (OpenAI, Delaware) presents another potential transformative tool for healthcare. Envision a healthcare setting where professionals engage in dynamic interactions with ChatGPT to navigate the complexities of atypical medical scenarios. In this innovative landscape, practitioners could solicit ChatGPT's expertise for concise summarizations and insightful extrapolations from a myriad of web-based resources pertaining to similar medical conditions. Furthermore, imagine patients using ChatGPT to identify abnormalities in medical images or skin lesions. While the prospects are diverse, challenges such as suboptimal audio quality and ensuring data security necessitate cautious integration in medical practice. Drawing insights from previous ChatGPT iterations could provide a prudent roadmap for navigating possible challenges. This editorial explores some possible horizons and potential hurdles of ChatGPT's enhanced functionalities in healthcare, emphasizing the importance of continued refinements and vigilance to maximize the benefits while minimizing risks. Through collaborative efforts between AI developers and healthcare professionals, another fusion of AI and healthcare can evolve into enriched patient care and enhanced medical experience.

5.

Optimizing Voice Recognition Informatic Robots for Effective Communication in Outpatient Settings.

Meng, Zuowei; Liu, Hairong; Ma, Alfred C.

Cureus ; 15(9): e44848, 2023 Sep.

Artigo em Inglês | MEDLINE | ID: mdl-37809163

RESUMO

Aim/Objective Within the dynamic healthcare technology landscape, this research aims to explore patient inquiries within outpatient clinics, elucidating the interplay between technology and healthcare intricacies. Building upon the initial intelligent guidance robot implementation shortcomings, this investigation seeks to enhance informatic robots with voice recognition technology. The objective is to analyze users' vocal patterns, discern age-associated vocal attributes, and facilitate age differentiation through subtle vocal nuances to enhance the efficacy of human-robot communication within outpatient clinical settings. Methods This investigation employs a multi-faceted approach. It leverages voice recognition technology to analyze users' vocal patterns. A diverse dataset of voice samples from various age groups was collected. Acoustic features encompassing pitch, formant frequencies, spectral characteristics, and vocal tract length are extracted from the audio samples. The Mel Filterbank and Mel-Frequency Cepstral Coefficients (MFCCs) are employed for speech and audio processing tasks alongside machine learning algorithms to assess and match vocal patterns to age-related traits. Results The research reveals compelling outcomes. The incorporation of voice recognition technology contributes to a significant improvement in human-robot communication within outpatient clinical settings. Through accurate analysis of vocal patterns and age-related traits, informatic robots can differentiate age through nuanced verbal cues. This augmentation leads to enhanced contextual understanding and tailored responses, significantly advancing the efficiency of patient interactions with the robots. Conclusion Integrating voice recognition technology into informatic robots presents a noteworthy advancement in outpatient clinic settings. By enabling age differentiation through vocal nuances, this augmentation enhances the precision and relevance of responses. The study contributes to the ongoing discourse on the dynamic evolution of healthcare technology, underscoring the complex synergy between technological progression and the intricate realities within healthcare infrastructure. As healthcare continues to metamorphose, the seamless integration of voice recognition technology marks a pivotal stride in optimizing human-robot communication and elevating patient care within outpatient settings.

6.

Highly Conductive and Sensitive Wearable Strain Sensors with Metal/Nanoparticle Double Layer for Noninterference Voice Detection.

Choi, Hyung Jin; Ahn, Junhyuk; Jung, Byung Ku; Choi, Young Kyun; Park, Taesung; Bang, Junsung; Park, Junhyeok; Yang, Yoonji; Son, Gayeon; Oh, Soong Ju.

ACS Appl Mater Interfaces ; 15(36): 42836-42844, 2023 Sep 13.

Artigo em Inglês | MEDLINE | ID: mdl-37665133

RESUMO

Human voice recognition via skin-attachable devices has significant potential for gathering important physiological information from acoustic data without background noise interference. In this study, a highly sensitive and conductive wearable crack-based strain sensor was developed for voice-recognition systems. The sensor was fabricated using a double-layer structure of Ag nanoparticles (NPs) and Ag metal on a biocompatible polydimethylsiloxane substrate. The top metal layer acts as a conducting active layer, whereas the bottom Ag NP layer induces channel cracks in the upper layer, effectively hindering current flow. Subsequently, the double-layer film exhibits a low electrical resistance value (<5 × 10-5 Ω cm), ultrahigh sensitivity (gauge factor = 1870), and a fast response/recovery time (252/168 µs). A sound wave was detected at a high frequency of 15 kHz with a signal-to-noise ratio (SNR) over 40 dB. The sensor exhibited excellent anti-interference characteristics and effectively differentiated between different voice qualities (modal, pressed, and breathy), with a systematic analysis revealing successful detection of the laryngeal state and glottal source. This ultrasensitive wearable sensor has potential applications in various physiological signal measurement methods, personalized healthcare systems, and ubiquitous computing.

Assuntos

Nanopartículas Metálicas , Dispositivos Eletrônicos Vestíveis , Humanos , Prata , Condutividade Elétrica , Som

7.

Visual Deprivation Alters Functional Connectivity of Neural Networks for Voice Recognition: A Resting-State fMRI Study.

Pang, Wenbin; Zhou, Wei; Ruan, Yufang; Zhang, Linjun; Shu, Hua; Zhang, Yang; Zhang, Yumei.

Brain Sci ; 13(4)2023 Apr 07.

Artigo em Inglês | MEDLINE | ID: mdl-37190601

RESUMO

Humans recognize one another by identifying their voices and faces. For sighted people, the integration of voice and face signals in corresponding brain networks plays an important role in facilitating the process. However, individuals with vision loss primarily resort to voice cues to recognize a person's identity. It remains unclear how the neural systems for voice recognition reorganize in the blind. In the present study, we collected behavioral and resting-state fMRI data from 20 early blind (5 females; mean age = 22.6 years) and 22 sighted control (7 females; mean age = 23.7 years) individuals. We aimed to investigate the alterations in the resting-state functional connectivity (FC) among the voice- and face-sensitive areas in blind subjects in comparison with controls. We found that the intranetwork connections among voice-sensitive areas, including amygdala-posterior "temporal voice areas" (TVAp), amygdala-anterior "temporal voice areas" (TVAa), and amygdala-inferior frontal gyrus (IFG) were enhanced in the early blind. The blind group also showed increased FCs of "fusiform face area" (FFA)-IFG and "occipital face area" (OFA)-IFG but decreased FCs between the face-sensitive areas (i.e., FFA and OFA) and TVAa. Moreover, the voice-recognition accuracy was positively related to the strength of TVAp-FFA in the sighted, and the strength of amygdala-FFA in the blind. These findings indicate that visual deprivation shapes functional connectivity by increasing the intranetwork connections among voice-sensitive areas while decreasing the internetwork connections between the voice- and face-sensitive areas. Moreover, the face-sensitive areas are still involved in the voice-recognition process in blind individuals through pathways such as the subcortical-occipital or occipitofrontal connections, which may benefit the visually impaired greatly during voice processing.

8.

Neural Correlates of Voice Learning with Distinctive and Non-Distinctive Faces.

Zäske, Romi; Kaufmann, Jürgen M; Schweinberger, Stefan R.

Brain Sci ; 13(4)2023 Apr 07.

Artigo em Inglês | MEDLINE | ID: mdl-37190602

RESUMO

Recognizing people from their voices may be facilitated by a voice's distinctiveness, in a manner similar to that which has been reported for faces. However, little is known about the neural time-course of voice learning and the role of facial information in voice learning. Based on evidence for audiovisual integration in the recognition of familiar people, we studied the behavioral and electrophysiological correlates of voice learning associated with distinctive or non-distinctive faces. We repeated twelve unfamiliar voices uttering short sentences, together with either distinctive or non-distinctive faces (depicted before and during voice presentation) in six learning-test cycles. During learning, distinctive faces increased early visually-evoked (N170, P200, N250) potentials relative to non-distinctive faces, and face distinctiveness modulated voice-elicited slow EEG activity at the occipito-temporal and fronto-central electrodes. At the test, unimodally-presented voices previously learned with distinctive faces were classified more quickly than were voices learned with non-distinctive faces, and also more quickly than novel voices. Moreover, voices previously learned with faces elicited an N250-like component that was similar in topography to that typically observed for facial stimuli. The preliminary source localization of this voice-induced N250 was compatible with a source in the fusiform gyrus. Taken together, our findings provide support for a theory of early interaction between voice and face processing areas during both learning and voice recognition.

9.

Sound-Based Localization Using LSTM Networks for Visually Impaired Navigation.

Bakouri, Mohsen; Alyami, Naif; Alassaf, Ahmad; Waly, Mohamed; Alqahtani, Tariq; AlMohimeed, Ibrahim; Alqahtani, Abdulrahman; Samsuzzaman, Md; Ismail, Husham Farouk; Alharbi, Yousef.

Sensors (Basel) ; 23(8)2023 Apr 17.

Artigo em Inglês | MEDLINE | ID: mdl-37112374

RESUMO

In this work, we developed a prototype that adopted sound-based systems for localization of visually impaired individuals. The system was implemented based on a wireless ultrasound network, which helped the blind and visually impaired to navigate and maneuver autonomously. Ultrasonic-based systems use high-frequency sound waves to detect obstacles in the environment and provide location information to the user. Voice recognition and long short-term memory (LSTM) techniques were used to design the algorithms. The Dijkstra algorithm was also used to determine the shortest distance between two places. Assistive hardware tools, which included an ultrasonic sensor network, a global positioning system (GPS), and a digital compass, were utilized to implement this method. For indoor evaluation, three nodes were localized on the doors of different rooms inside the house, including the kitchen, bathroom, and bedroom. The coordinates (interactive latitude and longitude points) of four outdoor areas (mosque, laundry, supermarket, and home) were identified and stored in a microcomputer's memory to evaluate the outdoor settings. The results showed that the root mean square error for indoor settings after 45 trials is about 0.192. In addition, the Dijkstra algorithm determined that the shortest distance between two places was within an accuracy of 97%.

Assuntos

Tecnologia Assistiva , Pessoas com Deficiência Visual , Humanos , Sistemas de Informação Geográfica , Ultrassonografia , Algoritmos

10.

Lesions to Caudomedial Nidopallium Impair Individual Vocal Recognition in the Zebra Finch.

Yu, Kevin; Wood, William E; Johnston, Leah G; Theunissen, Frederic E.

J Neurosci ; 43(14): 2579-2596, 2023 04 05.

Artigo em Inglês | MEDLINE | ID: mdl-36859308

RESUMO

Many social animals can recognize other individuals by their vocalizations. This requires a memory system capable of mapping incoming acoustic signals to one of many known individuals. Using the zebra finch, a social songbird that uses songs and distance calls to communicate individual identity (Elie and Theunissen, 2018), we tested the role of two cortical-like brain regions in a vocal recognition task. We found that the rostral region of the Cadomedial Nidopallium (NCM), a secondary auditory region of the avian pallium, was necessary for maintaining auditory memories for conspecific vocalizations in both male and female birds, whereas HVC (used as a proper name), a premotor areas that gates auditory input into the vocal motor and song learning pathways in male birds (Roberts and Mooney, 2013), was not. Both NCM and HVC have previously been implicated for processing the tutor song in the context of song learning (Sakata and Yazaki-Sugiyama, 2020). Our results suggest that NCM might not only store songs as templates for future vocal imitation but also songs and calls for perceptual discrimination of vocalizers in both male and female birds. NCM could therefore operate as a site for auditory memories for vocalizations used in various facets of communication. We also observed that new auditory memories could be acquired without intact HVC or NCM but that for these new memories NCM lesions caused deficits in either memory capacity or auditory discrimination. These results suggest that the high-capacity memory functions of the avian pallial auditory system depend on NCM.SIGNIFICANCE STATEMENT Many aspects of vocal communication require the formation of auditory memories. Voice recognition, for example, requires a memory for vocalizers to identify acoustical features. In both birds and primates, the locus and neural correlates of these high-level memories remain poorly described. Previous work suggests that this memory formation is mediated by high-level sensory areas, not traditional memory areas such as the hippocampus. Using lesion experiments, we show that one secondary auditory brain region in songbirds that had previously been implicated in storing song memories for vocal imitation is also implicated in storing vocal memories for individual recognition. The role of the neural circuits in this region in interpreting the meaning of communication calls should be investigated in the future.

Assuntos

Tentilhões , Vocalização Animal , Animais , Masculino , Feminino , Estimulação Acústica , Aprendizagem , Encéfalo , Percepção Auditiva

11.

The Right Temporal Lobe and the Enhancement of Voice Recognition in Congenitally Blind Subjects.

Terruzzi, Stefano; Papagno, Costanza; Gainotti, Guido.

Brain Sci ; 13(3)2023 Mar 02.

Artigo em Inglês | MEDLINE | ID: mdl-36979241

RESUMO

BACKGROUND: Experimental investigations and clinical observations have shown that not only faces but also voices are predominantly processed by the right hemisphere. Moreover, right brain-damaged patients show more difficulties with voice than with face recognition. Finally, healthy subjects undergoing right temporal anodal stimulation improve their voice but not their face recognition. This asymmetry between face and voice recognition in the right hemisphere could be due to the greater complexity of voice processing. METHODS: To further investigate this issue, we tested voice and name recognition in twelve congenitally blind people. RESULTS: The results showed a complete overlap between the components of voice recognition impaired in patients with right temporal damage and those improved in congenitally blind people. Congenitally blind subjects, indeed, scored significantly better than control sighted individuals in voice discrimination and produced fewer false alarms on familiarity judgement of famous voices, corresponding to tests selectively impaired in patients with right temporal lesions. CONCLUSIONS: We suggest that task difficulty is a factor that impacts on the degree of its lateralization.

12.

Multichannel Gradient Piezoelectric Transducer Assisted with Deep Learning for Broadband Acoustic Sensing.

Lan, Boling; Yang, Tao; Tian, Guo; Ao, Yong; Jin, Long; Xiong, Da; Wang, Shenglong; Zhang, Hongrui; Deng, Lin; Sun, Yue; Zhang, Jieling; Deng, Weili; Yang, Weiqing.

ACS Appl Mater Interfaces ; 15(9): 12146-12153, 2023 Mar 08.

Artigo em Inglês | MEDLINE | ID: mdl-36811621

RESUMO

As an important part of human-machine interfaces, piezoelectric voice recognition has received extensive attention due to its unique self-powered nature. However, conventional voice recognition devices exhibit a limited response frequency band due to the intrinsic hardness and brittleness of piezoelectric ceramics or the flexibility of piezoelectric fibers. Here, we propose a cochlear-inspired multichannel piezoelectric acoustic sensor (MAS) based on gradient PVDF piezoelectric nanofibers for broadband voice recognition by a programmable electrospinning technique. Compared with the common electrospun PVDF membrane-based acoustic sensor, the developed MAS demonstrates the greatly 300%-broadened frequency band and the substantially 334.6%-enhanced piezoelectric output. More importantly, this MAS can serve as a high-fidelity auditory platform for music recording and human voice recognition, in which the classification accuracy rate can reach up to 100% in coordination with deep learning. The programmable bionic gradient piezoelectric nanofiber may provide a universal strategy for the development of intelligent bioelectronics.

13.

The impact of low vision on social function: The potential importance of lost visual social cues

Klauke, Susanne; Sondocie, Chloe; Fine, Ione.

J. optom. (Internet) ; 16(1)January - March 2023. ilus

Artigo em Inglês | IBECS | ID: ibc-214425

RESUMO

Visual cues usually play a vital role in social interaction. As well as being the primary cue for identifying other people, visual cues also provide crucial non-verbal social information via both facial expressions and body language. One consequence of vision loss is the need to rely on non-visual cues during social interaction. Although verbal cues can carry a significant amount of information, this information is often not available to an untrained listener. Here, we review the current literature examining potential ways that the loss of social information due to vision loss might impact social functioning. A large number of studies suggest that low vision and blindness is a risk factor for anxiety and depression. This relationship has been attributed to multiple factors, including anxiety about disease progression, and impairments to quality of life that include difficulties reading, and a lack of access to work and social activities. However, our review suggests a potential additional contributing factor to reduced quality of life that has been hitherto overlooked: blindness may make it more difficult to effectively engage in social interactions, due to a loss of visual information. The current literature suggests it might be worth considering training in voice discrimination and/or recognition when carrying out rehabilitative training in late blind individuals. (AU)

Assuntos

Humanos , Ansiedade/psicologia , Cegueira/psicologia , Baixa Visão , Qualidade de Vida/psicologia , Transtornos da Visão/etiologia , Relações Interpessoais

14.

A broad review on non-intrusive active user authentication in biometrics.

Thomas, Princy Ann; Preetha Mathew, K.

J Ambient Intell Humaniz Comput ; 14(1): 339-360, 2023.

Artigo em Inglês | MEDLINE | ID: mdl-34109006

RESUMO

Authentication is the process of keeping the user's personal information as confidential in digital applications. Moreover, the user authentication process in the digital platform is employed to verify the own users by some authentication methods like biometrics, voice recognition, and so on. Traditionally, a one-time login based credential verification method was utilized for user authentication. Recently, several new approaches were proposed to enhance the user authentication framework but those approaches have been found inconsistent during the authentication execution process. Hence, the main motive of this review article is to analyze the advantage and disadvantages of authentication systems such as voice recognition, keystroke, and mouse dynamics. These authentication models are evaluated in a continuous non-user authentication environment and their results have been presented in way of tabular and graphical representation. Also, the common merits and demerits of the discussed authentication systems are broadly explained discussion section. Henceforth, this study will help the researchers to adopt the best suitable method at each stage to build an authentication framework for non-intrusive active authentication.

15.

Speech Recognition System Generates Highly Accurate Endoscopic Reports in Clinical Practice.

Takayama, Hiroshi; Takao, Toshitatsu; Masumura, Ryo; Yamaguchi, Yoshikazu; Yonezawa, Ryo; Sakaguchi, Hiroya; Morita, Yoshinori; Toyonaga, Takashi; Izumiyama, Kazutaka; Kodama, Yuzo.

Intern Med ; 62(2): 153-157, 2023 Jan 15.

Artigo em Inglês | MEDLINE | ID: mdl-35732450

RESUMO

Objective Endoscopic reports are conventionally written at the end of each procedure, and the endoscopist must complete the report from memory. To make endoscopic reporting more efficient, we developed a new speech recognition (SR) system that generates highly accurate endoscopic reports based on structured data entry. We conducted a pilot study to examine the performance of this SR system in an actual endoscopy setting with various types of background noise. Methods In this prospective observational pilot study, participants who underwent upper endoscopy with our SR system were included. The primary outcome was the correct recognition rate of the system. We compared the findings generated by the SR system with the findings in the handwritten report prepared by the endoscopist. The initial correct recognition rate, number of revisions, finding registration time, and endoscopy time were also analyzed. Results Upper endoscopy was performed in 34 patients, generating 128 findings of 22 disease names. The correct recognition rate was 100%, and the median number of revisions was 0. The median finding registration time was 2.57 [interquartile range (IQR), 2.33-2.92] seconds, and the median endoscopy time was 234 (IQR, 194-227) seconds. Conclusion The SR system demonstrated high recognition accuracy in the clinical setting. The finding registration time was extremely short.

Assuntos

Endoscopia Gastrointestinal , Interface para o Reconhecimento da Fala , Humanos , Estudos Prospectivos , Projetos Piloto

16.

The impact of low vision on social function: The potential importance of lost visual social cues.

Klauke, Susanne; Sondocie, Chloe; Fine, Ione.

J Optom ; 16(1): 3-11, 2023.

Artigo em Inglês | MEDLINE | ID: mdl-35568628

RESUMO

Visual cues usually play a vital role in social interaction. As well as being the primary cue for identifying other people, visual cues also provide crucial non-verbal social information via both facial expressions and body language. One consequence of vision loss is the need to rely on non-visual cues during social interaction. Although verbal cues can carry a significant amount of information, this information is often not available to an untrained listener. Here, we review the current literature examining potential ways that the loss of social information due to vision loss might impact social functioning. A large number of studies suggest that low vision and blindness is a risk factor for anxiety and depression. This relationship has been attributed to multiple factors, including anxiety about disease progression, and impairments to quality of life that include difficulties reading, and a lack of access to work and social activities. However, our review suggests a potential additional contributing factor to reduced quality of life that has been hitherto overlooked: blindness may make it more difficult to effectively engage in social interactions, due to a loss of visual information. The current literature suggests it might be worth considering training in voice discrimination and/or recognition when carrying out rehabilitative training in late blind individuals.

Assuntos

Baixa Visão , Humanos , Sinais (Psicologia) , Qualidade de Vida , Cegueira , Ansiedade , Transtornos da Visão

17.

The Jena Voice Learning and Memory Test (JVLMT): A standardized tool for assessing the ability to learn and recognize voices.

Humble, Denise; Schweinberger, Stefan R; Mayer, Axel; Jesgarzewsky, Tim L; Dobel, Christian; Zäske, Romi.

Behav Res Methods ; 55(3): 1352-1371, 2023 04.

Artigo em Inglês | MEDLINE | ID: mdl-35648317

RESUMO

The ability to recognize someone's voice spans a broad spectrum with phonagnosia on the low end and super-recognition at the high end. Yet there is no standardized test to measure an individual's ability of learning and recognizing newly learned voices with samples of speech-like phonetic variability. We have developed the Jena Voice Learning and Memory Test (JVLMT), a 22-min test based on item response theory and applicable across languages. The JVLMT consists of three phases in which participants (1) become familiarized with eight speakers, (2) revise the learned voices, and (3) perform a 3AFC recognition task, using pseudo-sentences devoid of semantic content. Acoustic (dis)similarity analyses were used to create items with various levels of difficulty. Test scores are based on 22 items which had been selected and validated based on two online studies with 232 and 454 participants, respectively. Mean accuracy in the JVLMT is 0.51 (SD = .18) with an empirical (marginal) reliability of 0.66. Correlational analyses showed high and moderate convergent validity with the Bangor Voice Matching Test (BVMT) and Glasgow Voice Memory Test (GVMT), respectively, and high discriminant validity with a digit span test. Four participants with potential super recognition abilities and seven participants with potential phonagnosia were identified who performed at least 2 SDs above or below the mean, respectively. The JVLMT is a promising research and diagnostic screening tool to detect both impairments in voice recognition and super-recognition abilities.

Assuntos

Percepção da Fala , Voz , Humanos , Reprodutibilidade dos Testes , Voz/fisiologia , Fala , Aprendizagem/fisiologia , Reconhecimento Psicológico/fisiologia , Percepção da Fala/fisiologia

18.

A French Version of a Voice Recognition Symbol Digit Modalities Test Analog.

Feinstein, Anthony; Shen, Lingkai; Rose, Jonathan; Cayer, Caroline; Bockus, Caitlyn; Meza, Cecilia; Puopolo, Juliana; Lapointe, Emmanuelle.

Can J Neurol Sci ; 50(6): 925-928, 2023 Nov.

Artigo em Inglês | MEDLINE | ID: mdl-36522663

RESUMO

We previously showed that a fully automated voice recognition analog of the Symbol Digit Modalities Test (VR-SDMT) is sensitive in detecting processing speed deficits in people with multiple sclerosis (pwMS). We subsequently developed a French language version and administered it to 49 French-Canadian pwMS and 29 matched healthy control (HC) subjects. Significant correlations between the VR-SDMT and traditional oral SDMT were found in the MS (r = -0.716, p < 0.001) and HC (r = -0.623, p < 0.001) groups. These findings in French replicate our previous findings and confirm the utility of voice recognition software in assessing cognition in pwMS.

19.

Habilidades de processamento auditivo central e exame pericial de comparação de locutor / Central auditory processing skills and speaker comparison in forensic analysis

Guedes, Ana Carla; Constantini, Ana Carolina; Gielow, Ingrid; Amaral, Maria Isabel Ramos do; Lopes, Leonardo Wanderley.

Audiol., Commun. res ; 28: e2829, 2023. tab

Artigo em Português | LILACS | ID: biblio-1527925

RESUMO

RESUMO Objetivo descrever quais são as habilidades auditivas do processamento auditivo central mais frequentes, relatadas por um grupo de especialistas para a realização do exame de Comparação de Locutor, tradicionalmente realizado por peritos forenses. Métodos estudo prospectivo, descritivo, com análise quantitativa e qualitativa. Os dados foram obtidos por meio de um consenso de especialistas. Participaram da reunião cinco fonoaudiólogos, sendo dois especialistas em audiologia, dois especialistas em voz e uma fonoaudióloga perita. A reunião foi realizada de forma virtual e síncrona, com duração de uma hora e 30 minutos. As tarefas realizadas durante o exame de Comparação de Locutor foram consideradas a partir de um protocolo disponível na literatura. As especialistas em fonoaudiologia receberam explicações a respeito de cada uma das tarefas e foram solicitadas a discutir sobre quais as habilidades do processamento auditivo central estariam envolvidas na execução de cada uma delas. Resultados sete habilidades foram consideradas na reunião dos especialistas como imprescindíveis para as tarefas realizadas no exame de Comparação de Locutor. A ordenação temporal foi a habilidade mais citada, podendo estar presente em seis tarefas, e a tarefa de transcrição do material de fala foi mencionada como sendo a que necessita de mais habilidades do processamento auditivo central. Conclusão Sete habilidades foram consideradas na reunião dos especialistas como imprescindíveis para as tarefas realizadas no exame de Comparação de Locutor. A ordenação temporal foi a habilidade mais citada, podendo estar presente em seis tarefas e a tarefa de transcrição do material de fala foi mencionada como sendo a que necessita de mais habilidades do processamento auditivo central

ABSTRACT Purpose to describe which abilities of central auditory processing are more frequently related for the group of specialists to the performance of the speaker comparison test (CL), traditionally performed by forensic experts. Methods a prospective, descriptive study with quantitative and qualitative analysis and data were obtained through a consensus of experts. Five speech therapists participated in the meeting, two specialists in audiology (EA), two specialists in voice (VS), and an expert speech therapist (FP). The meeting was held virtually and synchronously, lasting 1 hour and 30 minutes. The tasks performed during the Speaker Comparison (LC) exam were considered from a protocol available in the literature. The AEs received explanations about each of the tasks and were asked to discuss which auditory processing skills (ACP) would be involved in the performance of each of them. Results seven PAC skills were considered in the experts' meeting as essential for the tasks performed in the CL exam. Temporal ordering was the most cited skill, being present in six tasks, and the speech material transcription task is the one that requires more skills from the PAC. Conclusion Seven PAC skills were considered in the experts' meeting as essential for the tasks performed in the CL exam. Temporal ordering was the most cited skill, being present in six tasks, and the speech material transcription task is the one that requires more skills from the PAC.

Assuntos

Humanos , Percepção Auditiva , Qualidade da Voz , Fonoaudiologia , Medicina Legal

20.

Cutting-edge communication and learning assistive technologies for disabled children: An artificial intelligence perspective.

Zdravkova, Katerina; Krasniqi, Venera; Dalipi, Fisnik; Ferati, Mexhid.

Front Artif Intell ; 5: 970430, 2022.

Artigo em Inglês | MEDLINE | ID: mdl-36388402

RESUMO

In this study we provide an in-depth review and analysis of the impact of artificial intelligence (AI) components and solutions that support the development of cutting-edge assistive technologies for children with special needs. Various disabilities are addressed and the most recent assistive technologies that enhance communication and education of disabled children, as well as the AI technologies that have enabled their development, are presented. The paper summarizes with an AI perspective on future assistive technologies and ethical concerns arising from the use of such cutting-edge communication and learning technologies for children with disabilities.

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

RESUMO

RESUMO

Assuntos

RESUMO

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA