Results 1 - 20 of 15,744
1.
Sci Rep ; 14(1): 16162, 2024 Jul 13.
Article in English | MEDLINE | ID: mdl-39003348

ABSTRACT

The Web has become an essential resource but is not yet accessible to everyone. Assistive technologies and innovative, intelligent frameworks, for example those using conversational AI, help overcome some exclusions; however, some users still experience barriers. This paper shows how a human-centered approach can shed light on technology limitations and gaps. It reports on a three-step process (focus group, co-design, and preliminary validation) that we adopted to investigate how people with speech impairments, e.g., dysarthria, browse the Web and how barriers can be reduced. The methodology helped us identify challenges and create new solutions, i.e., patterns for Web browsing, by combining voice-based conversational AI, customized for impaired speech, with techniques for the visual augmentation of web pages. While current trends in AI research focus on ever more powerful large models, participants remarked that current conversational systems do not meet their needs, and that a technology must account for each user's specific characteristics before it can be called inclusive.


Subject(s)
Artificial Intelligence , Internet , Voice , Humans , Voice/physiology , Male , Female , Adult , Middle Aged , Communication , Focus Groups
2.
Clin Linguist Phon ; : 1-19, 2024 Jul 04.
Article in English | MEDLINE | ID: mdl-38965823

ABSTRACT

This study explores the influence of lexicality on gradient judgments of Swedish sibilant fricatives by contrasting ratings of initial fricatives in words and word fragments (initial CV syllables). Visual-Analogue Scale (VAS) judgments were elicited from experienced listeners (speech-language pathologists; SLPs) and inexperienced listeners and compared with respect to the effects of lexicality using Bayesian mixed-effects beta regression. Overall, SLPs had higher intra- and interrater reliability than inexperienced listeners. SLPs as a group also rated fricatives as more target-like, with higher precision, than inexperienced listeners did. An effect of lexicality was observed for every individual listener, though its magnitude varied. Although SLPs' ratings of Swedish children's initial voiceless fricatives were less influenced by lexicality, our results indicate that previous findings concerning VAS ratings of non-lexical CV syllables cannot be directly transferred to the clinical context without consideration of possible lexical bias.
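The abstract does not name the analysis software (such models are often fit in R with brms); below is a rough Python sketch of a comparable Bayesian mixed-effects beta regression using the bambi library. All column and file names are invented for illustration, not the study's actual variables.

```python
# Hedged sketch: Bayesian mixed-effects beta regression for VAS ratings.
# Column/file names (rating, lexicality, group, listener, item) are
# hypothetical placeholders.
import arviz as az
import bambi as bmb
import pandas as pd

df = pd.read_csv("vas_ratings.csv")  # hypothetical data file

# Beta regression needs responses strictly inside (0, 1); a standard
# compression for scores that may touch the endpoints:
n = len(df)
df["rating"] = (df["rating"] * (n - 1) + 0.5) / n

model = bmb.Model(
    "rating ~ lexicality * group + (1|listener) + (1|item)",
    df,
    family="beta",
)
results = model.fit(draws=2000, chains=4)  # NUTS sampling via PyMC
print(az.summary(results))                 # posterior summary incl. lexicality effect
```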

3.
Clin Linguist Phon ; : 1-17, 2024 Jul 04.
Article in English | MEDLINE | ID: mdl-38965836

ABSTRACT

A small body of research and reports from educational and clinical practice suggests that teaching literacy skills may facilitate the development of speech sound production in students with intellectual disabilities (ID). However, intervention research is needed to test this potential connection. This study investigated whether twelve weeks of systematic, digital literacy intervention enhanced speech sound production in students with ID and communication difficulties. A sample of 121 students with ID was assigned to four groups: phonics-based intervention, comprehension-based intervention, a combination of both, and a teaching-as-usual comparison group. Speech sound production was assessed before and after the intervention. Results on the data without imputation suggested a significant positive effect of systematic, digital literacy intervention on speech sound production. However, results from sensitivity analyses with imputed missing data were more ambiguous, with the effect only approaching significance (ps = .05-.07) for one of the interventions. Nonetheless, we tentatively suggest that systematic, digital literacy intervention could support speech development in students with ID and communication difficulties. Future research should confirm and further elucidate the functional mechanisms of this link, so that we can better understand and improve instruction in the pivotal abilities of speech and reading.

4.
Augment Altern Commun ; : 1-9, 2024 Jul 12.
Article in English | MEDLINE | ID: mdl-38995208

ABSTRACT

This qualitative study aimed to describe speech-language pathologists' (SLPs') perspectives on augmentative and alternative communication (AAC) use for people with post-stroke aphasia focusing on: (a) current AAC practice, (b) factors that influence the use of AAC, and (c) the success and relevance of AAC interventions. Semi-structured interviews took place with ten South African SLPs with experience in aphasia intervention. The transcribed interviews were thematically analyzed using a six-phase process of inductive and deductive analysis within a phenomenological framework. All the participants use AAC with their clients, employing a variety of approaches that reflect their diverse settings, experiences, and perspectives on AAC. AAC use is complex, and SLPs make conscious choices considering multiple factors. Barriers to use were often associated with limited resources in the low- and middle-income country (LMIC) context, but most participants retained a positive view of AAC, actively working to circumvent barriers to use. Participants consistently emphasized the vital role of partners in communication interactions, linked to the importance of defining AAC broadly. It is necessary to advance the integration of AAC into rehabilitation plans to improve communication and social participation outcomes for people with post-stroke aphasia, especially in LMICs such as South Africa.

5.
Front Psychol ; 15: 1440913, 2024.
Article in English | MEDLINE | ID: mdl-39021652

ABSTRACT

[This corrects the article DOI: 10.3389/fpsyg.2023.1176743.].

6.
J Voice ; 2024 Jul 06.
Article in English | MEDLINE | ID: mdl-38972775

ABSTRACT

OBJECTIVE: The prototype "Oldenburger Logopädie App" (OLA) was designed to support voice therapy for patients with recurrent paresis, for example to accompany home exercises or to serve as a short-term substitute for regular therapy when sessions are cancelled, as during the COVID-19 pandemic. The treating speech and language pathologists (SLPs) unlock videos individually tailored to each patient, in which the SLPs demonstrate the individual exercises. The app can be used without information technology knowledge or detailed instructions. MATERIALS AND METHODS: The prototype's usability was evaluated from the perspective of patients and SLPs through a usability test battery (AttrakDiff questionnaire, System Usability Scale, Visual Aesthetics of Websites Inventory questionnaire) and informal interviews. RESULTS: The acceptance, usability, user experience, self-descriptiveness, and user behavior of OLA were consistently confirmed and mostly rated as positive. Both user groups rated OLA as practical and easy to use (e.g., System Usability Scale: "practical" (agree: mean 49.5%), "cumbersome to use" (strongly disagree: mean 60.0%)). However, the monotonous layout of the app and of the instructional and exercise videos should be modified in the next development step. An overview of relevant criteria for a voice therapy app, regarding design and functions, was derived from the results. CONCLUSION: This user-oriented feedback on the usability of the voice app provides a proof of concept and the basis for the further development of LAOLA, an innovative AI-based follow-up app. In the future, such an app should be able to support the treatment of all voice disorders. For the further development of the voice app, the therapeutic content and the effectiveness of the training should also be investigated.
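For context, the System Usability Scale used above is scored with a fixed, well-known formula; a small sketch follows (the example responses are invented):

```python
# Standard SUS scoring: 10 items rated 1-5 (1 = strongly disagree, 5 = strongly agree).
# Odd-numbered items are positively worded, even-numbered items negatively worded.
def sus_score(responses):
    """responses: list of 10 ints in 1..5, in questionnaire order."""
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("SUS expects ten responses, each between 1 and 5")
    odd = sum(r - 1 for r in responses[0::2])   # items 1, 3, 5, 7, 9
    even = sum(5 - r for r in responses[1::2])  # items 2, 4, 6, 8, 10
    return (odd + even) * 2.5                   # scale to 0-100

# Illustrative respondent; scores above ~68 are conventionally read as above average.
print(sus_score([4, 2, 5, 1, 4, 2, 5, 1, 4, 2]))  # -> 85.0
```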

7.
Sci Rep ; 14(1): 15611, 2024 Jul 06.
Article in English | MEDLINE | ID: mdl-38971806

ABSTRACT

This study compares how English-speaking adults and children from the United States adapt their speech when talking to a real person versus a smart speaker (Amazon Alexa) in a psycholinguistic experiment. Overall, participants produced more effortful speech when talking to the device (longer duration and higher pitch). These differences also varied by age: children produced even higher pitch in device-directed speech, suggesting a stronger expectation of being misunderstood by the system. In support of this, after a staged recognition error by the device, children increased their pitch even more. Furthermore, adults and children displayed the same degree of variation in their responses as to whether "Alexa seems like a real person or not," indicating that children's register adjustments were shaped by their conceptualization of the system's competence rather than by an increased anthropomorphism response. This work speaks to models of the mechanisms underlying speech production and to human-computer interaction frameworks, providing support for routinized theories of spoken interaction with technology.


Subject(s)
Speech , Humans , Adult , Child , Male , Female , Speech/physiology , Young Adult , Adolescent , Psycholinguistics
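The register measures compared above, utterance duration and pitch, can be approximated with standard audio tooling; here is a sketch using librosa (the file names are hypothetical, and the paper does not state which analysis software it used):

```python
# Sketch: duration and mean fundamental frequency (F0) of an utterance,
# the two register measures compared across human- and device-directed speech.
import librosa
import numpy as np

def duration_and_mean_f0(wav_path):
    y, sr = librosa.load(wav_path, sr=None)
    duration = len(y) / sr                       # seconds
    f0, voiced_flag, _ = librosa.pyin(           # probabilistic YIN pitch tracker
        y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C7"), sr=sr
    )
    mean_f0 = float(np.nanmean(f0[voiced_flag])) if voiced_flag.any() else float("nan")
    return duration, mean_f0

# Hypothetical comparison of the same prompt spoken to a person vs. the device:
print(duration_and_mean_f0("device_directed.wav"))
print(duration_and_mean_f0("human_directed.wav"))
```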
8.
Front Hum Neurosci ; 18: 1420334, 2024.
Article in English | MEDLINE | ID: mdl-39006157

ABSTRACT

AI-driven brain-computer interfaces (BCIs) aimed at restoring speech for individuals living with locked-in syndrome carry ethical implications for users' autonomy, privacy, and responsibility. Embedding options for sufficient levels of user control in speech-BCI design has been proposed to mitigate these ethical challenges. However, how user control in speech-BCIs is conceptualized, and how it relates to these ethical challenges, remains underdetermined. In this narrative literature review, we aim to clarify and explicate the notion of user control in speech-BCIs, to better understand in what way user control could operationalize users' autonomy, privacy, and responsibility, and to explore how suggestions for increasing user control can be translated into recommendations for the design or use of speech-BCIs. First, we identified types of user control, including executory control, which can protect the voluntariness of speech, and guidance control, which can contribute to semantic accuracy. Second, we identified potential causes of a loss of user control, including contributions of predictive language models, a lack of ability for neural control, and signal interference or external control. Such a loss of user control may have implications for semantic accuracy and mental privacy. Third, we explored ways to design for user control. While embedding initiation signals for users may increase executory control, such signals may conflict with other aims such as speed and continuity of speech. Design mechanisms for guidance control remain largely conceptual, and similar trade-offs in design are to be expected. We argue that, prior to navigating these trade-offs, the overarching aim of speech-BCIs needs to be defined, requiring input from current and potential users. Additionally, conceptual clarification of user control and other (ethical) concepts in this debate has practical relevance for BCI researchers. For instance, different concepts of inner speech may have distinct ethical implications. Increased clarity of such concepts can improve anticipation of the ethical implications of speech-BCIs and may help to steer design decisions.

9.
Cureus ; 16(6): e62290, 2024 Jun.
Article in English | MEDLINE | ID: mdl-39006574

ABSTRACT

INTRODUCTION: Speech has had a great impact on human evolution, enabling the widespread sharing of knowledge and the advancement of tools. Difficulty in pronouncing one or more sounds is the most common speech impairment. Speech defects are most commonly associated with class III malocclusion (difficulty pronouncing 's' and 't' sounds), followed by class II malocclusion (difficulty pronouncing 's' and 'z' sounds), while speech distortions are least common in class I malocclusion (difficulty pronouncing 's' and 'sh'). Most patients with dentofacial disharmonies and speech distortions need orthodontic care and orthognathic surgery to resolve their issues with mastication, aesthetics, and speech. AIMS AND OBJECTIVES: To compare and assess speech difficulties across the different types of malocclusion. MATERIALS AND METHODS: The study was conducted on 160 subjects over three and a half months. All subjects were evaluated for speech defects before receiving orthodontic treatment. The study is based on Angle's classification of malocclusion, according to which subjects were segregated into Angle's class I, class II division I, class II division II, and class III. RESULTS: Of the 160 subjects, labio-dental speech defects were observed in 8% (n = 13), linguodental defects in 2% (n = 3), lingua-alveolar defects in 54% (n = 86), and bilabial defects in 2% (n = 3). Severe speech defects were seen in Angle's class III malocclusion. By type of malocclusion, labio-dental speech defects were distributed as 37.5% in class I, 25% in class II division I, 0% in class II division II, and 37.5% in class III; linguodental defects were seen in class III subjects only; lingua-alveolar defects were distributed as 27.8% in class I, 29.6% in class II division I, 1.9% in class II division II, and 40.7% in class III; and bilabial defects were seen only in class II division I subjects. Only the lingua-alveolar speech defects were statistically significant, and the most severe speech defects were observed in class III malocclusion. CONCLUSION: Speech plays an important role in quality of life, and different malocclusion traits are associated with different types of speech defects.
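The significance claim for lingua-alveolar defects implies a test over a malocclusion-class-by-defect frequency table; here is a sketch of how such a test is typically run. The counts below are invented placeholders, not the study's data:

```python
# Sketch: chi-square test of independence between malocclusion class and
# presence of a given speech defect. Counts are HYPOTHETICAL placeholders.
from scipy.stats import chi2_contingency
import numpy as np

# Rows: class I, class II div I, class II div II, class III
# Columns: defect present, defect absent
observed = np.array([
    [24, 26],
    [25, 15],
    [2, 18],
    [35, 15],
])
chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.4f}")
```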

10.
Sensors (Basel) ; 24(13)2024 Jun 25.
Article in English | MEDLINE | ID: mdl-39000889

ABSTRACT

Emotions in speech are expressed in various ways, and a speech emotion recognition (SER) model may perform poorly on unseen corpora whose emotional factors differ from those expressed in the training databases. To construct SER models robust to unseen corpora, regularization approaches and metric losses have been studied. In this paper, we propose an SER method that incorporates the relative difficulty and labeling reliability of each training sample. Inspired by the Proxy-Anchor loss, we propose a novel loss function that gives higher gradients to those samples in a minibatch whose emotion labels are more difficult to estimate. Since annotators may base a label on emotional expression that resides in the conversational context or in another modality but is not apparent in the speech utterance itself, some emotion labels may be unreliable, and such unreliable labels affect the proposed loss function particularly severely. We therefore propose to apply label smoothing to the samples misclassified by a pre-trained SER model. Experimental results showed that SER performance on unseen corpora improved when the proposed loss function was adopted together with label smoothing on the misclassified data.


Subject(s)
Emotions , Speech , Humans , Emotions/physiology , Speech/physiology , Algorithms , Reproducibility of Results , Pattern Recognition, Automated/methods , Databases, Factual
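The abstract above names two ingredients, within-minibatch difficulty weighting and label smoothing for unreliably labeled samples, without giving the exact formulation; the following hedged PyTorch sketch shows one way those two ideas can be combined. It is an illustration, not the authors' loss:

```python
# Hedged sketch of the two ingredients named in the abstract:
# (1) up-weighting hard samples within the minibatch (Proxy-Anchor-style
#     emphasis, here via a softmax over per-sample difficulty), and
# (2) label smoothing applied only to samples a pre-trained model misclassified.
import torch
import torch.nn.functional as F

def difficulty_weighted_loss(logits, targets, misclassified_mask,
                             alpha=2.0, smoothing=0.1):
    num_classes = logits.size(1)
    # Per-sample difficulty: unreduced cross-entropy.
    per_sample_ce = F.cross_entropy(logits, targets, reduction="none")
    # Harder samples (higher CE) receive larger weights, hence larger gradients.
    weights = torch.softmax(alpha * per_sample_ce.detach(), dim=0) * logits.size(0)

    # Smoothed one-hot targets only for samples flagged as unreliable.
    one_hot = F.one_hot(targets, num_classes).float()
    smooth = one_hot * (1 - smoothing) + smoothing / num_classes
    soft_targets = torch.where(misclassified_mask.unsqueeze(1), smooth, one_hot)

    log_probs = F.log_softmax(logits, dim=1)
    per_sample = -(soft_targets * log_probs).sum(dim=1)
    return (weights * per_sample).mean()

# Illustrative call on a random minibatch:
logits = torch.randn(8, 4, requires_grad=True)
targets = torch.randint(0, 4, (8,))
mask = torch.rand(8) < 0.25   # stand-in for "misclassified by pre-trained SER"
loss = difficulty_weighted_loss(logits, targets, mask)
loss.backward()
print(float(loss))
```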
11.
Sensors (Basel) ; 24(13)2024 Jun 25.
Article in English | MEDLINE | ID: mdl-39000904

ABSTRACT

This study aims to demonstrate the feasibility of a new wireless electroencephalography (EEG)-electromyography (EMG) wearable approach that generates characteristic mixed EEG-EMG patterns during mouth movements, in order to detect distinct movement patterns for people with severe speech impairments. The paper describes a method for detecting mouth movement based on a new signal processing approach suitable for sensor integration and machine learning applications, and examines the relationship between mouth motion and brainwaves in an effort to develop nonverbal interfaces for people who have lost the ability to communicate, such as people with paralysis. A set of experiments was conducted to assess the efficacy of the proposed method for feature selection, and the classification of mouth movements was found to be meaningful. EEG-EMG signals were also collected during silent mouthing of phonemes, and a few-shot neural network trained to classify the phonemes from these signals yielded a classification accuracy of 95%. This approach to collecting and processing bioelectrical signals for phoneme recognition is a promising avenue for future communication aids.


Subject(s)
Electroencephalography , Electromyography , Signal Processing, Computer-Assisted , Wireless Technology , Humans , Electroencephalography/methods , Electroencephalography/instrumentation , Electromyography/methods , Electromyography/instrumentation , Wireless Technology/instrumentation , Mouth/physiopathology , Mouth/physiology , Adult , Male , Movement/physiology , Neural Networks, Computer , Speech Disorders/diagnosis , Speech Disorders/physiopathology , Female , Wearable Electronic Devices , Machine Learning
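The abstract does not specify the few-shot architecture; below is a minimal sketch of one common few-shot scheme, prototypical (nearest-class-mean) classification, applied to synthetic stand-ins for preprocessed EEG-EMG feature vectors:

```python
# Hedged sketch: prototypical few-shot classification of phonemes from
# EEG-EMG feature vectors. The paper's architecture is not given, so this
# uses the simplest prototypical scheme on synthetic features.
import numpy as np

rng = np.random.default_rng(0)
n_classes, n_support, dim = 5, 3, 32   # 5 phonemes, 3 labeled examples each

# Support set: class-dependent offsets stand in for real feature structure.
support = rng.normal(size=(n_classes, n_support, dim)) + np.arange(n_classes)[:, None, None]
true_labels = rng.integers(0, n_classes, 20)
query = rng.normal(size=(20, dim)) + true_labels[:, None]

prototypes = support.mean(axis=1)                    # one prototype per phoneme
dists = np.linalg.norm(query[:, None, :] - prototypes[None, :, :], axis=2)
predictions = dists.argmin(axis=1)                   # nearest prototype wins
print("accuracy:", (predictions == true_labels).mean())
```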
12.
Sensors (Basel) ; 24(13)2024 Jun 28.
Article in English | MEDLINE | ID: mdl-39000991

ABSTRACT

In today's digital landscape, organizations face significant challenges, including sensitive data leaks and the proliferation of hate speech, both of which can lead to severe consequences such as financial losses, reputational damage, and psychological impacts on employees. This work presents a comprehensive solution, built on a microservices architecture, for effectively monitoring computer usage within organizations. The approach incorporates spyware-style techniques to capture data from employee computers and a web application for alert management; the system detects data leaks, suspicious behavior, and hate speech through efficient data capture and predictive modeling. On top of this system, the paper presents a comparative performance analysis between Spring Boot and Quarkus, focusing on objective metrics and quantitative statistics. Using tools and benchmarks recognized in the computer science community, the study provides an in-depth understanding of the performance differences between the two platforms. Implementing the system on Quarkus rather than Spring Boot yielded substantial improvements: memory usage was reduced by up to 80%, CPU usage by 95%, and startup time decreased by 119%. This solution offers a robust framework for enhancing organizational security and mitigating potential threats through proactive monitoring and predictive analysis, while also guiding developers and software architects in making informed technological choices.


Subject(s)
Software , Humans , Computer Security , Computers
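The memory and CPU figures above imply repeated sampling of process-level metrics during the benchmark; here is a sketch of one way such samples can be collected with Python's psutil. The abstract does not name the measurement tooling, and the PID below is a placeholder:

```python
# Hedged sketch: sampling memory and CPU usage of a running service process,
# the kind of measurement behind a Spring Boot vs. Quarkus comparison.
import psutil

def sample_process(pid, seconds=30, interval=1.0):
    proc = psutil.Process(pid)
    samples = []
    for _ in range(int(seconds / interval)):
        samples.append({
            "rss_mb": proc.memory_info().rss / 1024 ** 2,    # resident memory
            "cpu_pct": proc.cpu_percent(interval=interval),  # % over the interval
        })
    return samples

# Usage (hypothetical PID of the service under test):
# stats = sample_process(12345)
# print(max(s["rss_mb"] for s in stats))
```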
13.
Sensors (Basel) ; 24(13)2024 Jul 04.
Article in English | MEDLINE | ID: mdl-39001130

ABSTRACT

In recent years, embedded system technologies and products for sensor networks and wearable devices that monitor people's activities and health have become a focus of the global IT industry. To enhance the speech recognition capabilities of wearable devices, this article discusses the implementation of audio positioning and enhancement in embedded systems, using embedded algorithms for direction detection and mixed-source separation. The two algorithms were implemented on different embedded systems: direction detection on a TI TMS320C6713 DSK and mixed-source separation on a Raspberry Pi 2. For mixed-source separation, the average signal-to-interference ratio (SIR) in the first experiment was 16.72 at a 1 m distance and 15.76 at 2 m. In the second experiment, evaluated via speech recognition, the algorithm improved recognition accuracy to 95%.


Subject(s)
Algorithms , Wearable Electronic Devices , Humans , Signal Processing, Computer-Assisted , Sound Localization
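For reference, the SIR metric reported above has a standard textbook definition: the energy ratio, in dB, between the target component and the interference remaining in a separated signal. A minimal sketch follows (the synthetic signals are illustrative; the paper's evaluation code is not specified):

```python
# Hedged sketch: signal-to-interference ratio (SIR) in dB for a separated
# source, given the clean target and the residual interference.
import numpy as np

def sir_db(target, interference):
    return 10 * np.log10(np.sum(target ** 2) / np.sum(interference ** 2))

# Illustrative check with synthetic signals:
rng = np.random.default_rng(1)
clean = rng.normal(size=16000)             # 1 s of "target" at 16 kHz
residual = 0.15 * rng.normal(size=16000)   # leakage from the other source
print(f"SIR = {sir_db(clean, residual):.1f} dB")   # ~16 dB, near the reported values
```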
14.
Psychiatry Res ; 339: 116078, 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-39003802

ABSTRACT

STUDY OBJECTIVES: Loneliness impacts the health of many older adults, yet effective and targeted interventions are lacking. Compared to surveys, speech data can capture the personalized experience of loneliness. In this proof-of-concept study, we used natural language processing to extract novel linguistic features and AI approaches to identify linguistic features that distinguish lonely adults from non-lonely adults. METHODS: Participants completed the UCLA Loneliness Scale and semi-structured interviews (sections: social relationships, loneliness, successful aging, meaning/purpose in life, wisdom, technology and successful aging). We used the Linguistic Inquiry and Word Count (LIWC-22) program to analyze linguistic features and built a classifier to predict loneliness. Each interview section was analyzed using an explainable AI (XAI) model to classify loneliness. RESULTS: The sample included 97 older adults (age 66-101 years, 65% women). The model performed well (accuracy 0.889, AUC 0.8, F1 0.8, recall 1.0). The sections on social relationships and loneliness were the most important for classifying loneliness, and social themes, conversational fillers, and pronoun usage were important features. CONCLUSIONS: XAI approaches can be used to detect loneliness through the analysis of unstructured speech and to better understand the experience of loneliness.
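The abstract does not specify the classifier or XAI method; the sketch below shows the general recipe, a classifier over LIWC-style features inspected with permutation importance as a stand-in XAI technique, on synthetic data. All feature names are invented:

```python
# Hedged sketch: classifying loneliness from LIWC-style linguistic features
# and inspecting which features drive the prediction. Permutation importance
# stands in for the paper's unspecified XAI method; data are synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
feature_names = ["social_words", "fillers", "i_pronouns", "we_pronouns", "negations"]
X = rng.normal(size=(97, len(feature_names)))          # 97 participants, as in the study
y = (X[:, 0] - 0.8 * X[:, 2] + rng.normal(scale=0.5, size=97) < 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("accuracy:", clf.score(X_te, y_te))

imp = permutation_importance(clf, X_te, y_te, n_repeats=20, random_state=0)
for name, score in sorted(zip(feature_names, imp.importances_mean), key=lambda t: -t[1]):
    print(f"{name}: {score:.3f}")
```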

15.
Hear Res ; 451: 109081, 2024 Jul 11.
Article in English | MEDLINE | ID: mdl-39004015

ABSTRACT

Speech-in-noise (SIN) perception is a fundamental ability that declines with aging, as does general cognition. We assess whether auditory cognitive ability, in particular short-term memory for sound features, contributes to both. We examined how auditory memory for fundamental sound features, the carrier frequency and amplitude modulation rate of modulated white noise, contributes to SIN perception. We assessed SIN in 153 healthy participants with varying degrees of hearing loss, using measures that require single-digit perception (Digits-in-Noise, DIN) and sentence perception (Speech-in-Babble, SIB). Independent variables were auditory memory and a range of other factors, including the pure tone audiogram (PTA), a measure of dichotic pitch-in-noise perception (Huggins pitch), and demographic variables such as age and sex. Multiple linear regression models were compared using Bayesian model comparison. The best predictor model for DIN included PTA and Huggins pitch (r2 = 0.32, p < 0.001), whereas the model for SIB additionally included auditory memory for sound features (r2 = 0.24, p < 0.001). Further analysis demonstrated that auditory memory also explained a significant portion of the variance (28%) in scores on a cognitive screening test for dementia. Auditory memory for non-speech sounds may therefore be an important predictor of both SIN perception and cognitive ability.
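The abstract does not state the software behind the Bayesian model comparison; here is a rough Python sketch of the same idea using BIC, a standard approximation to Bayesian model evidence, over nested linear models. The data are synthetic and the column names invented:

```python
# Hedged sketch: comparing linear models of speech-in-noise scores by BIC
# as a proxy for Bayesian model comparison. Data below are synthetic.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 153                                  # sample size matching the study
df = pd.DataFrame({
    "pta": rng.normal(size=n),
    "huggins": rng.normal(size=n),
    "aud_memory": rng.normal(size=n),
    "age": rng.normal(size=n),
})
df["sib"] = 0.4 * df.pta + 0.3 * df.huggins + 0.3 * df.aud_memory \
            + rng.normal(scale=0.8, size=n)

candidates = [
    "sib ~ pta",
    "sib ~ pta + huggins",
    "sib ~ pta + huggins + aud_memory",
    "sib ~ pta + huggins + aud_memory + age",
]
for formula in candidates:               # lower BIC = better-supported model
    fit = smf.ols(formula, data=df).fit()
    print(f"{formula:45s} BIC = {fit.bic:8.1f}  r2 = {fit.rsquared:.2f}")
```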

16.
Schizophr Res ; 270: 486-493, 2024 Jul 12.
Article in English | MEDLINE | ID: mdl-39002286

ABSTRACT

BACKGROUND: Formal thought disorder (FTD) is a recognised psychiatric symptom, yet its characterisation remains debated. This is problematic because it contributes to poor efficiency and heterogeneity in psychiatric research, with salient clinical impact. OBJECTIVE: This study aimed to investigate expert opinion on the concept, measurement, and clinical utility of FTD using the Delphi technique. METHOD: Across three rounds, experts were queried on their definitions of FTD, methods for its assessment and measurement, associated clinical outcomes, and treatment options. RESULTS: Responses were obtained from 56 experts, demonstrating varying levels of consensus across different aspects of FTD. While consensus (>80%) was reached on some aspects of the concept of FTD, including its definition and associated symptomology and mechanisms, others remained less clear. The universal importance attributed to the clinical understanding, measurement, and treatment of FTD was evident, although consensus on the reasons for, and methods of, doing so was infrequent. CONCLUSIONS: Our results contribute to the still elusive formal definition of FTD. The multitude of interpretations of these topics highlights the need for further clarity regarding this phenomenon. Our findings emphasise that the measurement and clinical utility of FTD are closely tied to the concept; hence, until there is agreement on the concept of FTD, difficulties with measuring it and understanding its clinical usefulness to inform treatment interventions will persist. Future FTD research should focus on clarifying the factor structure and dimensionality of FTD to determine its latent structure and elucidate the core clinical phenotype.

17.
Comput Biol Med ; 179: 108841, 2024 Jul 12.
Article in English | MEDLINE | ID: mdl-39002317

ABSTRACT

Speech emotion recognition (SER) is a prominent and dynamic research field in data science, owing to its extensive application in domains such as psychological assessment, mobile services, and computer games. Much previous research has utilized manually engineered features for emotion classification, with commendable accuracy. However, such features tend to underperform in complex scenarios, leading to reduced classification accuracy. These scenarios include: (1) datasets containing diverse speech patterns, dialects, accents, or variations in emotional expression; (2) data with background noise; (3) settings where the distribution of emotions varies significantly across datasets; and (4) combinations of datasets from different sources, which introduce complexities due to variations in recording conditions, data quality, and emotional expression. Consequently, the classification performance of SER techniques needs to improve. To address this, a novel SER framework is introduced in this study. Prior to feature extraction, signal preprocessing and data augmentation methods were applied to augment the available data, and 18 informative features were then derived from each signal. A discriminative feature set was obtained using feature selection techniques and used as input for emotion recognition on the SAVEE, RAVDESS, and EMO-DB datasets. Furthermore, this research also implemented a cross-corpus model incorporating all speech files for the emotions common to the three datasets. The experimental outcomes demonstrate the superior performance of the proposed SER framework compared to existing frameworks in the field: the proposed model obtained accuracies of 95%, 94%, 97%, and 97% on the SAVEE, RAVDESS, EMO-DB, and cross-corpus datasets, respectively. These results underscore the framework's contribution to the field of SER.
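The abstract names the pipeline stages, augmentation, extraction of 18 features per signal, and feature selection, without enumerating them; below is a hedged sketch of such a pipeline, with a few standard librosa descriptors and illustrative augmentations standing in for the paper's actual choices:

```python
# Hedged sketch of the pipeline stages named in the abstract: signal
# augmentation, per-signal feature extraction, and feature selection.
# The paper's 18 features are not enumerated; these stand-ins are illustrative.
import numpy as np
import librosa
from sklearn.feature_selection import SelectKBest, f_classif

def augment(y, sr, rng):
    noisy = y + 0.005 * rng.normal(size=len(y))              # additive noise
    shifted = librosa.effects.pitch_shift(y, sr=sr, n_steps=2)
    return [y, noisy, shifted]

def extract_features(y, sr):
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).mean(axis=1)
    zcr = librosa.feature.zero_crossing_rate(y).mean()
    centroid = librosa.feature.spectral_centroid(y=y, sr=sr).mean()
    rms = librosa.feature.rms(y=y).mean()
    return np.concatenate([mfcc, [zcr, centroid, rms]])

rng = np.random.default_rng(0)
# Synthetic stand-ins for labeled emotion clips (1 s at 16 kHz):
X, labels = [], []
for emotion in range(4):
    for _ in range(5):
        clip = rng.normal(size=16000).astype(np.float32)
        for aug in augment(clip, 16000, rng):
            X.append(extract_features(aug, 16000))
            labels.append(emotion)
X = np.array(X)
X_selected = SelectKBest(f_classif, k=10).fit_transform(X, labels)
print(X_selected.shape)   # (clips x 3 augmentations, 10 selected features)
```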

18.
Infant Behav Dev ; 76: 101977, 2024 Jul 12.
Article in English | MEDLINE | ID: mdl-39002494

ABSTRACT

Language development during the first year of life is characterized by perceptual attunement: following an initial stage of language-general perception, infants show a decline in the perception of non-native phonemes and a parallel increase in, or maintenance of, the perception of native phonemes. While this general pattern is well established, there are still many gaps in the literature. First, most evidence documenting these patterns comes from "Minority World" countries, with only a limited number of studies from "Majority World" countries, limiting the range of languages and contrasts assessed. Second, few studies test the developmental patterns of both native and non-native speech perception in the same group of infants, making it hard to draw conclusions about a simultaneous decline in non-native and increase in native speech perception. These limitations are partly due to the effort involved in testing developing speech sound perception, where usually only one contrast can be tested per infant. The present study therefore set out to test the feasibility of assessing a given infant's discrimination of two speech sound contrasts during the same lab visit. It leveraged the documented improvement of native and decline of non-native phoneme discrimination abilities in Japanese, assessing native and non-native speech perception in Japanese infants from 6 to 13 months of age. In total, 76% of infants contributed discrimination data for both contrasts. We found a decline in non-native speech perception, evident in discrimination of the non-native /ɹ/-/l/ consonant contrast at 9-11 months but not at 11-13 months of age. A parallel increase in native speech perception was also demonstrated, evident in the absence of native phonemic vowel-length discrimination at 6-7 and 9-11 months and in discrimination of this contrast at 11-13 months of age. These results, based on a simultaneous assessment of native and non-native speech perception in Japanese-learning infants, demonstrate the feasibility of assessing the discrimination of two contrasts in one testing session and corroborate the two hallmarks of perceptual attunement proposed in the literature: a decrease in non-native and a facilitation of native speech perception during the first year of life.

19.
medRxiv ; 2024 Jun 27.
Article in English | MEDLINE | ID: mdl-38978682

ABSTRACT

Amyotrophic lateral sclerosis (ALS) is a progressive neurodegenerative disease that severely impacts affected persons' speech and motor functions, yet early detection and tracking of disease progression remain challenging. The current gold standard for monitoring ALS progression, the ALS Functional Rating Scale-Revised (ALSFRS-R), is based on subjective ratings of symptom severity and may not capture subtle but clinically meaningful changes due to its lack of granularity. Multimodal speech measures, which can be collected from patients automatically and remotely, can bridge this gap because they are continuous-valued and therefore potentially more granular in capturing disease progression. Here we investigate the responsiveness and sensitivity of multimodal speech measures in persons with ALS (pALS), collected via a remote patient monitoring platform, in an effort to quantify how long it takes to detect a clinically meaningful change associated with disease progression. We recorded audio and video from 278 participants and automatically extracted multimodal speech biomarkers (acoustic, orofacial, linguistic) from the data. We find that the timing alignment of pALS speech relative to a canonical elicitation of the same prompt, and the number of words used to describe a picture, are the measures most responsive to such change in pALS with both bulbar (n = 36) and non-bulbar onset (n = 107). Interestingly, the responsiveness of these measures is stable even at small sample sizes. We further found that certain speech measures are sensitive enough to track bulbar decline even when there is no patient-reported clinical change, i.e., when the ALSFRS-R speech score remains unchanged at 3 out of a total possible score of 4. The findings of this study have the potential to facilitate improved, accelerated, and cost-effective clinical trials and care.
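The two most responsive measures named above can be sketched with standard tools: timing alignment of a patient's reading relative to a canonical elicitation (here via dynamic time warping over MFCCs) and the word count of a picture description. File names are hypothetical, and this is not the platform's actual pipeline:

```python
# Hedged sketch: DTW-based timing alignment against a canonical elicitation,
# plus a picture-description word count.
import librosa

def timing_alignment_cost(patient_wav, canonical_wav):
    y1, sr1 = librosa.load(patient_wav, sr=16000)
    y2, sr2 = librosa.load(canonical_wav, sr=16000)
    m1 = librosa.feature.mfcc(y=y1, sr=sr1, n_mfcc=13)
    m2 = librosa.feature.mfcc(y=y2, sr=sr2, n_mfcc=13)
    D, wp = librosa.sequence.dtw(X=m1, Y=m2, metric="cosine")
    return D[-1, -1] / len(wp)       # path-length-normalized alignment cost

def word_count(transcript):
    return len(transcript.split())

# Illustrative usage (hypothetical recordings):
# print(timing_alignment_cost("pals_prompt.wav", "canonical_prompt.wav"))
print(word_count("a boy reaching for a cookie jar while the sink overflows"))
```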

20.
Front Psychol ; 15: 1322665, 2024.
Article in English | MEDLINE | ID: mdl-38988379

ABSTRACT

Young children's language and social development is influenced by the linguistic environment of their classrooms, including their interactions with teachers and peers. Measurement of the classroom linguistic environment typically relies on observational methods, which often provide limited 'snapshots' of children's interactions from which broad generalizations are made. Recent technological advances, including artificial intelligence, provide opportunities to capture children's interactions in continuous recordings spanning much longer durations. The goal of the present study was to evaluate the accuracy of the Interaction Detection in Early Childhood Settings (IDEAS) system on 13 automated indices of language output, using recordings collected from 19 children and three teachers over two weeks in an urban preschool classroom. The accuracy of language outputs processed via IDEAS was compared to ground truth via linear correlations and median absolute relative error. Findings indicate high correlations between IDEAS and ground-truth data on measures of teacher and child speech, and relatively low error rates on the majority of IDEAS language output measures. These findings indicate that IDEAS may provide a useful measurement tool for advancing knowledge about children's classroom experiences and their role in shaping development.
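The two agreement metrics named above have simple standard forms; here is a minimal sketch with invented per-child values (not the study's data):

```python
# Hedged sketch: Pearson correlation and median absolute relative error
# between automated (IDEAS-style) outputs and ground-truth annotations.
import numpy as np
from scipy.stats import pearsonr

# Hypothetical per-child word counts: human-annotated vs. automated.
ground_truth = np.array([120, 85, 210, 60, 150, 95, 180], dtype=float)
automated = np.array([112, 90, 198, 66, 158, 91, 171], dtype=float)

r, p = pearsonr(ground_truth, automated)
mare = np.median(np.abs(automated - ground_truth) / ground_truth)
print(f"Pearson r = {r:.2f} (p = {p:.3f}), median abs. relative error = {mare:.1%}")
```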
