Results 1 - 6 of 6
1.
PLOS Digit Health ; 3(4): e0000474, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38620047

ABSTRACT

Despite significant technical advances in machine learning (ML) over the past several years, the tangible impact of this technology in healthcare has been limited. This is due not only to the particular complexities of healthcare, but also to structural issues in the machine learning for healthcare (MLHC) community, which broadly rewards technical novelty over tangible, equitable impact. We structure our work as a healthcare-focused echo of the 2012 paper "Machine Learning that Matters", which highlighted such structural issues in the ML community at large and offered a series of clearly defined "Impact Challenges" to which the field should orient itself. Drawing on the expertise of a diverse and international group of authors, we engage in a narrative review and examine issues in the research background environment, training processes, evaluation metrics, and deployment protocols which act to limit the real-world applicability of MLHC. Broadly, we seek to distinguish between machine learning ON healthcare data and machine learning FOR healthcare: the former sees healthcare as merely a source of interesting technical challenges, while the latter regards ML as a tool in service of meeting tangible clinical needs. We offer specific recommendations for a series of stakeholders in the field, from ML researchers and clinicians to the institutions in which they work and the governments which regulate their data access.

2.
Psychiatr Res Clin Pract ; 5(3): 84-92, 2023.
Article in English | MEDLINE | ID: mdl-37711756

ABSTRACT

Objective: Measurement-based care tools in psychiatry are useful for symptom monitoring and detecting response to treatment, but methods for quick and objective measurement are lacking, especially for acute psychosis. The aim of this study was to explore potential language markers, detected by natural language processing (NLP) methods, as a means to objectively measure the severity of psychotic symptoms of schizophrenia in an acute clinical setting. Methods: Twenty-two speech samples were collected from seven participants who were hospitalized for schizophrenia, and their symptoms were evaluated over time with the SAPS/SANS and TLC scales. Linguistic features were extracted from the speech data using machine learning techniques. Spearman's correlation was used to examine the relationship between linguistic features and symptoms. Various machine learning models were evaluated by cross-validation for their ability to predict symptom severity from the linguistic markers. Results: Reduced lexical richness and syntactic complexity were characteristic of negative symptoms, while lower content density and more repetitions in speech were predictors of positive symptoms. Machine learning models predicted severity of alogia, illogicality, poverty of speech, social inattentiveness, and TLC scores with up to 82% accuracy. Additionally, speech incoherence was quantifiable through language markers derived from NLP methods. Conclusions: These preliminary findings suggest that NLP may be useful in identifying clinically relevant language markers of schizophrenia, which can enhance objectivity in symptom monitoring during hospitalization. Further work is needed to replicate these findings in a larger data set and to explore methods for feasible implementation in practice.
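The kind of marker-to-symptom analysis this abstract describes can be sketched in a few lines: compute simple transcript-level linguistic features and correlate them with symptom ratings via Spearman's rho. This is an illustrative sketch only, not the study's actual pipeline; the feature definitions, toy transcripts, and severity scores below are hypothetical.

```python
# Sketch: simple linguistic markers correlated with symptom ratings.
# Features and data are hypothetical; the study used a richer feature set.
from collections import Counter
from scipy.stats import spearmanr

def lexical_richness(transcript: str) -> float:
    """Type-token ratio: unique words / total words (a lexical-richness proxy)."""
    tokens = transcript.lower().split()
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def repetition_rate(transcript: str) -> float:
    """Fraction of tokens that repeat an already-used word."""
    tokens = transcript.lower().split()
    repeats = sum(c - 1 for c in Counter(tokens).values())
    return repeats / len(tokens) if tokens else 0.0

# Toy transcripts paired with invented symptom-severity ratings.
transcripts = [
    "the cat sat on the mat and the cat sat again and again",
    "a quick brown fox jumps over a lazy dog near the river bank",
    "yes yes yes the thing the thing is the thing you know",
    "we walked through the park discussing plans for the coming week",
]
severity = [3, 0, 4, 1]

richness = [lexical_richness(t) for t in transcripts]
rho, p = spearmanr(richness, severity)
print(f"Spearman rho between lexical richness and severity: {rho:.2f}")
```

In this toy data, lower lexical richness tracks higher severity, mirroring the direction of the reported finding for negative symptoms.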

3.
Alzheimers Dement (Amst) ; 15(2): e12445, 2023.
Article in English | MEDLINE | ID: mdl-37361261

ABSTRACT

Speech and language changes occur in Alzheimer's disease (AD), but few studies have characterized their longitudinal course. We analyzed open-ended speech samples from a prodromal-to-mild AD cohort to develop a novel composite score to characterize progressive speech changes. Participant speech from the Clinical Dementia Rating (CDR) interview was analyzed to compute metrics reflecting speech and language characteristics. We determined the aspects of speech and language that exhibited significant longitudinal change over 18 months. Nine acoustic and linguistic measures were combined to create a novel composite score. The speech composite exhibited significant correlations with primary and secondary clinical endpoints and a similar effect size for detecting longitudinal change. Our results demonstrate the feasibility of using automated speech processing to characterize longitudinal change in early AD. Speech-based composite scores could be used to monitor change and detect response to treatment in future research. HIGHLIGHTS:
- Longitudinal speech samples were analyzed to characterize speech changes in early AD.
- Acoustic and linguistic measures showed significant change over 18 months.
- A novel speech composite score was computed to characterize longitudinal change.
- The speech composite correlated with primary and secondary trial endpoints.
- Automated speech analysis could facilitate remote, high-frequency monitoring in AD.
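A composite of the kind this abstract describes is commonly built by standardizing each metric against baseline and averaging the sign-aligned z-scores. The sketch below illustrates that construction under assumed conventions; the metric names, sign directions, and all values are hypothetical, not the study's actual nine measures.

```python
# Sketch: a speech composite as the mean of sign-aligned z-scores.
# Metrics and values are hypothetical illustrations.
import numpy as np

# rows = participants; columns = metrics, e.g. words per minute,
# pause fraction, lexical richness, measured at baseline
baseline = np.array([
    [150.0, 0.10, 0.62],
    [140.0, 0.15, 0.58],
    [160.0, 0.08, 0.65],
    [145.0, 0.12, 0.60],
])

# the same participants at month 18
month18 = np.array([
    [138.0, 0.18, 0.55],
    [130.0, 0.22, 0.50],
    [150.0, 0.12, 0.60],
    [132.0, 0.20, 0.52],
])

mu = baseline.mean(axis=0)
sigma = baseline.std(axis=0, ddof=1)

# sign convention (assumed): speech rate and richness decline with
# progression, pause fraction increases, so flip its sign before averaging
signs = np.array([1.0, -1.0, 1.0])

def composite(visit: np.ndarray) -> np.ndarray:
    """Mean of sign-aligned z-scores per participant; lower = more impaired."""
    z = (visit - mu) / sigma * signs
    return z.mean(axis=1)

change = composite(month18) - composite(baseline)
print("mean 18-month change in composite:", change.mean())
```

With this convention, longitudinal decline shows up as a negative mean change in the composite.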

4.
Sci Adv ; 9(19): eabq0701, 2023 05 10.
Article in English | MEDLINE | ID: mdl-37163590

ABSTRACT

As governments and industry turn to increased use of automated decision systems, it becomes essential to consider how closely such systems can reproduce human judgment. We identify a core potential failure, finding that annotators label objects differently depending on whether they are being asked a factual question or a normative question. This challenges a natural assumption maintained in many standard machine-learning (ML) data acquisition procedures: that there is no difference between predicting the factual classification of an object and an exercise of judgment about whether an object violates a rule premised on those facts. We find that using factual labels to train models intended for normative judgments introduces a notable measurement error. We show that models trained using factual labels yield significantly different judgments than those trained using normative labels and that the impact of this effect on model performance can exceed that of other factors (e.g., dataset size) that routinely attract attention from ML researchers and practitioners.
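The core finding above, that models trained on factual labels diverge from models trained on normative labels for the same objects, can be demonstrated with a small synthetic experiment. Everything below is an invented illustration: the features, the threshold rule, and the leniency rate for normative annotators are assumptions, not the paper's data.

```python
# Sketch: same features, two label types (factual vs. normative),
# two models that end up disagreeing. All data are synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))  # object features

# factual labels: does the object exceed a threshold on feature 0?
y_fact = (X[:, 0] > 0).astype(int)

# normative labels: annotators judging "does this violate the rule?"
# are assumed more lenient, flipping most borderline positives
borderline = (X[:, 0] > 0) & (X[:, 0] < 0.5)
y_norm = y_fact.copy()
y_norm[borderline & (rng.random(500) < 0.7)] = 0

model_fact = LogisticRegression().fit(X, y_fact)
model_norm = LogisticRegression().fit(X, y_norm)

disagree = (model_fact.predict(X) != model_norm.predict(X)).mean()
print(f"fraction of objects where the two models disagree: {disagree:.2f}")
```

Even with identical features and training procedure, the label type alone produces systematic disagreement, concentrated on borderline cases, which is the measurement-error mechanism the abstract identifies.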


Subjects
Judgment, Machine Learning, Humans, Government
5.
Commun Med (Lond) ; 2(1): 149, 2022 Nov 21.
Article in English | MEDLINE | ID: mdl-36414774

ABSTRACT

BACKGROUND: Prior research has shown that artificial intelligence (AI) systems often encode biases against minority subgroups. However, little work has focused on ways to mitigate the harm discriminatory algorithms can cause in high-stakes settings such as medicine. METHODS: In this study, we experimentally evaluated the impact biased AI recommendations have on emergency decisions, where participants respond to mental health crises by calling for either medical or police assistance. We recruited 438 clinicians and 516 non-experts to participate in our web-based experiment. We evaluated participant decision-making with and without advice from biased and unbiased AI systems. We also varied the style of the AI advice, framing it either as prescriptive recommendations or descriptive flags. RESULTS: Participant decisions are unbiased without AI advice. However, both clinicians and non-experts are influenced by prescriptive recommendations from a biased algorithm, choosing police help more often in emergencies involving African-American or Muslim men. Crucially, using descriptive flags rather than prescriptive recommendations allows respondents to retain their original, unbiased decision-making. CONCLUSIONS: Our work demonstrates the practical danger of using biased models in health contexts, and suggests that appropriately framing decision support can mitigate the effects of AI bias. These findings must be carefully considered in the many real-world clinical scenarios where inaccurate or biased models may be used to inform important decisions.


Artificial intelligence (AI) systems that make decisions based on historical data are increasingly common in health care settings. However, many AI models exhibit problematic biases, as data often reflect human prejudices against minority groups. In this study, we used a web-based experiment to evaluate the impact biased models can have when used to inform human decisions. We found that though participants were not inherently biased, they were strongly influenced by advice from a biased model if it was offered prescriptively (i.e., "you should do X"). This adherence led their decisions to be biased against African-American and Muslim individuals. However, framing the same advice descriptively (i.e., without recommending a specific action) allowed participants to remain fair. These results demonstrate that though discriminatory AI can lead to poor outcomes for minority groups, appropriately framing advice can help mitigate its effects.
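The comparison this study reports, whether the rate of "call police" decisions shifts under prescriptive biased advice, is at heart a two-by-two contingency test. The sketch below shows that analysis on invented counts; the numbers are not the study's data.

```python
# Sketch: testing whether advice framing shifts decision rates.
# Counts are hypothetical, not the study's results.
from scipy.stats import chi2_contingency

# rows: condition (no advice, prescriptive biased advice)
# cols: decision (medical help, police help) for vignettes
# involving the targeted subgroup
table = [
    [180, 60],   # no advice: 25% police
    [140, 100],  # prescriptive biased advice: ~42% police
]
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.1f}, p = {p:.4f}, dof = {dof}")
```

A significant result on such a table would indicate that the advice condition changed the distribution of decisions, which is the pattern the study reports for prescriptive (but not descriptive) framing.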

6.
Front Aging Neurosci ; 13: 635945, 2021.
Article in English | MEDLINE | ID: mdl-33986655

ABSTRACT

Introduction: Research related to the automatic detection of Alzheimer's disease (AD) is important, given the high prevalence of AD and the high cost of traditional diagnostic methods. Since AD significantly affects the content and acoustics of spontaneous speech, natural language processing and machine learning provide promising techniques for reliably detecting AD. There has been a recent proliferation of classification models for AD, but these vary in the datasets used, model types, and training and testing paradigms. In this study, we compare and contrast the performance of two common approaches for automatic AD detection from speech on the same, well-matched dataset, to determine the advantages of using domain knowledge vs. pre-trained transfer models. Methods: Audio recordings and corresponding manually transcribed speech transcripts of a picture description task administered to 156 demographically matched older adults, 78 with Alzheimer's disease (AD) and 78 cognitively intact (healthy), were classified using machine learning and natural language processing as "AD" or "non-AD." The audio was acoustically enhanced and post-processed to improve the quality of the speech recordings as well as to control for variation caused by recording conditions. Two approaches were used for classification of these speech samples: (1) using domain knowledge: extracting an extensive set of clinically relevant linguistic and acoustic features derived from speech and transcripts based on prior literature, and (2) using transfer learning and leveraging large pre-trained machine learning models: using transcript representations that are automatically derived from state-of-the-art pre-trained language models, by fine-tuning Bidirectional Encoder Representations from Transformers (BERT)-based sequence classification models.
Results: We compared the utility of speech transcript representations obtained from recent natural language processing models (i.e., BERT) to more clinically interpretable language feature-based methods. Both the feature-based approaches and fine-tuned BERT models significantly outperformed the baseline linguistic model using a small set of linguistic features, demonstrating the importance of extensive linguistic information for detecting cognitive impairments relating to AD. We observed that fine-tuned BERT models numerically outperformed feature-based approaches on the AD detection task, but the difference was not statistically significant. Our main contribution is the observation that when trained on the same, demographically balanced dataset and tested on independent, unseen data, both domain knowledge and pre-trained linguistic models have good predictive performance for detecting AD based on speech. It is notable that linguistic information alone is capable of achieving comparable, and even numerically better, performance than models including both acoustic and linguistic features here. We also try to shed light on the inner workings of the more black-box natural language processing model by performing an interpretability analysis, and find that attention weights reveal interesting patterns, such as higher attribution to more important information content units in the picture description task, as well as to pauses and filler words. Conclusion: This approach supports the value of well-performing machine learning and linguistically focused processing techniques to detect AD from speech, and highlights the need to compare model performance on carefully balanced datasets, using consistent training parameters and independent test datasets, in order to determine the best performing predictive model.
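The "domain knowledge" arm described above can be illustrated with a minimal sketch: hand-crafted transcript features feeding a standard cross-validated classifier. The feature set and toy transcripts below are hypothetical stand-ins for the clinically motivated features used in the study, not its actual pipeline.

```python
# Sketch: feature-based AD detection from picture-description transcripts.
# Features and transcripts are hypothetical illustrations.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def transcript_features(text: str) -> list[float]:
    """A tiny stand-in feature set: lexical richness, filler rate, word length."""
    tokens = text.lower().split()
    n = len(tokens) or 1
    ttr = len(set(tokens)) / n                               # type-token ratio
    fillers = sum(t in {"uh", "um", "er"} for t in tokens) / n
    mean_word_len = sum(len(t) for t in tokens) / n
    return [ttr, fillers, mean_word_len]

# Toy corpus: repetitive, filler-heavy speech vs. fluent description.
ad_like = ["um the the lady is uh doing something um with the the water"] * 10
healthy = ["the woman is washing dishes while the sink overflows onto the floor"] * 10

X = np.array([transcript_features(t) for t in ad_like + healthy])
y = np.array([1] * 10 + [0] * 10)  # 1 = "AD", 0 = "non-AD"

scores = cross_val_score(LogisticRegression(), X, y, cv=5)
print(f"cross-validated accuracy: {scores.mean():.2f}")
```

The transfer-learning arm would instead fine-tune a BERT-based sequence classifier on the raw transcripts; the trade-off the abstract highlights is this sketch's interpretability (each feature is clinically meaningful) against the pre-trained model's learned representations.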
