Results 1 - 20 of 29
1.
Nat Commun ; 15(1): 1808, 2024 Feb 28.
Article in English | MEDLINE | ID: mdl-38418453

ABSTRACT

A clinical artificial intelligence (AI) system is often validated on data withheld during its development. This provides an estimate of its performance upon future deployment on data in the wild: data that are currently unseen but expected to be encountered in a clinical setting. However, estimating performance on data in the wild is complicated by distribution shift between data in the wild and withheld data and by the absence of ground-truth annotations. Here, we introduce SUDO, a framework for evaluating AI systems on data in the wild. Through experiments on AI systems developed for dermatology images, histopathology patches, and clinical notes, we show that SUDO can identify unreliable predictions, inform the selection of models, and allow for the previously out-of-reach assessment of algorithmic bias for data in the wild without ground-truth annotations. These capabilities can contribute to the deployment of trustworthy and ethical AI systems in medicine.


Subject(s)
Artificial Intelligence , Medicine
2.
JAMIA Open ; 4(3): ooab085, 2021 Jul.
Article in English | MEDLINE | ID: mdl-34604711

ABSTRACT

OBJECTIVE: We develop natural language processing (NLP) methods capable of accurately classifying tumor attributes from pathology reports given minimal labeled examples. Our hierarchical cancer-to-cancer transfer (HCTC) and zero-shot string similarity (ZSS) methods are designed to exploit shared information between cancers and auxiliary class features, respectively, to boost performance using enriched annotations, which give both location-based information and document-level labels for each pathology report. MATERIALS AND METHODS: Our data consists of 250 pathology reports each for kidney, colon, and lung cancer from 2002 to 2019 from a single institution (UCSF). For each report, we classified 5 attributes: procedure, tumor location, histology, grade, and presence of lymphovascular invasion. We develop novel NLP techniques involving transfer learning and string similarity, trained on enriched annotations. We compare the HCTC and ZSS methods to the state of the art, including conventional machine learning methods as well as deep learning methods. RESULTS: For our HCTC method, we see an improvement of up to 0.1 micro-F1 and 0.04 macro-F1 averaged across cancers and applicable attributes. For our ZSS method, we see an improvement of up to 0.26 micro-F1 and 0.23 macro-F1 averaged across cancers and applicable attributes. These comparisons are made after adjusting training data sizes to correct for the 20% increase in annotation time for enriched annotations compared to ordinary annotations. CONCLUSIONS: Methods based on transfer learning across cancers and on augmenting information extraction methods with string-similarity priors can significantly reduce the amount of labeled data needed for accurate information extraction from pathology reports.
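As a hedged illustration of the zero-shot string-similarity idea described above (score each candidate attribute value by how closely its name matches text in the report, and predict the best-scoring value), the sketch below uses Python's difflib. It is not the published ZSS implementation; the function names and the example report are invented for illustration.

    # Hypothetical sketch of zero-shot classification by string similarity; not
    # the published ZSS method, only the general idea it builds on.
    from difflib import SequenceMatcher

    def best_similarity(label: str, report_lines: list[str]) -> float:
        # Highest fuzzy-match ratio between the label string and any report line.
        return max(SequenceMatcher(None, label.lower(), line.lower()).ratio()
                   for line in report_lines)

    def classify_by_string_similarity(report: str, candidate_values: list[str]) -> str:
        # Predict the attribute value whose name best matches the report text.
        lines = [line for line in report.splitlines() if line.strip()]
        return max(candidate_values, key=lambda value: best_similarity(value, lines))

    report_text = ("FINAL DIAGNOSIS:\n"
                   "Right kidney, partial nephrectomy: clear cell renal cell carcinoma.")
    print(classify_by_string_similarity(report_text, ["clear cell", "papillary", "chromophobe"]))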

3.
J Biomed Inform ; 122: 103872, 2021 10.
Article in English | MEDLINE | ID: mdl-34411709

ABSTRACT

OBJECTIVE: We aim to build an accurate machine learning-based system for classifying tumor attributes from cancer pathology reports in the presence of a small amount of annotated data, motivated by the expensive and time-consuming nature of pathology report annotation. We use an enriched labeling scheme that includes the location of relevant information along with the final label, together with a corresponding hierarchical method for classifying reports that leverages these enriched annotations. MATERIALS AND METHODS: Our data consists of 250 colon cancer and 250 kidney cancer pathology reports from 2002 to 2019 at the University of California, San Francisco. For each report, we classify attributes such as procedure performed, tumor grade, and tumor site. For each attribute and document, an annotator trained by an oncologist labeled both the value of that attribute and the specific lines in the document that indicated the value. We develop a model that uses these enriched annotations: it first predicts the relevant lines of the document, then predicts the final value given the predicted lines. We compare our model to multiple state-of-the-art methods for classifying tumor attributes from pathology reports. RESULTS: Our results show that across colon and kidney cancers and varying training set sizes, our hierarchical method consistently outperforms state-of-the-art methods. Furthermore, performance comparable to these methods can be achieved with approximately half the amount of labeled data. CONCLUSION: Document annotations that are enriched with location information are shown to greatly increase the sample efficiency of machine learning methods for classifying attributes of pathology reports.
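The two-stage structure described above (first predict the relevant lines, then predict the attribute value from those lines) can be sketched with off-the-shelf components. The sketch below is an assumption-laden stand-in, not the published model: it uses TF-IDF features and logistic regression for both stages, and the class and method names are invented.

    # Minimal two-stage sketch: a line-relevance model followed by a value model.
    # Feature extractors and classifiers are stand-ins, not the published method.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    class TwoStageAttributeClassifier:
        def __init__(self):
            self.line_vec = TfidfVectorizer()
            self.line_model = LogisticRegression(max_iter=1000)    # is this line relevant?
            self.value_vec = TfidfVectorizer()
            self.value_model = LogisticRegression(max_iter=1000)   # attribute value

        def fit(self, reports, relevant_line_idx, labels):
            # Stage 1: learn which lines of a report carry the attribute.
            lines, line_targets = [], []
            for report, idx in zip(reports, relevant_line_idx):
                for i, line in enumerate(report.splitlines()):
                    lines.append(line)
                    line_targets.append(int(i in idx))
            self.line_model.fit(self.line_vec.fit_transform(lines), line_targets)
            # Stage 2: learn the attribute value from the annotated relevant lines.
            relevant_text = [" ".join(r.splitlines()[i] for i in idx)
                             for r, idx in zip(reports, relevant_line_idx)]
            self.value_model.fit(self.value_vec.fit_transform(relevant_text), labels)
            return self

        def predict(self, report):
            lines = report.splitlines()
            keep = self.line_model.predict(self.line_vec.transform(lines))
            selected = [l for l, k in zip(lines, keep) if k] or lines  # fall back to all lines
            return self.value_model.predict(self.value_vec.transform([" ".join(selected)]))[0]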


Subject(s)
Neoplasms , Attention , Humans , Machine Learning , Research Report
4.
JAMIA Open ; 3(3): 431-438, 2020 Oct.
Article in English | MEDLINE | ID: mdl-33381748

ABSTRACT

OBJECTIVE: Cancer is a leading cause of death, but much of the diagnostic information is stored as unstructured data in pathology reports. We aim to improve uncertainty estimates of machine learning-based pathology parsers and evaluate performance in low-data settings. MATERIALS AND METHODS: Our data comes from the Urologic Outcomes Database at UCSF, which includes 3232 annotated prostate cancer pathology reports from 2001 to 2018. We approach 17 separate information extraction tasks involving a wide range of pathologic features. To handle the diverse range of fields, we required 2 statistical models: a document classification method for pathologic features with a small set of possible values and a token extraction method for pathologic features with a large set of values. For each model, we used isotonic calibration to improve the model's estimates of its likelihood of being correct. RESULTS: Our best document classification method, a convolutional neural network, achieves a weighted F1 score of 0.97 averaged over 12 fields, and our best extraction method achieves an accuracy of 0.93 averaged over 5 fields. Performance saturates as a function of dataset size with as few as 128 data points. Furthermore, while our document classification methods have reliable uncertainty estimates, our extraction-based methods do not; after isotonic calibration, however, expected calibration error drops below 0.03 for all extraction fields. CONCLUSIONS: We find that when applying machine learning to pathology parsing, large datasets may not always be needed, and that calibration methods can improve the reliability of uncertainty estimates.
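Isotonic calibration itself is a standard technique: fit a monotone map from a model's raw confidence to the observed frequency of being correct on a held-out calibration split, then apply that map at prediction time. The sketch below illustrates this with scikit-learn on synthetic data; it is a generic example, not the study's pipeline, and the simple binning-based expected calibration error (ECE) is one common choice among several.

    # Generic illustration of isotonic calibration of classifier confidences;
    # the data and model are synthetic stand-ins, not the study's pipeline.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.isotonic import IsotonicRegression
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=2000, random_state=0)
    X_train, X_cal, y_train, y_cal = train_test_split(X, y, test_size=0.3, random_state=0)

    clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    raw_conf = clf.predict_proba(X_cal)[:, 1]            # uncalibrated P(y = 1)

    # Monotone map from raw confidence to observed frequency of the positive class.
    # (For an unbiased estimate, ECE should be evaluated on data not used to fit
    # the calibrator; a single split is used here only to keep the sketch short.)
    calibrator = IsotonicRegression(out_of_bounds="clip").fit(raw_conf, y_cal)
    calibrated_conf = calibrator.predict(raw_conf)

    def ece(conf, labels, n_bins=10):
        # Expected calibration error over equal-width confidence bins.
        bin_idx = np.minimum((conf * n_bins).astype(int), n_bins - 1)
        total = 0.0
        for b in range(n_bins):
            mask = bin_idx == b
            if mask.any():
                total += mask.mean() * abs(conf[mask].mean() - labels[mask].mean())
        return total

    print("ECE before:", round(ece(raw_conf, y_cal), 3),
          "after:", round(ece(calibrated_conf, y_cal), 3))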

6.
J Comput Neurosci ; 41(2): 143-55, 2016 10.
Article in English | MEDLINE | ID: mdl-27272510

ABSTRACT

One of the most common examples of audiovisual speech integration is the McGurk effect. For example, an auditory syllable /ba/ recorded over incongruent lip movements that produce "ga" typically causes listeners to hear "da". This report hypothesizes reasons why certain clinical populations and listeners who are hard of hearing might be more susceptible to visual influence. Conversely, we also examine why other listeners appear less susceptible to the McGurk effect (i.e., they report hearing just the auditory stimulus without being influenced by the visual signal). These hypotheses are accompanied by a mechanistic account of integration phenomena, including visual inhibition of auditory information or a slower rate of accumulation of inputs. First, simulations of a linear dynamic parallel interactive model were instantiated using inhibition and facilitation to examine potential mechanisms underlying integration. In a second set of simulations, we systematically manipulated the inhibition parameter values to model data obtained from listeners with autism spectrum disorder. In summary, we argue that cross-modal inhibition parameter values explain individual variability in McGurk perceptibility. Nonetheless, different mechanisms should continue to be explored in an effort to better understand current data patterns in the audiovisual integration literature.
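The abstract does not give the model equations, so the sketch below is only a toy illustration of the kind of mechanism described: two linearly coupled accumulators in which a single cross-modal coupling parameter acts as inhibition (negative) or facilitation (positive) and thereby changes how quickly the auditory channel reaches threshold. All names and parameter values are invented and are not the authors' model.

    # Toy two-channel linear accumulator with cross-modal coupling; purely
    # illustrative, not the published linear dynamic parallel interactive model.
    import numpy as np

    def auditory_finishing_time(drive_a=1.0, drive_v=1.0, leak=0.5, coupling=0.0,
                                threshold=1.0, dt=0.001, t_max=5.0):
        # coupling < 0 models visual inhibition of the auditory channel,
        # coupling > 0 models cross-modal facilitation.
        x_a = x_v = 0.0
        for step in range(int(t_max / dt)):
            dx_a = (drive_a - leak * x_a + coupling * x_v) * dt
            dx_v = (drive_v - leak * x_v + coupling * x_a) * dt
            x_a, x_v = x_a + dx_a, x_v + dx_v
            if x_a >= threshold:
                return (step + 1) * dt
        return np.inf  # threshold never reached within t_max

    # Stronger cross-modal inhibition slows (or prevents) the auditory channel.
    for c in (0.3, 0.0, -0.2, -0.4):
        print(f"coupling={c:+.1f}  auditory finishing time={auditory_finishing_time(coupling=c):.3f} s")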


Subject(s)
Autism Spectrum Disorder , Models, Neurological , Speech Perception , Acoustic Stimulation , Humans , Visual Perception
7.
Atten Percept Psychophys ; 78(6): 1712-27, 2016 08.
Article in English | MEDLINE | ID: mdl-27188651

ABSTRACT

This paper proposes a novel approach to assess audiovisual integration for both congruent and incongruent speech stimuli using reaction times (RT). The experiments are based on the McGurk effect, in which a listener is presented with incongruent audiovisual speech signals. A typical example involves the auditory consonant /b/ combined with a visually articulated /g/, often yielding a perception of /d/. We quantify the amount of integration relative to the predictions of a parallel independent model as a function of attention and congruency between auditory and visual signals. We assessed RT distributions for congruent and incongruent auditory and visual signals in a within-subjects signal detection paradigm under conditions of divided versus focused attention. Results showed that listeners often received only minimal benefit from congruent auditory-visual stimuli, even when such information could have improved performance. Incongruent stimuli adversely affected performance in divided and focused attention conditions. Our findings support a parallel model of auditory-visual integration with interactions between auditory and visual channels.
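For background, the parallel independent (race) benchmark referred to above has a standard closed form in this literature; the exact statistic the authors computed may differ, but for first-terminating ("OR") processing of independent channels the predicted audiovisual RT distribution is, in LaTeX notation:

    F_{AV}^{\mathrm{race}}(t) \;=\; F_{A}(t) + F_{V}(t) - F_{A}(t)\,F_{V}(t),
    \qquad F_{X}(t) = P(\mathrm{RT}_{X} \le t),

and observed audiovisual performance exceeding this prediction is commonly taken as evidence of integration beyond statistical facilitation.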


Subject(s)
Attention , Reaction Time , Speech Perception/physiology , Visual Perception/physiology , Acoustic Stimulation/methods , Female , Humans , Male , Photic Stimulation/methods , Young Adult
8.
Am J Psychol ; 129(1): 11-21, 2016.
Article in English | MEDLINE | ID: mdl-27029103

ABSTRACT

What factors contribute to redundant-target processing speed besides statistical facilitation? One possibility is that multiple percepts may drive these effects. Another, though not mutually exclusive, hypothesis is that cross-channel cueing from one modality to another may influence response times. We implemented an auditory-visual detection task using the sound-induced flash illusion to examine whether one or both of these possibilities contributes to changes in processing speed; we did so by examining the data of individual participants. Our results indicated shorter response times in several participants when multiple flashes were perceived in the standard sound-induced flash illusion, thereby replicating previous work in the literature. Additionally, we found evidence for faster responses in several participants when carrying out the same analysis on trials in which 1 beep was presented with 2 real flashes. Overall, our analysis indicates that some observers benefit from cross-modal facilitation, whereas others may benefit from a combination of cross-modal facilitation and increased perceptual judgments.


Subject(s)
Auditory Perception/physiology , Psychomotor Performance/physiology , Visual Perception/physiology , Adult , Cues , Female , Humans , Male , Young Adult
9.
Int J Audiol ; 55(4): 206-14, 2016.
Article in English | MEDLINE | ID: mdl-26853446

ABSTRACT

OBJECTIVE: The ability to use visual speech cues and integrate them with auditory information is important, especially in noisy environments and for hearing-impaired (HI) listeners. Providing data on measures of integration skills that encompass both accuracy and processing speed will benefit researchers and clinicians. DESIGN: The study consisted of two experiments: first, accuracy scores were obtained using City University of New York (CUNY) sentences; second, capacity measures that assessed reaction-time distributions were obtained from a monosyllabic word recognition task. STUDY SAMPLE: We report data on two measures of integration obtained from a sample of 86 young and middle-aged adult listeners. RESULTS: Capacity showed a positive correlation with accuracy measures of audiovisual benefit obtained from sentence recognition. More importantly, factor analysis indicated that a single-factor model captured audiovisual speech integration better than models containing more factors. Capacity exhibited strong loadings on the factor, while the accuracy-based measures from sentence recognition exhibited weaker loadings. CONCLUSIONS: Results suggest that a listener's integration skills may be assessed optimally using a measure that incorporates both processing speed and accuracy.


Subject(s)
Cues , Recognition, Psychology , Speech Perception , Visual Perception , Acoustic Stimulation , Adolescent , Adult , Audiometry, Pure-Tone , Audiometry, Speech , Auditory Threshold , Factor Analysis, Statistical , Female , Humans , Male , Middle Aged , Pattern Recognition, Physiological , Reaction Time , Young Adult
11.
Brain Topogr ; 28(3): 479-93, 2015 May.
Article in English | MEDLINE | ID: mdl-24276220

ABSTRACT

The ability to effectively combine sensory inputs across modalities is vital for acquiring a unified percept of events. For example, watching a hammer hit a nail while simultaneously identifying the sound as originating from the event requires the ability to identify spatio-temporal congruencies and statistical regularities. In this study, we applied a reaction-time and hazard-function measure known as capacity (e.g., Townsend and Ashby, Cognitive Theory, pp. 200-239, 1978) to quantify the extent to which observers learn paired associations between simple auditory and visual patterns in a model-theoretic manner. As expected, results showed that learning was associated with an increase in accuracy and, more significantly, an increase in capacity. The aim of this study was to associate capacity measures of multisensory learning with neural measures, namely mean global field power (GFP). We observed a co-variation between an increase in capacity and a decrease in GFP amplitude as learning occurred. This suggests that capacity constitutes a reliable behavioral index of efficient energy expenditure in the neural domain.
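Mean global field power has a standard definition (the spatial standard deviation of the scalp potential across electrodes at each time point, averaged over the epoch), which the short sketch below computes on a synthetic array; it is not the authors' EEG pipeline, and the array shape is an assumption.

    # Mean global field power (GFP) from a channels-by-time EEG array.
    # Synthetic data; illustrates the standard GFP definition only.
    import numpy as np

    rng = np.random.default_rng(0)
    eeg = rng.normal(size=(64, 500))        # 64 channels x 500 time samples (assumed shape)

    def global_field_power(data):
        # GFP(t) = standard deviation across channels at each time point.
        return data.std(axis=0)

    gfp = global_field_power(eeg)
    print("mean GFP over the epoch:", gfp.mean())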


Subject(s)
Association Learning/physiology , Auditory Perception/physiology , Brain/physiology , Visual Perception/physiology , Acoustic Stimulation/methods , Electroencephalography , Humans , Photic Stimulation/methods , Young Adult
12.
Front Psychol ; 5: 678, 2014.
Article in English | MEDLINE | ID: mdl-25071649

ABSTRACT

Research in audiovisual speech perception has demonstrated that sensory factors such as auditory and visual acuity are associated with a listener's ability to extract and combine auditory and visual speech cues. This case study report examined audiovisual integration using a newly developed measure of capacity in a sample of hearing-impaired listeners. Capacity assessments are unique because they examine the contribution of reaction time (RT) as well as accuracy to determine the extent to which a listener efficiently combines auditory and visual speech cues relative to independent race model predictions. Multisensory speech integration ability was examined in two experiments: an open-set sentence recognition study and a closed-set speeded-word recognition study that measured capacity. Most germane to our approach, capacity revealed speed-accuracy tradeoffs that may be predicted by audiometric configuration. Results revealed that some listeners benefit from increased accuracy, but fail to benefit in terms of speed, on audiovisual relative to unisensory trials. Conversely, other listeners may not benefit in the accuracy domain but instead show an audiovisual processing-time benefit.

13.
Int J Audiol ; 53(10): 710-8, 2014 Oct.
Article in English | MEDLINE | ID: mdl-24806080

ABSTRACT

OBJECTIVE: While most normal-hearing listeners rely on the auditory modality to obtain speech information, research has demonstrated the importance of non-auditory modalities for language recognition during face-to-face communication. Efficient utilization of the visual modality becomes increasingly important in difficult listening conditions, especially for older and hearing-impaired listeners with sensory or cognitive decline. First, this report will quantify audiovisual integration skills using a recently developed capacity measure that incorporates speed and accuracy. Second, to investigate sensory factors contributing to integration ability, high- and low-frequency hearing thresholds will be correlated with capacity, as well as with gain measures from sentence recognition. DESIGN: Integration scores were obtained from a within-subjects design using an open-set sentence speech recognition experiment and a closed-set speeded-word classification experiment designed to examine integration (i.e., capacity). STUDY SAMPLE: A sample of 44 adult listeners without a self-reported history of hearing loss was recruited. RESULTS: Results demonstrated a significant relationship between measures of audiovisual integration and hearing thresholds. CONCLUSIONS: Our data indicated that a listener's ability to integrate auditory and visual speech information in the domains of speed and accuracy is associated with auditory sensory capabilities and possibly other sensory and cognitive factors.


Subject(s)
Audiometry, Speech , Speech Perception , Visual Perception , Adolescent , Adult , Audiometry, Pure-Tone , Female , Humans , Male , Middle Aged , Young Adult
16.
Brain Topogr ; 27(6): 707-30, 2014 Nov.
Article in English | MEDLINE | ID: mdl-24722880

ABSTRACT

We process information from the world through multiple senses, and the brain must decide what information belongs together and what information should be segregated. One challenge in studying such multisensory integration is how to quantify the multisensory interactions, a challenge that is amplified by the host of methods now used to measure neural, behavioral, and perceptual responses. Many of the measures that have been developed to quantify multisensory integration (and which have been derived from single-unit analyses) have been applied to these different measures without much consideration for the nature of the process being studied. Here, we provide a review focused on the means by which experimenters quantify multisensory processes and integration across a range of commonly used experimental methodologies. We emphasize the most commonly employed measures, including single- and multiunit responses, local field potentials, functional magnetic resonance imaging, and electroencephalography, along with behavioral measures of detection, accuracy, and response times. In each section, we discuss the different metrics commonly used to quantify multisensory interactions, including the rationale for their use, their advantages, and the drawbacks and caveats associated with them. Also discussed are possible alternatives to the most commonly used metrics.


Subject(s)
Brain Mapping/methods , Brain/physiology , Neurons/physiology , Perception/physiology , Animals , Data Interpretation, Statistical , Electroencephalography/methods , Humans , Magnetic Resonance Imaging/methods
17.
Behav Res Methods ; 46(2): 406-15, 2014 Jun.
Article in English | MEDLINE | ID: mdl-23943582

ABSTRACT

We propose a measure of audiovisual speech integration that takes into account both accuracy and response times. This measure should prove beneficial for researchers investigating multisensory speech recognition, since it is applicable to both normal-hearing and aging populations. For example, age-related sensory decline influences both the rate at which one processes information and the ability to utilize cues from different sensory modalities. Our function assesses integration when both auditory and visual information are available by comparing performance on these audiovisual trials with theoretical predictions for performance under the assumptions of parallel, independent, self-terminating processing of single-modality inputs. We provide example data from an audiovisual identification experiment and discuss applications for measuring audiovisual integration skills across the life span.


Subject(s)
Aging/physiology , Models, Psychological , Reaction Time/physiology , Speech Perception/physiology , Visual Perception/physiology , Adult , Aged , Audiovisual Aids , Cues , Female , Humans , Middle Aged , Statistics, Nonparametric , Young Adult
20.
Front Psychol ; 4: 615, 2013.
Article in English | MEDLINE | ID: mdl-24058358

ABSTRACT

Speech perception engages both auditory and visual modalities. Limitations of traditional accuracy-only approaches to the investigation of audiovisual speech perception have motivated the use of new methodologies. In an audiovisual speech identification task, we utilized capacity (Townsend and Nozawa, 1995), a dynamic measure of efficiency, to quantify audiovisual integration. Capacity was used to compare RT distributions from audiovisual trials to RT distributions from auditory-only and visual-only trials across three listening conditions: clear auditory signal, S/N ratio of -12 dB, and S/N ratio of -18 dB. The purpose was to obtain EEG recordings in conjunction with capacity to investigate how a late ERP co-varies with integration efficiency. Results showed efficient audiovisual integration for low auditory S/N ratios, but inefficient audiovisual integration when the auditory signal was clear. The ERP analyses showed evidence for greater audiovisual amplitude relative to the unisensory signals at the lower auditory S/N ratios (higher capacity/efficiency) than at the high S/N ratio (low capacity/inefficient integration). The data are consistent with an interactive framework of integration, in which auditory recognition is influenced by speech-reading as a function of signal clarity.
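The capacity coefficient of Townsend and Nozawa (1995) cited above has a standard form for first-terminating ("OR") designs: C(t) = H_AV(t) / [H_A(t) + H_V(t)], where H(t) = -log S(t) is the integrated hazard and S(t) the survivor function of the RTs; C(t) > 1 indicates efficient (super-capacity) integration and C(t) < 1 inefficient integration. The sketch below estimates it from simulated RTs; the estimator and the synthetic data are illustrative and not necessarily the exact variant used in the study.

    # OR capacity coefficient C(t) = H_AV(t) / (H_A(t) + H_V(t)), estimated from
    # empirical survivor functions; simulated RTs, not the study's data.
    import numpy as np

    def integrated_hazard(rts, t):
        # H(t) = -log S(t), with the empirical survivor function S(t) = P(RT > t).
        survivor = np.mean(rts[:, None] > t[None, :], axis=0)
        return -np.log(np.clip(survivor, 1e-12, 1.0))

    def capacity_or(rt_av, rt_a, rt_v, t):
        return integrated_hazard(rt_av, t) / (
            integrated_hazard(rt_a, t) + integrated_hazard(rt_v, t))

    rng = np.random.default_rng(1)
    rt_a = rng.exponential(0.4, 500) + 0.2      # synthetic auditory-only RTs (s)
    rt_v = rng.exponential(0.5, 500) + 0.2      # synthetic visual-only RTs (s)
    rt_av = rng.exponential(0.2, 500) + 0.2     # faster audiovisual RTs (s)
    t = np.linspace(0.25, 1.0, 50)
    print(np.round(capacity_or(rt_av, rt_a, rt_v, t), 2))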
