Results 1 - 20 of 33
1.
Int J Comput Vis ; 132(3): 854-871, 2024.
Article in English | MEDLINE | ID: mdl-38371492

ABSTRACT

Predicting human gaze from egocentric videos plays a critical role in understanding human intention in daily activities. In this paper, we present the first transformer-based model to address the challenging problem of egocentric gaze estimation. We observe that the connection between the global scene context and local visual information is vital for localizing the gaze fixation from egocentric video frames. To this end, we design the transformer encoder to embed the global context as one additional visual token and further propose a novel global-local correlation module to explicitly model the correlation of the global token and each local token. We validate our model on two egocentric video datasets, EGTEA Gaze+ and Ego4D. Our detailed ablation studies demonstrate the benefits of our method. In addition, our approach exceeds the previous state-of-the-art model by a large margin. We also apply our model to a novel gaze saccade/fixation prediction task and the traditional action recognition problem. The consistent gains suggest the strong generalization capability of our model. We also provide additional visualizations to support our claim that global-local correlation serves as a key representation for predicting gaze fixation from egocentric videos. More details can be found on our website (https://bolinlai.github.io/GLC-EgoGazeEst).
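The idea of correlating one global token with each local token can be illustrated with a minimal sketch. This is not the authors' module: the abstract does not say how the global token is built or how the correlation is computed, so the sketch assumes the global token is the mean of the local tokens and scores each local token by a scaled dot-product followed by a softmax. The function names and the toy token values are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def global_token(local_tokens):
    """Assumption: summarize the scene by averaging the local tokens."""
    n = len(local_tokens)
    dim = len(local_tokens[0])
    return [sum(tok[d] for tok in local_tokens) / n for d in range(dim)]

def global_local_correlation(local_tokens):
    """Return one weight per local token: its scaled dot-product
    with the global token, normalized with a softmax."""
    g = global_token(local_tokens)
    scale = math.sqrt(len(g))
    scores = [dot(tok, g) / scale for tok in local_tokens]
    return softmax(scores)

# Toy example: three 2-dimensional local tokens (hypothetical values).
tokens = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
weights = global_local_correlation(tokens)
```

In this toy input the third token is most aligned with the averaged global token, so it receives the largest weight; a real model would learn projections for the tokens rather than use raw dot-products.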

2.
IEEE Trans Pattern Anal Mach Intell ; 45(6): 6731-6747, 2023 06.
Article in English | MEDLINE | ID: mdl-33449877

ABSTRACT

We address the task of jointly determining what a person is doing and where they are looking based on the analysis of video captured by a head-worn camera. To facilitate our research, we first introduce the EGTEA Gaze+ dataset. Our dataset comes with videos, gaze tracking data, hand masks and action annotations, thereby providing the most comprehensive benchmark for First Person Vision (FPV). Moving beyond the dataset, we propose a novel deep model for joint gaze estimation and action recognition in FPV. Our method describes the participant's gaze as a probabilistic variable and models its distribution using stochastic units in a deep network. We further sample from these stochastic units, generating an attention map to guide the aggregation of visual features for action recognition. Our method is evaluated on our EGTEA Gaze+ dataset and achieves a performance level that exceeds the state-of-the-art by a significant margin. More importantly, we demonstrate that our model can be applied to a larger-scale FPV dataset, EPIC-Kitchens, even without using gaze, offering new state-of-the-art results on FPV action recognition.


Subject(s)
Algorithms , Hand , Humans , Attention , Video Recording , Fixation, Ocular
3.
Article in English | MEDLINE | ID: mdl-36873428

ABSTRACT

Passive detection of risk factors (that may influence unhealthy or adverse behaviors) via wearable and mobile sensors has created new opportunities to improve the effectiveness of behavioral interventions. A key goal is to find opportune moments for intervention by passively detecting rising risk of an imminent adverse behavior. However, this has been difficult due to substantial noise in the data collected by sensors in the natural environment and a lack of reliable label assignment of low- and high-risk states to the continuous stream of sensor data. In this paper, we propose an event-based encoding of sensor data to reduce the effect of noise and then present an approach to efficiently model the historical influence of recent and past sensor-derived contexts on the likelihood of an adverse behavior. Next, to circumvent the lack of any confirmed negative labels (i.e., time periods with no high-risk moment), and only a few positive labels (i.e., detected adverse behavior), we propose a new loss function. We use 1,012 days of sensor and self-report data collected from 92 participants in a smoking cessation field study to train deep learning models to produce a continuous risk estimate for the likelihood of an impending smoking lapse. The risk dynamics produced by the model show that risk peaks an average of 44 minutes before a lapse. Simulations on field study data show that using our model can create intervention opportunities for 85% of lapses with 5.5 interventions per day.

4.
Contemp Clin Trials ; 110: 106513, 2021 11.
Article in English | MEDLINE | ID: mdl-34314855

ABSTRACT

Smoking is the leading preventable cause of death and disability in the U.S. Empirical evidence suggests that engaging in evidence-based self-regulatory strategies (e.g., behavioral substitution, mindful attention) can improve smokers' ability to resist craving and build self-regulatory skills. However, poor engagement represents a major barrier to maximizing the impact of self-regulatory strategies. This paper describes the protocol for Mobile Assistance for Regulating Smoking (MARS) - a research study designed to inform the development of a mobile health (mHealth) intervention for promoting real-time, real-world engagement in evidence-based self-regulatory strategies. The study will employ a 10-day Micro-Randomized Trial (MRT) enrolling 112 smokers attempting to quit. Utilizing a mobile smoking cessation app, the MRT will randomize each individual multiple times per day to either: (a) no intervention prompt; (b) a prompt recommending brief (low effort) cognitive and/or behavioral self-regulatory strategies; or (c) a prompt recommending more effortful cognitive or mindfulness-based strategies. Prompts will be delivered via push notifications from the MARS mobile app. The goal is to investigate whether, what type of, and under what conditions prompting the individual to engage in self-regulatory strategies increases engagement. The results will build the empirical foundation necessary to develop an mHealth intervention that effectively utilizes intensive longitudinal self-report and sensor-based assessments of emotions, context and other factors to engage an individual in the type of self-regulatory activity that would be most beneficial given their real-time, real-world circumstances. This type of mHealth intervention holds enormous potential to expand the reach and impact of smoking cessation treatments.


Subject(s)
Mobile Applications , Smoking Cessation , Humans , Motivation , Randomized Controlled Trials as Topic , Smokers , Smoking
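The micro-randomization described in the MARS protocol abstract (each individual randomized several times per day to one of three prompt conditions over 10 days) can be sketched as follows. This is an illustration only: the abstract does not give the number of decision points per day or the randomization probabilities, so the six decision points, the equal probabilities, and the arm labels below are all assumptions.

```python
import random

# Hypothetical labels for the three conditions (a), (b), (c) in the abstract.
ARMS = ("no_prompt", "low_effort_prompt", "high_effort_prompt")

def randomize_participant(days, decision_points_per_day, seed):
    """Return a list of (day, decision_point, arm) tuples for one participant.

    Assumption: each decision point is randomized independently with
    equal probability across the three arms.
    """
    rng = random.Random(seed)  # seeded so the schedule is reproducible
    schedule = []
    for day in range(1, days + 1):
        for point in range(1, decision_points_per_day + 1):
            schedule.append((day, point, rng.choice(ARMS)))
    return schedule

# 10 days comes from the abstract; 6 decision points per day is assumed.
schedule = randomize_participant(days=10, decision_points_per_day=6, seed=0)
```

Because the same person is randomized repeatedly, every participant contributes observations under all three conditions, which is what lets an MRT estimate the effect of a prompt at a given moment.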
5.
Nat Commun ; 11(1): 6386, 2020 12 14.
Article in English | MEDLINE | ID: mdl-33318484

ABSTRACT

Eye contact is among the most fundamental means of social communication used by humans. Quantification of eye contact is valuable as a part of the analysis of social roles and communication skills, and for clinical screening. Estimating a subject's looking direction is a challenging task, but eye contact can be effectively captured by a wearable point-of-view camera which provides a unique viewpoint. While moments of eye contact from this viewpoint can be hand-coded, such a process tends to be laborious and subjective. In this work, we develop a deep neural network model to automatically detect eye contact in egocentric video. It is the first to achieve accuracy equivalent to that of human experts. We train a deep convolutional network using a dataset of 4,339,879 annotated images, consisting of 103 subjects with diverse demographic backgrounds. Of these, 57 subjects have a diagnosis of Autism Spectrum Disorder. The network achieves overall precision of 0.936 and recall of 0.943 on 18 validation subjects, and its performance is on par with 10 trained human coders with a mean precision of 0.918 and recall of 0.946. Our method will be instrumental in gaze behavior analysis by serving as a scalable, objective, and accessible tool for clinicians and researchers.


Subject(s)
Communication , Deep Learning , Eye , Neural Networks, Computer , Autism Spectrum Disorder , Child, Preschool , Female , Hand , Humans , Infant , Machine Learning , Male , Models, Theoretical
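To compare the network and the human coders on a single number, the precision and recall values quoted in the eye-contact abstract can be combined into an F1 score (their harmonic mean). The abstract does not report F1, so the values below are derived here, not taken from the paper; note also that the F1 of the coders' mean precision and mean recall is not the same as the mean of per-coder F1 scores.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

model_f1 = f1_score(0.936, 0.943)  # network on 18 validation subjects
human_f1 = f1_score(0.918, 0.946)  # from mean precision/recall of 10 coders

print(f"model F1 = {model_f1:.3f}, human F1 = {human_f1:.3f}")
# prints: model F1 = 0.939, human F1 = 0.932
```

The two scores differ by less than 0.01, which is consistent with the abstract's statement that the model performs on par with trained human coders.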
6.
IEEE J Biomed Health Inform ; 24(7): 1899-1906, 2020 07.
Article in English | MEDLINE | ID: mdl-31940570

ABSTRACT

OBJECTIVE: Left ventricular assist devices (LVADs) fail in up to 10% of patients due to the development of pump thrombosis. Remote monitoring of patients with LVADs can enable early detection and, subsequently, treatment and prevention of pump thrombosis. We assessed whether acoustical signals measured on the chest of patients with LVADs, combined with machine learning algorithms, can be used for detecting pump thrombosis. METHODS: 13 centrifugal pump (HVAD) recipients were enrolled in the study. When hospitalized for suspected pump thrombosis, clinical data and acoustical recordings were obtained at admission, prior to and after administration of thrombolytic therapy, and every 24 hours until laboratory and pump parameters normalized. First, we selected the most important features among our feature set using LDH-based correlation analysis. Then using these features, we trained a logistic regression model and determined our decision threshold to differentiate between thrombosis and non-thrombosis episodes. RESULTS: Accuracy, sensitivity and precision were calculated to be 88.9%, 90.9% and 83.3%, respectively. When tested on the post-thrombolysis data, our algorithm suggested possible pump abnormalities that were not identified by the reference pump power or biomarker abnormalities. SIGNIFICANCE: We showed that the acoustical signatures of LVADs can be an index of mechanical deterioration and, when combined with machine learning algorithms, provide clinical decision support regarding the presence of pump thrombosis.


Subject(s)
Heart Sounds/physiology , Heart-Assist Devices/adverse effects , Signal Processing, Computer-Assisted , Thrombosis/diagnosis , Acoustics , Aged , Algorithms , Female , Humans , Male , Middle Aged , Sound Spectrography , Stethoscopes
7.
Adv Neural Inf Process Syst ; 33: 19828-19838, 2020 Dec.
Article in English | MEDLINE | ID: mdl-34103881

ABSTRACT

Panel count data describes aggregated counts of recurrent events observed at discrete time points. To understand dynamics of health behaviors and predict future negative events, the field of quantitative behavioral research has evolved to increasingly rely upon panel count data collected via multiple self reports, for example, about frequencies of smoking using in-the-moment surveys on mobile devices. However, missing reports are common and present a major barrier to downstream statistical learning. As a first step, under a missing completely at random assumption (MCAR), we propose a simple yet widely applicable functional EM algorithm to estimate the counting process mean function, which is of central interest to behavioral scientists. The proposed approach wraps several popular panel count inference methods, seamlessly deals with incomplete counts and is robust to misspecification of the Poisson process assumption. Theoretical analysis of the proposed algorithm provides finite-sample guarantees by expanding parametric EM theory [3, 34] to the general non-parametric setting. We illustrate the utility of the proposed algorithm through numerical experiments and an analysis of smoking cessation data. We also discuss useful extensions to address deviations from the MCAR assumption and covariate effects.

8.
Article in English | MEDLINE | ID: mdl-34651096

ABSTRACT

Context plays a key role in impulsive adverse behaviors such as fights, suicide attempts, binge-drinking, and smoking lapse. Several contexts dissuade such behaviors, but some may trigger adverse impulsive behaviors. We define these latter contexts as 'opportunity' contexts, as their passive detection from sensors can be used to deliver context-sensitive interventions. In this paper, we define the general concept of 'opportunity' contexts and apply it to the case of smoking cessation. We operationalize the smoking 'opportunity' context, using self-reported smoking allowance and cigarette availability. We show its clinical utility by establishing its association with smoking occurrences using Granger causality. Next, we mine several informative features from GPS traces, including the novel location context of smoking spots, to develop the SmokingOpp model for automatically detecting the smoking 'opportunity' context. Finally, we train and evaluate the SmokingOpp model using 15 million GPS points and 3,432 self-reports from 90 newly abstinent smokers in a smoking cessation study.

9.
IEEE Trans Biomed Eng ; 67(5): 1303-1313, 2020 05.
Article in English | MEDLINE | ID: mdl-31425011

ABSTRACT

OBJECTIVE: To improve home monitoring of heart failure patients so as to reduce emergency room visits and hospital readmissions. We aim to do this by analyzing the ballistocardiogram (BCG) to evaluate the clinical state of the patient. METHODS: 1) High quality BCG signals were collected at home from HF patients after discharge. 2) The BCG recordings were preprocessed to exclude outliers and artifacts. 3) Parameters of the BCG that contain information about the cardiovascular system were extracted. These features were used for the task of classification of the BCG recording based on the status of HF. RESULTS: The best AUC score obtained for the classification task was 0.78, using a slight variant of the leave-one-subject-out validation method. CONCLUSION: This work demonstrates that high quality BCG signals can be collected in a home environment and used to detect the clinical state of HF patients. SIGNIFICANCE: In future work, a clinician/caregiver can be introduced into the system so that appropriate interventions can be performed based on the clinical state monitored at home.


Subject(s)
Ballistocardiography , Heart Failure , Artifacts , Heart Failure/diagnosis , Humans , Monitoring, Physiologic
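The leave-one-subject-out validation mentioned in the BCG abstract keeps all recordings from one patient out of training in each fold, so the classifier is never tested on a patient it has already seen. The sketch below shows the plain scheme; the abstract says a "slight variant" was used but does not describe it, and the subject labels here are hypothetical.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) per subject.

    subject_ids[i] is the subject that recording i belongs to.
    """
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test

# Hypothetical assignment: six recordings from three patients.
subjects = ["p1", "p1", "p2", "p3", "p3", "p3"]
folds = list(leave_one_subject_out(subjects))
# folds[0] == ("p1", [2, 3, 4, 5], [0, 1])
```

Splitting by subject rather than by recording matters because recordings from the same patient are correlated, and mixing them across train and test sets would tend to inflate the reported AUC.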
10.
Psychometrika ; 83(2): 476-510, 2018 06.
Article in English | MEDLINE | ID: mdl-29557080

ABSTRACT

A growing number of social scientists have turned to differential equations as a tool for capturing the dynamic interdependence among a system of variables. Current tools for fitting differential equation models do not provide a straightforward mechanism for diagnosing evidence for qualitative shifts in dynamics, nor do they provide ways of identifying the timing and possible determinants of such shifts. In this paper, we discuss regime-switching differential equation models, a novel modeling framework for representing abrupt changes in a system of differential equation models. Estimation was performed by combining the Kim filter (Kim and Nelson, State-space models with regime switching: classical and Gibbs-sampling approaches with applications, MIT Press, Cambridge, 1999) and a numerical differential equation solver that can handle both ordinary and stochastic differential equations. The proposed approach was motivated by the need to represent discrete shifts in the movement dynamics of [Formula: see text] mother-infant dyads during the Strange Situation Procedure (SSP), a behavioral assessment where the infant is separated from and reunited with the mother twice. We illustrate the utility of a novel regime-switching differential equation model in representing children's tendency to exhibit shifts between the goal of staying close to their mothers and intermittent interest in moving away from their mothers to explore the room during the SSP. Results from empirical model fitting were supplemented with a Monte Carlo simulation study to evaluate the use of information criterion measures to diagnose sudden shifts in dynamics.


Subject(s)
Psychometrics/methods , Computer Simulation , Female , Head Movements , Humans , Infant , Infant Behavior , Monte Carlo Method , Mother-Child Relations/psychology , Social Sciences/methods , Software , Stochastic Processes
11.
Child Dev ; 89(2): e60-e73, 2018 03.
Article in English | MEDLINE | ID: mdl-28295208

ABSTRACT

Children's early language environments are related to later development. Little is known about this association in siblings of children with autism spectrum disorder (ASD), who often experience language delays or have ASD. Fifty-nine 9-month-old infants at high or low familial risk for ASD contributed full-day in-home language recordings. High-risk infants produced more vocalizations than low-risk peers; conversational turns and adult words did not differ by group. Vocalization differences were driven by a subgroup of "hypervocal" infants. Despite more vocalizations overall, these infants engaged in less social babbling during a standardized clinic assessment, and they experienced fewer conversational turns relative to their rate of vocalizations. Two ways in which these individual and environmental differences may relate to subsequent development are discussed.


Subject(s)
Autism Spectrum Disorder/physiopathology , Child Development/physiology , Infant Behavior/physiology , Siblings , Social Behavior , Verbal Behavior/physiology , Female , Humans , Infant , Male , Risk , Signal Processing, Computer-Assisted
13.
J Autism Dev Disord ; 47(3): 898-904, 2017 Mar.
Article in English | MEDLINE | ID: mdl-28070783

ABSTRACT

Children with autism spectrum disorder (ASD) show reduced gaze to social partners. Eye contact during live interactions is often measured using stationary cameras that capture various views of the child, but determining a child's precise gaze target within another's face is nearly impossible. This study compared eye gaze coding derived from stationary cameras to coding derived from a "point-of-view" (PoV) camera on the social partner. Interobserver agreement for gaze targets was higher using PoV cameras relative to stationary cameras. PoV camera codes, but not stationary camera codes, revealed a difference between gaze targets of children with ASD and typically developing children. PoV cameras may provide a more sensitive method for measuring eye contact in children with ASD during live interactions.


Subject(s)
Autism Spectrum Disorder/physiopathology , Eye Movement Measurements/instrumentation , Fixation, Ocular , Interpersonal Relations , Photography/instrumentation , Autism Spectrum Disorder/psychology , Child , Female , Humans , Male , Photography/methods , Pilot Projects
14.
Proc Mach Learn Res ; 70: 970-979, 2017.
Article in English | MEDLINE | ID: mdl-30906932

ABSTRACT

An important mobile health (mHealth) task is the use of multimodal data, such as sensor streams and self-report, to construct interpretable time-to-event predictions of, for example, lapse to alcohol or illicit drug use. Interpretability of the prediction model is important for acceptance and adoption by domain scientists, enabling model outputs and parameters to inform theory and guide intervention design. Temporal latent state models are therefore attractive, and so we adopt the continuous time hidden Markov model (CT-HMM) due to its ability to describe irregular arrival times of event data. Standard CT-HMMs, however, are not specialized for predicting the time to a future event, the key variable for mHealth interventions. Also, standard emission models lack a sufficiently rich structure to describe multimodal data and incorporate domain knowledge. We present iSurvive, an extension of classical survival analysis to a CT-HMM. We present a parameter learning method for GLM emissions and survival model fitting, and present promising results on both synthetic data and an mHealth drug use dataset.

15.
J Autism Dev Disord ; 47(3): 607-614, 2017 Mar.
Article in English | MEDLINE | ID: mdl-27987063

ABSTRACT

Children with autism have atypical gaze behavior, but it is unknown whether gaze differs during distinct types of reciprocal interactions. Typically developing children (N = 20) and children with autism (N = 20) (4-13 years) made similar amounts of eye contact with an examiner during a conversation. Surprisingly, there was minimal eye contact during interactive play in both groups. Gaze behavior was stable across 8 weeks in children with autism (N = 15). Lastly, gaze behavior during conversation but not play was associated with autism social affect severity scores (ADOS CSS SA) and the Social Responsiveness Scale (SRS-2). Together, these findings suggest that eye contact in typical and atypical development is influenced by subtle changes in context, which has implications for optimizing assessments of social communication skills.


Subject(s)
Autistic Disorder/physiopathology , Communication , Fixation, Ocular/physiology , Play and Playthings , Social Skills , Adolescent , Case-Control Studies , Child , Child, Preschool , Female , Humans , Male
16.
Article in English | MEDLINE | ID: mdl-27872373

ABSTRACT

We offer a new solution to the unsolved problem of how infants break into word learning based on the visual statistics of everyday infant-perspective scenes. Images from head camera video captured by 8 1/2 to 10 1/2 month-old infants at 147 at-home mealtime events were analysed for the objects in view. The images were found to be highly cluttered with many different objects in view. However, the frequency distribution of object categories was extremely right skewed such that a very small set of objects was pervasively present, a fact that may substantially reduce the problem of referential ambiguity. The statistical structure of objects in these infant egocentric scenes differs markedly from that in the training sets used in computational models and in experiments on statistical word-referent learning. Therefore, the results also indicate a need to re-examine current explanations of how infants break into word learning. This article is part of the themed issue 'New frontiers for statistical learning in the cognitive sciences'.


Subject(s)
Language Development , Verbal Learning , Visual Perception , Female , Humans , Infant , Male
17.
J Am Med Inform Assoc ; 22(6): 1137-42, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26555017

ABSTRACT

Mobile sensor data-to-knowledge (MD2K) was chosen as one of 11 Big Data Centers of Excellence by the National Institutes of Health, as part of its Big Data-to-Knowledge initiative. MD2K is developing innovative tools to streamline the collection, integration, management, visualization, analysis, and interpretation of health data generated by mobile and wearable sensors. The goal of the big data solutions being developed by MD2K is to reliably quantify physical, biological, behavioral, social, and environmental factors that contribute to health and disease risk. The research conducted by MD2K is targeted at improving health through early detection of adverse health events and by facilitating prevention. MD2K will make its tools, software, and training materials widely available and will also organize workshops and seminars to encourage their use by researchers and clinicians.


Subject(s)
Biomedical Research/instrumentation , Datasets as Topic , Telemedicine/instrumentation , Telemetry , Geographic Information Systems/instrumentation , Humans , National Institutes of Health (U.S.) , United States
18.
J Autism Dev Disord ; 45(12): 3900-4, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26481385

ABSTRACT

A gap exists between the expanding space of technological innovations to aid those affected by autism spectrum disorders, and the actual impact of those technologies on daily lives. This gap can be addressed through a very practical path of commercialization. However, the path from a technological innovation to a commercially viable product is fraught with challenges. These challenges can be mitigated through small business funding agencies, which are, more and more, catalyzing the dissemination of innovation by fostering social entrepreneurship through capital support and venture philanthropy. This letter describes the differences and nature of these agencies, and their importance in facilitating the translational and real-world impact of technological and scientific discoveries.


Subject(s)
Autism Spectrum Disorder/economics , Diffusion of Innovation , Small Business , Autism Spectrum Disorder/rehabilitation , Humans
19.
Adv Neural Inf Process Syst ; 28: 3599-3607, 2015.
Article in English | MEDLINE | ID: mdl-27019571

ABSTRACT

The Continuous-Time Hidden Markov Model (CT-HMM) is an attractive approach to modeling disease progression due to its ability to describe noisy observations arriving irregularly in time. However, the lack of an efficient parameter learning algorithm for CT-HMM restricts its use to very small models or requires unrealistic constraints on the state transitions. In this paper, we present the first complete characterization of efficient EM-based learning methods for CT-HMM models. We demonstrate that the learning problem consists of two challenges: the estimation of posterior state probabilities and the computation of end-state conditioned statistics. We solve the first challenge by reformulating the estimation problem in terms of an equivalent discrete time-inhomogeneous hidden Markov model. The second challenge is addressed by adapting three approaches from the continuous time Markov chain literature to the CT-HMM domain. We demonstrate the use of CT-HMMs with more than 100 states to visualize and predict disease progression using a glaucoma dataset and an Alzheimer's disease dataset.
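The reformulation as a time-inhomogeneous discrete model can be made concrete with a small sketch. For a continuous-time Markov chain with rate matrix Q, the state transition probabilities over a gap of length dt are given by the matrix exponential exp(Q dt), so irregularly spaced visits yield a different transition matrix for each gap. The example below uses a two-state chain, where the exponential has a closed form; the rates and visit times are hypothetical, and this is not the authors' learning algorithm, only the building block it relies on.

```python
import math

def two_state_transition_matrix(a, b, dt):
    """P(dt) = exp(Q * dt) for Q = [[-a, a], [b, -b]], in closed form.

    a is the rate of leaving state 0, b the rate of leaving state 1.
    """
    total = a + b
    decay = math.exp(-total * dt)
    p01 = a / total * (1.0 - decay)  # probability of 0 -> 1 within dt
    p10 = b / total * (1.0 - decay)  # probability of 1 -> 0 within dt
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]

# Hypothetical irregular visit times (e.g., years since baseline).
visit_times = [0.0, 0.5, 3.0]
gaps = [t1 - t0 for t0, t1 in zip(visit_times, visit_times[1:])]

# One transition matrix per gap: the discrete chain is time-inhomogeneous.
matrices = [two_state_transition_matrix(a=0.2, b=0.1, dt=gap) for gap in gaps]
```

The longer gap gives a larger probability of having changed state, which is how the model accounts for observations that arrive irregularly in time; models with many states would need a numerical matrix exponential instead of this closed form.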

20.
Article in English | MEDLINE | ID: mdl-26973427

ABSTRACT

We address the challenging problem of recognizing the camera wearer's actions from videos captured by an egocentric camera. Egocentric videos encode a rich set of signals regarding the camera wearer, including head movement, hand pose and gaze information. We propose to utilize these mid-level egocentric cues for egocentric action recognition. We present a novel set of egocentric features and show how they can be combined with motion and object features. The result is a compact representation with superior performance. In addition, we provide the first systematic evaluation of motion, object and egocentric cues in egocentric action recognition. Our benchmark leads to several surprising findings. These findings uncover the best practices for egocentric actions, with a significant performance boost over all previous state-of-the-art methods on three publicly available datasets.
