Results 1 - 14 of 14
1.
Article in English | MEDLINE | ID: mdl-36044495

ABSTRACT

Current methods for segmenting eye imagery into skin, sclera, pupil, and iris cannot leverage information about eye motion, because the datasets on which models are trained are limited to temporally non-contiguous frames. We present Temporal RIT-Eyes, a Blender pipeline that draws data from real eye videos to render synthetic imagery depicting natural gaze dynamics. These sequences are accompanied by ground-truth segmentation maps that can be used to train image-segmentation networks. Temporal RIT-Eyes relies on a novel method for extracting 3D eyelid pose (the top and bottom apex of the eyelid/eyeball boundary) from raw eye images, enabling the rendering of gaze-dependent eyelid pose and blink behavior. The pipeline is parameterized to vary in appearance, eye/head/camera/illuminant geometry, and environment settings (indoor/outdoor). We present two open-source datasets of synthetic eye imagery: sGiW, a set of synthetic-image sequences whose dynamics are modeled on those of the Gaze in Wild dataset, and sOpenEDS2, a series of temporally non-contiguous eye images that approximate the OpenEDS-2019 dataset. We also demonstrate the quality of the rendered data qualitatively and show significant overlap between the latent-space representations of the source and rendered datasets.
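At its core, the rendering step amounts to driving a rigged eye model with a recorded gaze trace. The following is a minimal, purely illustrative Blender-Python sketch of that idea; the object name, the gaze values, and the angle convention are assumptions, not the actual Temporal RIT-Eyes pipeline.

```python
# Hedged sketch: keyframe an eyeball object's rotation from a recorded gaze
# trace. "Eyeball" and the (azimuth, elevation) values are hypothetical.
import math
import bpy

eye = bpy.data.objects["Eyeball"]                        # assumed object name
gaze_trace_deg = [(0.0, 0.0), (2.0, -1.0), (5.0, -3.0)]  # (azimuth, elevation)

for frame, (az, el) in enumerate(gaze_trace_deg, start=1):
    eye.rotation_euler = (math.radians(el), 0.0, math.radians(az))
    eye.keyframe_insert(data_path="rotation_euler", frame=frame)
```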

2.
IEEE Trans Vis Comput Graph ; 27(5): 2757-2767, 2021 May.
Article in English | MEDLINE | ID: mdl-33780339

ABSTRACT

Ellipse fitting, an essential component of video oculography based on pupil or iris tracking, is performed on eye regions previously segmented using various computer vision techniques. Several factors, such as occlusions due to eyelid shape, camera position, or eyelashes, frequently break ellipse-fitting algorithms that rely on well-defined pupil or iris edge segments. In this work, we propose training a convolutional neural network to directly segment entire elliptical structures. We demonstrate that this framework is robust to occlusions and offers superior pupil and iris tracking performance compared to standard eye-parts segmentation on multiple publicly available synthetic segmentation datasets (at least a 10% and 24% increase in pupil and iris center detection rate, respectively, within a two-pixel error margin).
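For context, the conventional pipeline the abstract contrasts against fits an ellipse to an already-segmented pupil (or iris) region. Below is a minimal sketch of that baseline, assuming a binary mask from some hypothetical segmentation network and using OpenCV; it is not the authors' code.

```python
# Illustrative baseline only: fit an ellipse to the largest blob in a binary
# pupil mask produced by a (hypothetical) segmentation network.
import cv2
import numpy as np

def fit_pupil_ellipse(mask: np.ndarray):
    """Return ((cx, cy), (major_axis, minor_axis), angle) or None."""
    contours, _ = cv2.findContours(mask.astype(np.uint8),
                                   cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    if not contours:
        return None
    largest = max(contours, key=cv2.contourArea)
    if len(largest) < 5:   # cv2.fitEllipse needs at least five points
        return None
    return cv2.fitEllipse(largest)
```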


Subject(s)
Augmented Reality , Eye-Tracking Technology , Image Processing, Computer-Assisted/methods , Virtual Reality , Algorithms , Female , Humans , Male , Smart Glasses , Video Recording
3.
J Vis ; 20(7): 13, 2020 Jul 1.
Article in English | MEDLINE | ID: mdl-32678878

ABSTRACT

Despite many recent advances in the field of computer vision, there remains a disconnect between how computers process images and how humans understand them. To begin to bridge this gap, we propose a framework that integrates human-elicited gaze and spoken language to label perceptually important regions in an image. Our work relies on the notion that gaze and spoken narratives can jointly model how humans inspect and analyze images. Using an unsupervised bitext alignment algorithm originally developed for machine translation, we create meaningful mappings between participants' eye movements over an image and their spoken descriptions of that image. The resulting multimodal alignments are then used to annotate image regions with linguistic labels. The accuracy of these labels exceeds that of baseline alignments obtained using purely temporal correspondence between fixations and words. We also find differences in system performance when image regions are identified with clustering methods that rely on gaze information rather than image features. The alignments produced by our framework can be used to create a database of low-level image features and high-level semantic annotations corresponding to perceptually important image regions. The framework can potentially be applied to any multimodal data stream and to any visual domain. To this end, we provide the research community with access to the computational framework.
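The temporal-correspondence baseline mentioned above can be pictured as aligning each spoken word with whichever fixation cluster overlaps it most in time. A toy sketch with invented data structures and values follows; the paper's actual alignment uses a bitext model, not this heuristic.

```python
# Hedged illustration of a purely temporal word-to-fixation alignment baseline.
def align_by_time(words, fixations):
    """words: [(word, t_start, t_end)]; fixations: [(cluster_id, t_start, t_end)]."""
    def overlap(a, b):
        return max(0.0, min(a[2], b[2]) - max(a[1], b[1]))
    return {w[0]: max(fixations, key=lambda f: overlap(w, f))[0] for w in words}

words = [("dog", 0.2, 0.6), ("ball", 0.9, 1.3)]           # times in seconds
fixes = [("regionA", 0.0, 0.7), ("regionB", 0.7, 1.5)]
print(align_by_time(words, fixes))   # {'dog': 'regionA', 'ball': 'regionB'}
```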


Subject(s)
Eye Movements/physiology , Neural Networks, Computer , Speech Perception/physiology , Adolescent , Adult , Data Curation , Databases, Factual , Female , Humans , Male , Semantics , Young Adult
4.
Sci Rep ; 10(1): 2539, 2020 Feb 13.
Article in English | MEDLINE | ID: mdl-32054884

ABSTRACT

The study of gaze behavior has primarily been constrained to controlled environments in which the head is fixed. Consequently, little effort has been invested in developing algorithms for categorizing gaze events (e.g., fixations, pursuits, saccades, gaze shifts) while the head is free and thus contributes to the velocity signals on which classification algorithms typically operate. Our approach was to collect a novel, naturalistic, multimodal dataset of eye + head movements from subjects performing everyday tasks while wearing a mobile eye tracker equipped with an inertial measurement unit and a 3D stereo camera. This Gaze-in-the-Wild dataset (GW) includes eye + head rotational velocities (deg/s), infrared eye images, and scene imagery (RGB + D). A portion was labelled by coders into gaze-motion events with a mutual agreement of 0.74 (sample-based Cohen's κ). These labelled data were used to train and evaluate two machine-learning algorithms, a Random Forest and a Recurrent Neural Network model, for gaze-event classification. Assessment involved the application of established and novel event-based performance metrics. The classifiers achieve roughly 87% of human performance in detecting fixations and saccades but fall short (50%) in detecting pursuit movements. Moreover, pursuit classification is far worse in the absence of head-movement information. A subsequent analysis of feature significance in our best-performing model revealed that classification can be done using only the magnitudes of eye and head movements, potentially removing the need for calibration between the head- and eye-tracking systems. The GW dataset, trained classifiers, and evaluation metrics will be made publicly available with the intention of facilitating growth in the emerging area of head-free gaze-event classification.
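As a rough illustration of the classification setup described above, the sketch below trains a Random Forest on per-sample eye- and head-velocity features and scores it with a sample-based Cohen's κ. The features, labels, and split are placeholders, not the GW dataset or the authors' pipeline.

```python
# Toy sketch of sample-level gaze-event classification from velocity features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(0)
X = rng.random((1000, 2)) * 300.0   # |eye vel|, |head vel| in deg/s (fake data)
y = rng.integers(0, 3, size=1000)   # 0=fixation, 1=saccade, 2=pursuit (fake labels)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X[:800], y[:800])
print("Cohen's kappa:", cohen_kappa_score(y[800:], clf.predict(X[800:])))
```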


Subject(s)
Eye Movements/physiology , Fixation, Ocular/physiology , Head Movements/physiology , Head/physiology , Activities of Daily Living , Adult , Algorithms , Female , Humans , Male , Motion , Psychomotor Performance/physiology , Pursuit, Smooth/physiology , Reflex, Vestibulo-Ocular/physiology
5.
J Eye Mov Res ; 12(6), 2019 Apr 5.
Article in English | MEDLINE | ID: mdl-33828748

ABSTRACT

The inability of current video-based eye trackers to reliably detect very small eye movements has led to confusion about the prevalence, or even the existence, of monocular microsaccades (small, rapid eye movements that occur in only one eye at a time). Because current methods rely on precisely localizing the pupil and/or corneal reflection on successive frames, microsaccade-detection algorithms often suffer from signal artifacts and a low signal-to-noise ratio. We describe a new video-based eye-tracking methodology that can reliably detect small eye movements of more than 0.2 degrees (12 arcmin) with very high confidence. Our method tracks the motion of iris features to estimate velocity rather than position, yielding a better record of microsaccades. By relying on more stable, higher-order features (such as local features of iris texture) instead of lower-order features (such as the pupil center and corneal reflection), which are sensitive to noise and drift, we provide a more robust, detailed record of miniature eye movements.
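One way to picture velocity estimation from iris features is to track iris texture points between consecutive frames and summarize their displacement. The sketch below uses Lucas-Kanade optical flow in OpenCV as a stand-in; it is an assumed illustration, not the authors' implementation, and the scaling parameters are hypothetical.

```python
# Hedged sketch: estimate eye velocity (deg/s) from iris-feature motion.
import cv2
import numpy as np

def iris_velocity(prev_gray, curr_gray, iris_mask, fps, deg_per_px):
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200, qualityLevel=0.01,
                                  minDistance=3, mask=iris_mask)
    if pts is None:
        return 0.0
    new_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, pts, None)
    good = status.ravel() == 1
    if not good.any():
        return 0.0
    disp_px = np.linalg.norm((new_pts - pts)[good].reshape(-1, 2), axis=1)
    return float(np.median(disp_px)) * deg_per_px * fps
```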

6.
J Eye Mov Res ; 12(7), 2019 Nov 25.
Article in English | MEDLINE | ID: mdl-33828764

ABSTRACT

Wearable mobile eye trackers have great potential, as they allow the measurement of eye movements during daily activities such as driving, navigating the world, and grocery shopping. Although mobile eye trackers have been around for some time, developing and operating them was generally a highly technical affair, so mobile eye-tracking research was not feasible for most labs. Nowadays, many mobile eye trackers are available from eye-tracking manufacturers (e.g. Tobii, Pupil Labs, SMI, Ergoneers), and various implementations in virtual/augmented reality have recently been released. This wide availability has caused the number of publications using a mobile eye tracker to increase quickly. Mobile eye tracking is now applied in vision science, educational science, developmental psychology, marketing research (using virtual and real supermarkets), clinical psychology, usability, architecture, medicine, and more. Yet transitioning from lab-based studies, where eye trackers are fixed to the world, to studies where eye trackers are fixed to the head presents researchers with a number of problems. These range from the conceptual frameworks used in world-fixed and head-fixed eye tracking and how they relate to each other, to the lack of data-quality comparisons and field tests of the different mobile eye trackers, to how the gaze signal can be classified or mapped to the visual stimulus. Such problems need to be addressed in order to understand how world-fixed and head-fixed eye-tracking research can be compared and to understand the full potential and limits of what mobile eye tracking can deliver. In this symposium, we bring together researchers from five institutions (Lund University, Utrecht University, Clemson University, Birkbeck University of London, and Rochester Institute of Technology) addressing problems and innovative solutions across the entire breadth of mobile eye-tracking research. Hooge, presenting the paper by Hessels et al., focuses on the definitions of fixations and saccades held by researchers in the eye-movement field and argues that they need to be clarified in order to allow comparisons between world-fixed and head-fixed eye-tracking research. Diaz et al. introduce machine-learning techniques for classifying the gaze signal in mobile eye-tracking contexts where head and body are unrestrained. Niehorster et al. compare the data quality of mobile eye trackers during natural behavior and discuss the application range of these eye trackers. Duchowski et al. introduce a method for automatically mapping gaze to faces using computer vision techniques. Pelz et al. employ state-of-the-art techniques to map fixations to objects of interest in the scene video and align grasp and eye-movement data in the same reference frame to investigate the guidance of eye movements during manual interaction. Video stream: https://vimeo.com/357473408.

7.
Forensic Sci Int ; 280: 64-80, 2017 Nov.
Article in English | MEDLINE | ID: mdl-28961443

ABSTRACT

Crime scene analysts are the core of criminal investigations; decisions made at the scene greatly affect the speed of analysis and the quality of conclusions, thereby directly impacting the successful resolution of a case. If an examiner fails to recognize the pertinence of an item on scene, the analyst's theory regarding the crime will be limited. Conversely, unselective evidence collection will most likely include irrelevant material, increasing a forensic laboratory's backlog and potentially sending the investigation in an unproductive and costly direction. It is therefore critical that analysts recognize and properly evaluate forensic evidence that can assess the relative support for differing hypotheses about event reconstruction. With this in mind, the aim of this study was to determine whether quantitative eye-tracking data and qualitative reconstruction accuracy could be used to distinguish investigator expertise. To assess this, 32 participants were recruited and categorized as experts or trained novices based on their practical experience and educational background. Each volunteer then processed a mock crime scene while wearing a mobile eye tracker, and visual fixations, durations, search patterns, and reconstruction accuracy were evaluated. The eye-tracking data (dwell time and task percentage on areas of interest, or AOIs) were compared using the Earth Mover's Distance (EMD) and the Needleman-Wunsch (N-W) algorithm, revealing significant group differences in both search duration (EMD) and search sequence (N-W). More specifically, experts exhibited greater dissimilarity in search duration but greater similarity in search sequence than their novice counterparts. In addition to this quantitative assessment of examiner variability, each participant's reconstruction skill was assessed using a 22-point binary scoring system, which detected significant group differences in total reconstruction accuracy. This result, coupled with the fact that the study found no significant group difference in the total time needed to complete the investigation, indicates that experts are more efficient and effective. Finally, the results presented here provide a basis for continued research on the use of eye trackers to assess expertise in complex and distributed environments, including suggestions for future work and cautions regarding the degree to which visual attention can be used to infer cognitive understanding.
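To make the dwell-time comparison concrete, the sketch below computes a one-dimensional Earth Mover's (Wasserstein) distance between two observers' AOI dwell-time distributions. Treating the AOIs as points on a line, and all of the labels and values, are assumptions made purely for illustration, not the study's analysis code.

```python
# Hedged example: EMD between two dwell-time distributions over the same AOIs.
from scipy.stats import wasserstein_distance

aoi_positions = [0, 1, 2, 3]               # AOIs treated as points on a line
expert_dwell  = [0.40, 0.30, 0.20, 0.10]   # fraction of dwell time per AOI
novice_dwell  = [0.10, 0.20, 0.30, 0.40]

emd = wasserstein_distance(aoi_positions, aoi_positions,
                           u_weights=expert_dwell, v_weights=novice_dwell)
print("EMD between dwell-time distributions:", emd)
```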


Subject(s)
Attention , Crime , Fixation, Ocular , Forensic Sciences/methods , Visual Perception , Cognition , Discriminant Analysis , Humans , Professional Competence
8.
Behav Res Methods ; 49(3): 947-959, 2017 Jun.
Article in English | MEDLINE | ID: mdl-27383751

ABSTRACT

The precision of an eye-tracker is critical to the correct identification of eye movements and their properties. To measure a system's precision, artificial eyes (AEs) are often used so that eye movements cannot influence the measurements. A possible issue, however, is that it is virtually impossible to construct AEs complex enough to fully represent the human eye. To examine the consequences of this limitation, we tested currently used AEs from three eye-tracker manufacturers and compared them to a more complex model, using 12 commercial eye-trackers. Because precision can be measured in various ways, we compared different metrics in the spatial domain and analyzed power-spectral densities in the frequency domain. To assess how precision measurements compare between artificial and human eyes, we also measured precision from human recordings on the same eye-trackers. Our results show that the modified eye model presented here works with all eye-trackers tested and is a promising candidate for further development of a set of AEs with varying pupil size and pupil-iris contrast. The spectral analysis of both the AE and human data revealed that the human data contain frequency components that likely reflect the physiological characteristics of human eye movements. We also report the effects of sample-selection methods on precision calculations. This study is part of the EMRA/COGAIN Eye Data Quality Standardization Project.
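Precision in the spatial domain is commonly summarized with the sample-to-sample root-mean-square (RMS-S2S) deviation and with the standard deviation of the gaze position. Below is a minimal sketch of those two metrics, assuming gaze samples already expressed in degrees; it is a generic illustration, not the project's reference implementation.

```python
# Two common spatial precision metrics for an eye-tracking signal (degrees).
import numpy as np

def rms_s2s(x, y):
    """Root mean square of sample-to-sample angular displacement."""
    step = np.hypot(np.diff(x), np.diff(y))
    return np.sqrt(np.mean(step ** 2))

def std_precision(x, y):
    """Combined standard deviation of horizontal and vertical gaze position."""
    return np.hypot(np.std(x), np.std(y))
```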


Subject(s)
Eye Movements/physiology , Eye, Artificial/standards , Humans
9.
Artif Intell Med ; 62(2): 79-90, 2014 Oct.
Article in English | MEDLINE | ID: mdl-25174882

ABSTRACT

OBJECTIVES: Extracting the useful visual clues from medical images that allow accurate diagnoses requires physicians' domain knowledge, acquired through years of systematic study and clinical training. This is especially true in dermatology, a medical specialty that requires physicians to have image-inspection experience. Automating, or at least aiding, such efforts requires understanding physicians' reasoning processes and their use of domain knowledge. Mining physicians' references to medical concepts in narratives produced during image-based diagnosis of a disease is an interesting research topic that can help reveal experts' reasoning processes. It can also be a useful resource for designing information technologies for image use and for image case-based medical education systems.

METHODS AND MATERIALS: We collected data for analyzing physicians' diagnostic reasoning processes by conducting an experiment that recorded their spoken descriptions during inspection of dermatology images. In this paper we focus on the benefit of physicians' spoken descriptions and provide a general workflow for mining medical domain knowledge based on linguistic data from these narratives. The challenge of a medical image case can influence the accuracy of the diagnosis as well as how physicians pursue the diagnostic process. Accordingly, we define two lexical metrics for physicians' narratives, the lexical consensus score and the top-N relatedness score, and evaluate their usefulness by assessing the diagnostic challenge levels of the corresponding medical images. We also report on clustering medical images based on anchor concepts obtained from physicians' medical term usage. These analyses are based on physicians' spoken narratives preprocessed with the Unified Medical Language System for detecting medical concepts.

RESULTS: The image rankings based on the lexical consensus score and on the top-1 relatedness score are well correlated with those based on challenge levels (Spearman correlation > 0.5 and Kendall correlation > 0.4). Clustering results are substantially improved by our anchor-concept method (accuracy > 70% and mutual information > 80%).

CONCLUSIONS: Physicians' spoken narratives are valuable for mining the domain knowledge that physicians use in medical image inspection. We also show that the semantic metrics introduced in the paper can be successfully applied to medical image understanding, and we discuss additional uses of these metrics.
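The rank correlations reported in the results can be reproduced in a couple of lines; the sketch below uses SciPy on invented rankings purely to illustrate the comparison, not the paper's data.

```python
# Toy example: correlate a ranking by a lexical metric with a ranking by
# challenge level. The rankings here are made up.
from scipy.stats import spearmanr, kendalltau

rank_by_consensus = [1, 2, 3, 4, 5, 6]
rank_by_challenge = [2, 1, 3, 5, 4, 6]

rho, _ = spearmanr(rank_by_consensus, rank_by_challenge)
tau, _ = kendalltau(rank_by_consensus, rank_by_challenge)
print(f"Spearman rho = {rho:.2f}, Kendall tau = {tau:.2f}")
```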


Subject(s)
Data Mining , Diagnostic Imaging , Linguistics , Humans
10.
Exp Brain Res ; 217(1): 125-36, 2012 Mar.
Article in English | MEDLINE | ID: mdl-22183755

ABSTRACT

In the natural world, the brain must handle inherent delays in visual processing. This is a problem particularly during dynamic tasks. A possible solution to visuo-motor delays is prediction of a future state of the environment based on the current state and on properties of the environment learned from experience. Prediction is well known to occur in both saccades and pursuit movements and is likely to depend on some kind of internal visual model. However, most evidence comes from controlled laboratory studies using simple paradigms. In this study, we examine eye movements made in the context of demanding natural behavior, while playing squash. We show that prediction is a pervasive component of gaze behavior in this context. We show in addition that these predictive movements are extraordinarily precise and operate continuously in time across multiple trajectories and multiple movements. This suggests that prediction is based on complex dynamic visual models of the way that balls move, accumulated over extensive experience. Since eye, head, arm, and body movements all co-occur, it seems likely that a common internal model of predicted visual state is shared by different effectors to allow flexible coordination patterns. It is generally agreed that internal models are responsible for predicting future sensory state for the control of body movements. The present work suggests that model-based prediction is likely to be a pervasive component of natural gaze control as well.


Subject(s)
Movement/physiology , Pursuit, Smooth/physiology , Saccades/physiology , Vision, Ocular/physiology , Adult , Humans
11.
AMIA Annu Symp Proc ; : 962, 2008 Nov 06.
Article in English | MEDLINE | ID: mdl-18999126

ABSTRACT

Clinical decision support systems (CDSS) assist physicians and other medical professionals in tasks such as differential diagnosis. End users may use different decision-making strategies depending on their medical training. The study of eye movements reveals information-processing strategies that are executed at a level below consciousness. Eye tracking of physician assistant students and medical residents while they used a visual diagnostic CDSS in diagnostic tasks showed that they adopted distinct strategies, and the results informed recommendations for effective user interface design.


Subject(s)
Decision Support Systems, Clinical , Decision Support Techniques , Eye Movements , Software Design , Software , Task Performance and Analysis , User-Computer Interface , Humans , New York
12.
Perception ; 37(1): 34-49, 2008.
Article in English | MEDLINE | ID: mdl-18399246

ABSTRACT

Spatial memory is usually better for iconic than for verbal material. Our aim was to assess whether this effect is related to the way iconic and verbal targets are viewed when people have to memorize their locations. Eye movements were recorded while participants memorized the locations of images or words. Images received fewer, but longer, gazes than words. Longer gazes on images might reflect greater attention devoted to images because of their higher sensory distinctiveness and/or the generation, for images, of an additional phonological code beyond the immediately available visual code. We found that words were scanned mainly from left to right, whereas a more heterogeneous scanning strategy characterized the encoding of images. This suggests that iconic configurations tend to be maintained as global, integrated representations in which all the item/location pairs are simultaneously present, whereas verbal configurations are maintained through more sequential processes.


Subject(s)
Eye Movements/physiology , Memory/physiology , Visual Perception/physiology , Adult , Computer Graphics , Female , Humans , Male , Photography , Psychophysics , Serial Learning , Verbal Learning
13.
Am Educ Res J ; 42(4): 727-761, 2005.
Article in English | MEDLINE | ID: mdl-16628250

ABSTRACT

This study examined visual information processing and learning in classrooms including both deaf and hearing students. Of particular interest were the effects on deaf students' learning of live (three-dimensional) versus video-recorded (two-dimensional) sign language interpreting and the visual attention strategies of more and less experienced deaf signers exposed to simultaneous, multiple sources of visual information. Results from three experiments consistently indicated no differences in learning between three-dimensional and two-dimensional presentations among hearing or deaf students. Analyses of students' allocation of visual attention and the influence of various demographic and experimental variables suggested considerable flexibility in deaf students' receptive communication skills. Nevertheless, the findings also revealed a robust advantage in learning in favor of hearing students.

14.
J Vis ; 3(1): 49-63, 2003.
Article in English | MEDLINE | ID: mdl-12678625

ABSTRACT

This paper investigates the temporal dependencies of natural vision by measuring eye and hand movements while subjects made a sandwich. The phenomenon of change blindness suggests these temporal dependencies might be limited. Our observations are largely consistent with this, suggesting that much natural vision can be accomplished with "just-in-time" representations. However, we also observe several aspects of performance that point to the need for some representation of the spatial structure of the scene that is built up over different fixations. Patterns of eye-hand coordination and fixation sequences suggest the need for planning and coordinating movements over a period of a few seconds. This planning must be in a coordinate frame that is independent of eye position, and thus requires a representation of the spatial structure in a scene that is built up over different fixations.


Subject(s)
Fixation, Ocular/physiology , Hand/physiology , Memory/physiology , Psychomotor Performance/physiology , Saccades/physiology , Hand/innervation , Humans