Results 1 - 20 of 3,708
1.
J Neuroeng Rehabil ; 21(1): 72, 2024 May 03.
Article in English | MEDLINE | ID: mdl-38702705

ABSTRACT

BACKGROUND: Neurodegenerative diseases, such as Parkinson's disease (PD), necessitate frequent clinical visits and monitoring to identify changes in motor symptoms and provide appropriate care. By applying machine learning techniques to video data, automated video analysis has emerged as a promising approach to track and analyze motor symptoms, which could facilitate more timely intervention. However, existing solutions often rely on specialized equipment and recording procedures, which limits their usability in unstructured settings like the home. In this study, we developed a method to detect PD symptoms from unstructured videos of clinical assessments, without the need for specialized equipment or recording procedures. METHODS: Twenty-eight individuals with Parkinson's disease completed a video-recorded motor examination that included the finger-to-nose and hand pronation-supination tasks. Clinical staff provided ground truth scores for the level of Parkinsonian symptoms present. For each video, we used a pre-existing model called PIXIE to measure the location of several joints on the person's body and quantify how they were moving. Features derived from the joint angles and trajectories, designed to be robust to recording angle, were then used to train two types of machine-learning classifiers (random forests and support vector machines) to detect the presence of PD symptoms. RESULTS: The support vector machine trained on the finger-to-nose task had an F1 score of 0.93 while the random forest trained on the same task yielded an F1 score of 0.85. The support vector machine and random forest trained on the hand pronation-supination task had F1 scores of 0.20 and 0.33, respectively. CONCLUSION: These results demonstrate the feasibility of developing video analysis tools to track motor symptoms across variable perspectives. These tools do not work equally well for all tasks, however. This technology has the potential to overcome barriers to access for many individuals with degenerative neurological diseases like PD, providing them with a more convenient and timely method to monitor symptom progression, without requiring a structured video recording procedure. Ultimately, more frequent and objective home assessments of motor function could enable more precise telehealth optimization of interventions to improve clinical outcomes inside and outside of the clinic.
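As an illustration of the final modelling step only (not the authors' code), the sketch below trains the two classifier types named in the abstract on a placeholder feature matrix with scikit-learn and reports cross-validated F1 scores; the PIXIE-based feature extraction is omitted, and the array shapes and labels are assumptions.

```python
# Minimal sketch: SVM and random-forest detection of PD symptoms from
# pose-derived features. X and y are synthetic placeholders standing in for
# per-video features and clinician ground-truth labels.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(28, 12))      # one feature vector per recorded video (placeholder)
y = rng.integers(0, 2, size=28)    # 1 = Parkinsonian symptom present (placeholder)

for name, clf in [("SVM", SVC(kernel="rbf", C=1.0)),
                  ("Random forest", RandomForestClassifier(n_estimators=200, random_state=0))]:
    scores = cross_val_score(clf, X, y, cv=5, scoring="f1")
    print(f"{name}: mean F1 = {scores.mean():.2f}")
```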


Subject(s)
Machine Learning , Parkinson Disease , Video Recording , Humans , Parkinson Disease/diagnosis , Parkinson Disease/physiopathology , Male , Female , Video Recording/methods , Middle Aged , Aged , Support Vector Machine
2.
PeerJ ; 12: e17091, 2024.
Article in English | MEDLINE | ID: mdl-38708339

ABSTRACT

Monitoring the diversity and distribution of species in an ecosystem is essential to assess the success of restoration strategies. Implementing biomonitoring methods, which provide a comprehensive assessment of species diversity and mitigate biases in data collection, holds significant importance in biodiversity research. Additionally, ensuring that these methods are cost-efficient and require minimal effort is crucial for effective environmental monitoring. In this study we compare the species-detection efficiency, cost, and effort of two non-destructive sampling techniques for surveying marine vertebrate species: Baited Remote Underwater Video (BRUV) and environmental DNA (eDNA) metabarcoding. Comparisons were conducted along the Sussex coast upon the introduction of the Nearshore Trawling Byelaw. This Byelaw aims to boost the recovery of the dense kelp beds and the associated biodiversity that existed in the 1980s. We show that, overall, BRUV surveys are more affordable than eDNA; however, eDNA detects almost three times as many species as BRUV. eDNA and BRUV surveys are comparable in terms of effort required for each method, unless eDNA analysis is carried out externally, in which case eDNA requires less effort for the lead researchers. Furthermore, we show that increased eDNA replication yields more informative results on community structure. We found that using both methods in conjunction provides a more complete view of biodiversity, with BRUV data supplementing eDNA monitoring by recording species missed by eDNA and by providing additional environmental and life history metrics. The results from this study will serve as a baseline of the marine vertebrate community in Sussex Bay, allowing future biodiversity monitoring research projects to understand community structure as the ecosystem recovers following the removal of trawling fishing pressure. Although this study was regional, the findings presented herein have relevance to marine biodiversity and conservation monitoring programs around the globe.


Subject(s)
Biodiversity , DNA, Environmental , Environmental Monitoring , DNA, Environmental/analysis , DNA, Environmental/genetics , Animals , Environmental Monitoring/methods , Aquatic Organisms/genetics , Video Recording/methods , Ecosystem , DNA Barcoding, Taxonomic/methods
3.
BMJ Open Qual ; 13(2)2024 May 15.
Article in English | MEDLINE | ID: mdl-38749540

ABSTRACT

Video review (VR) of procedures in the medical environment can be used to drive quality improvement. However, first it has to be implemented in a safe and effective way. Our primary objective was to (re)define a guideline for implementing interprofessional VR in a neonatal intensive care unit (NICU). Our secondary objective was to determine the rate of acceptance by providers attending VR. For 9 months, VR sessions were evaluated with a study group consisting of different stakeholders. A questionnaire was embedded at the end of each session to obtain feedback from providers on the session and on the safe learning environment. In consensus meetings, success factors and preconditions were identified and grouped into factors that influenced the rate of adoption of VR. The number of providers who recorded procedures and attended VR sessions was determined. A total of 18 VR sessions were organised, with an equal distribution of medical and nursing staff. After the 9-month period, 101/125 (81%) of all providers working in the NICU attended at least one session and 80/125 (64%) of all providers recorded their performance of a procedure at least once. In total, 179/297 (61%) providers completed the questionnaire. Almost all providers (99%) reported having a positive opinion about the review sessions. Preconditions and success factors related to implementation were identified and addressed, including improving the pathway for obtaining consent, preparation of VR, defining the role of the chair during the session and building a safe learning environment. Different strategies were developed to ensure findings from sessions were used for quality improvement. VR was successfully implemented in our NICU and we redefined our guideline with various preconditions and success factors. The adjusted guideline can be helpful for implementation of VR in emergency care settings.


Subject(s)
Intensive Care Units, Neonatal , Quality Improvement , Video Recording , Humans , Intensive Care Units, Neonatal/organization & administration , Intensive Care Units, Neonatal/statistics & numerical data , Intensive Care Units, Neonatal/standards , Surveys and Questionnaires , Infant, Newborn , Video Recording/methods , Video Recording/statistics & numerical data , Health Services Research/methods
4.
Sci Rep ; 14(1): 11774, 2024 05 23.
Article in English | MEDLINE | ID: mdl-38783018

ABSTRACT

The aim of this study was to develop and assess a deep learning (DL) pipeline that learns dynamic MR image reconstruction from publicly available natural videos (Inter4K). Learning was performed for a range of DL architectures (VarNet, 3D UNet, FastDVDNet) and corresponding sampling patterns (Cartesian, radial, spiral), either from true multi-coil cardiac MR data (N = 692) or from synthetic MR data simulated from Inter4K natural videos (N = 588). Real-time undersampled dynamic MR images were reconstructed using DL networks trained with cardiac data and natural videos, and with compressed sensing (CS). Differences were assessed in simulations (N = 104 datasets) in terms of MSE, PSNR, and SSIM, and prospectively for cardiac cine (short axis, four chambers, N = 20) and speech cine (N = 10) data in terms of subjective image quality ranking, SNR, and edge sharpness. Friedman chi-square tests with post hoc Nemenyi analysis were performed to assess statistical significance. In simulated data, DL networks trained with cardiac data outperformed DL networks trained with natural videos, both of which outperformed CS (p < 0.05). However, in prospective experiments DL reconstructions using both training datasets were ranked similarly (and higher than CS) and presented no statistical differences in SNR and edge sharpness for most conditions. The developed pipeline enabled learning dynamic MR reconstruction from natural videos, preserving DL reconstruction advantages such as high-quality fast and ultra-fast reconstructions while overcoming some limitations (data scarcity or sharing). The natural video dataset, code, and pre-trained networks are made readily available on GitHub.
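For reference only (not the paper's evaluation code), the short sketch below computes the three simulation metrics listed above (MSE, PSNR, SSIM) with scikit-image on placeholder arrays standing in for a fully sampled frame and its reconstruction.

```python
# Minimal sketch of the image-quality comparison step; both arrays are placeholders.
import numpy as np
from skimage.metrics import mean_squared_error, peak_signal_noise_ratio, structural_similarity

ref = np.random.rand(192, 192).astype(np.float32)                  # fully sampled frame (placeholder)
recon = ref + 0.05 * np.random.randn(192, 192).astype(np.float32)  # DL or CS reconstruction (placeholder)

print("MSE :", mean_squared_error(ref, recon))
print("PSNR:", peak_signal_noise_ratio(ref, recon, data_range=1.0))
print("SSIM:", structural_similarity(ref, recon, data_range=1.0))
```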


Subject(s)
Deep Learning , Image Processing, Computer-Assisted , Magnetic Resonance Imaging , Humans , Image Processing, Computer-Assisted/methods , Magnetic Resonance Imaging/methods , Heart/diagnostic imaging , Video Recording/methods , Magnetic Resonance Imaging, Cine/methods
6.
Sensors (Basel) ; 24(9)2024 Apr 23.
Article in English | MEDLINE | ID: mdl-38732772

ABSTRACT

In mobile eye-tracking research, the automatic annotation of fixation points is an important yet difficult task, especially in varied and dynamic environments such as outdoor urban landscapes. This complexity is increased by the constant movement and dynamic nature of both the observer and their environment in urban spaces. This paper presents a novel approach that integrates the capabilities of two foundation models, YOLOv8 and Mask2Former, as a pipeline to automatically annotate fixation points without requiring additional training or fine-tuning. Our pipeline leverages YOLO's extensive training on the MS COCO dataset for object detection and Mask2Former's training on the Cityscapes dataset for semantic segmentation. This integration not only streamlines the annotation process but also improves accuracy and consistency, ensuring reliable annotations, even in complex scenes with multiple objects side by side or at different depths. Validation through two experiments showcases its efficiency, achieving 89.05% accuracy in a controlled data collection and 81.50% accuracy in a real-world outdoor wayfinding scenario. With an average runtime per frame of 1.61 ± 0.35 s, our approach stands as a robust solution for automatic fixation annotation.
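As a hedged sketch of one stage of such a pipeline (not the authors' implementation), the snippet below assigns a fixation point the label of the YOLOv8 detection that contains it, using the public ultralytics API; the Mask2Former segmentation fallback and the eye-tracker I/O are omitted, and the file path and gaze coordinates are placeholders.

```python
# Label a fixation point with the class of the detection box that contains it.
import cv2
from ultralytics import YOLO

model = YOLO("yolov8n.pt")               # COCO-pretrained detector
frame = cv2.imread("scene_frame.jpg")    # one scene-camera frame (placeholder path)
gx, gy = 640.0, 360.0                    # fixation point in pixel coordinates (placeholder)

result = model(frame, verbose=False)[0]
label = "unlabelled"
for box, cls in zip(result.boxes.xyxy.tolist(), result.boxes.cls.tolist()):
    x1, y1, x2, y2 = box
    if x1 <= gx <= x2 and y1 <= gy <= y2:
        label = model.names[int(cls)]
        break
print("fixation label:", label)
```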


Subject(s)
Eye-Tracking Technology , Fixation, Ocular , Humans , Fixation, Ocular/physiology , Video Recording/methods , Algorithms , Eye Movements/physiology
7.
Sci Rep ; 14(1): 12432, 2024 05 30.
Article in English | MEDLINE | ID: mdl-38816459

ABSTRACT

The advent of Artificial Intelligence (AI)-based object detection technology has made it possible to identify the position coordinates of surgical instruments from videos. This study aimed to find kinematic differences by surgical skill level. An AI algorithm was developed to identify X and Y coordinates of surgical instrument tips accurately from video. Kinematic analysis, including fluctuation analysis, was performed on 18 laparoscopic distal gastrectomy videos from three expert and three novice surgeons (3 videos/surgeon, 11.6 h, 1,254,010 frames). The analysis showed that the expert surgeon cohort moved more efficiently and regularly, with significantly shorter operation time and total travel distance. Instrument tip movement did not differ in velocity, acceleration, or jerk between skill levels. The fluctuation evaluation index β was significantly higher in experts. A ROC curve cutoff value of 1.4 yielded a sensitivity and specificity of 77.8% for distinguishing experts from novices. Despite the small sample, this study suggests that AI-based object detection with fluctuation analysis is promising, because skill evaluation can be calculated in real time, with potential for peri-operational evaluation.
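The kinematic quantities named above can be illustrated with a short sketch of our own (not the study's algorithm): given a hypothetical instrument-tip trajectory sampled at a known frame rate, total travel distance, velocity, acceleration, and jerk follow from successive differencing.

```python
# Basic tip kinematics from a placeholder (x, y) trajectory.
import numpy as np

fps = 30.0                                           # assumed video frame rate
tip = np.cumsum(np.random.randn(1000, 2), axis=0)    # placeholder tip trajectory (pixels)

step = np.diff(tip, axis=0)                          # per-frame displacement
total_travel = np.linalg.norm(step, axis=1).sum()    # total travel distance
velocity = np.linalg.norm(step, axis=1) * fps
acceleration = np.diff(velocity) * fps
jerk = np.diff(acceleration) * fps

print(f"total travel: {total_travel:.1f} px, mean speed: {velocity.mean():.1f} px/s")
```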


Subject(s)
Artificial Intelligence , Clinical Competence , Gastrectomy , Laparoscopy , Laparoscopy/methods , Humans , Gastrectomy/methods , Video Recording/methods , Male , Female , Algorithms , Biomechanical Phenomena , ROC Curve
8.
BMJ Open Qual ; 13(2)2024 May 21.
Article in English | MEDLINE | ID: mdl-38772882

ABSTRACT

BACKGROUND: An evaluation report for a pilot project on the use of video in medical emergency calls between the caller and medical operator indicates that video is only used in 4% of phone calls to the emergency medical communication centre (EMCC). Furthermore, the report found that in half of these cases, the use of video did not alter the assessment made by the medical operator at the EMCC. We aimed to describe the reasons for when and why medical operators choose to use or not use video in emergency calls. METHOD: The study was conducted in a Norwegian EMCC, employing a thematic analysis of notes from medical operators responding to emergency calls regarding the use of video. RESULT: Informants reported 19 cases where video was used and 46 cases where it was not used. When video was used, three main themes appeared: 'unclear situation or patient condition', 'visible problem' and 'children'. When video was not used, the following themes emerged: 'cannot be executed/technical problems', 'does not follow instructions', 'perceived as unnecessary'. Video was mostly used in cases where the medical operators were uncertain about the situation or the patients' conditions. CONCLUSION: The results indicate that medical operators were selective in choosing when to use video. In cases where operators employed video, it provided a better understanding of the situation, potentially enhancing the basis for decision-making.


Subject(s)
Video Recording , Humans , Norway , Video Recording/methods , Video Recording/statistics & numerical data , Male , Female , Pilot Projects , Adult , Emergency Medical Services/methods , Emergency Medical Services/statistics & numerical data , Emergency Medical Services/standards , Qualitative Research
9.
IEEE Trans Image Process ; 33: 3256-3270, 2024.
Article in English | MEDLINE | ID: mdl-38696298

ABSTRACT

Video-based referring expression comprehension is a challenging task that requires locating the referred object in each video frame of a given video. While many existing approaches treat this task as an object-tracking problem, their performance is heavily reliant on the quality of the tracking templates. Furthermore, when there is not enough annotation data to assist in template selection, the tracking may fail. Other approaches are based on object detection, but they often use only one adjacent frame of the key frame for feature learning, which limits their ability to establish the relationship between different frames. In addition, improving the fusion of features from multiple frames and referring expressions to effectively locate the referents remains an open problem. To address these issues, we propose a novel approach called the Multi-Stage Image-Language Cross-Generative Fusion Network (MILCGF-Net), which is based on one-stage object detection. Our approach includes a Frame Dense Feature Aggregation module for dense feature learning of adjacent time sequences. Additionally, we propose an Image-Language Cross-Generative Fusion module as the main body of multi-stage learning to generate cross-modal features by calculating the similarity between video and expression, and then refining and fusing the generated features. To further enhance the cross-modal feature generation capability of our model, we introduce a consistency loss that constrains the image-language similarity and language-image similarity matrices during feature generation. We evaluate our proposed approach on three public datasets and demonstrate its effectiveness through comprehensive experimental results.
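A generic, hedged sketch of the kind of consistency constraint described (not the MILCGF-Net code): two cross-modal similarity matrices produced by different generation directions are penalised for disagreeing after transposition. All layers, dimensions, and tensors below are placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

video = torch.randn(8, 256)      # fused frame features (placeholder)
text = torch.randn(8, 256)       # referring-expression features (placeholder)
gen_v2t = nn.Linear(256, 128)    # video-conditioned generated features (placeholder)
gen_t2v = nn.Linear(256, 128)    # language-conditioned generated features (placeholder)
proj_v = nn.Linear(256, 128)     # plain video projection (placeholder)
proj_t = nn.Linear(256, 128)     # plain text projection (placeholder)

# Image-to-language vs. language-to-image similarity matrices.
sim_i2l = F.normalize(gen_v2t(video), dim=-1) @ F.normalize(proj_t(text), dim=-1).t()
sim_l2i = F.normalize(gen_t2v(text), dim=-1) @ F.normalize(proj_v(video), dim=-1).t()

# Consistency loss: the two directions should agree up to a transpose.
consistency_loss = F.mse_loss(sim_i2l, sim_l2i.t())
print(consistency_loss.item())
```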


Subject(s)
Algorithms , Image Processing, Computer-Assisted , Video Recording , Video Recording/methods , Image Processing, Computer-Assisted/methods , Humans
10.
Sensors (Basel) ; 24(10)2024 May 16.
Article in English | MEDLINE | ID: mdl-38794023

ABSTRACT

Accelerometers worn by animals produce distinct behavioral signatures, which can be classified accurately using machine learning methods such as random forest decision trees. The objective of this study was to identify accelerometer signal separation among parsimonious behaviors. We achieved this objective by (1) describing functional differences in accelerometer signals among discrete behaviors, (2) identifying the optimal window size for signal pre-processing, and (3) demonstrating the number of observations required to achieve the desired level of model accuracy. Crossbred steers (Bos taurus indicus; n = 10) were fitted with GPS collars containing a video camera and tri-axial accelerometers (read-rate = 40 Hz). Distinct behaviors were apparent from the accelerometer signals, particularly grazing, because of its head-down posture. Increasing the smoothing window size to 10 s improved classification accuracy (p < 0.05), but reducing the number of observations below 50% resulted in a decrease in accuracy for all behaviors (p < 0.05). In-pasture observation increased accuracy and precision (0.05 and 0.08 percent, respectively) compared with animal-borne collar video observations.
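The two pre-processing choices examined above (smoothing window size and windowed features fed to a random forest) can be sketched as follows; this is a hedged illustration, not the study's pipeline, and the signal, window length, and behavior labels are placeholders.

```python
# Smooth tri-axial accelerometer data, window it, and fit a random forest.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

fs = 40                                                   # collar read-rate, Hz
acc = pd.DataFrame(np.random.randn(fs * 600, 3), columns=["x", "y", "z"])  # placeholder signal

window_s = 10                                             # smoothing window under test
smoothed = acc.rolling(window=fs * window_s, min_periods=1).mean()

# One feature row per non-overlapping window: mean and std of each axis.
frames = smoothed.to_numpy().reshape(-1, fs * window_s, 3)
features = np.hstack([frames.mean(axis=1), frames.std(axis=1)])
labels = np.random.choice(["grazing", "resting", "walking"], size=len(features))  # placeholder

clf = RandomForestClassifier(n_estimators=300, random_state=0).fit(features, labels)
print("training accuracy:", clf.score(features, labels))
```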


Subject(s)
Accelerometry , Behavior, Animal , Machine Learning , Animals , Cattle , Accelerometry/methods , Behavior, Animal/physiology , Video Recording/methods , Male , Signal Processing, Computer-Assisted
11.
Sensors (Basel) ; 24(8)2024 Apr 12.
Article in English | MEDLINE | ID: mdl-38676108

ABSTRACT

Egocentric activity recognition is a prominent computer vision task that is based on the use of wearable cameras. Since egocentric videos are captured from the perspective of the person wearing the camera, the wearer's body motions severely complicate the video content, imposing several challenges. In this work we propose a novel approach for domain-generalized egocentric human activity recognition. Typical approaches use a large amount of training data, aiming to cover all possible variants of each action. Moreover, several recent approaches have attempted to handle discrepancies between domains with a variety of costly and mostly unsupervised domain adaptation methods. In our approach we show that, through simple manipulation of available source domain data and with minor involvement from the target domain, we are able to produce robust models that adequately predict human activity in egocentric video sequences. To this end, we introduce a novel three-stream deep neural network architecture combining elements of vision transformers and residual neural networks which are trained using multi-modal data. We evaluate the proposed approach using a challenging egocentric video dataset and demonstrate its superiority over recent state-of-the-art research works.


Subject(s)
Neural Networks, Computer , Video Recording , Humans , Video Recording/methods , Algorithms , Pattern Recognition, Automated/methods , Image Processing, Computer-Assisted/methods , Human Activities , Wearable Electronic Devices
12.
Sensors (Basel) ; 24(8)2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38676235

ABSTRACT

Most human emotion recognition methods largely depend on classifying stereotypical facial expressions that represent emotions. However, such facial expressions do not necessarily correspond to actual emotional states and may correspond to communicative intentions. In other cases, emotions are hidden, cannot be expressed, or may have lower arousal manifested by less pronounced facial expressions, as may occur during passive video viewing. This study improves an emotion classification approach developed in a previous study, which classifies emotions remotely without relying on stereotypical facial expressions or contact-based methods, using short facial video data. In this approach, we aim to remotely sense transdermal cardiovascular spatiotemporal facial patterns associated with different emotional states and analyze these data via machine learning. In this paper, we propose several improvements, which include a better remote heart rate estimation via a preliminary skin segmentation, improvement of the heartbeat peaks and troughs detection process, and better emotion classification accuracy obtained by employing an appropriate deep learning classifier using only data from a regular RGB camera. We used the dataset obtained in the previous study, which contains facial videos of 110 participants who passively viewed 150 short videos that elicited the following five emotion types: amusement, disgust, fear, sexual arousal, and no emotion, while three cameras with different wavelength sensitivities (visible spectrum, near-infrared, and longwave infrared) recorded them simultaneously. From the short facial videos, we extracted unique high-resolution spatiotemporal, physiologically affected features and examined them as input features with different deep-learning approaches. An EfficientNet-B0 model was able to classify participants' emotional states with an overall average accuracy of 47.36% using a single input spatiotemporal feature map obtained from a regular RGB camera.
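As a hedged sketch of the classifier stage only (the remote pulse extraction and feature-map construction are not shown), the snippet below adapts a torchvision EfficientNet-B0 to the five emotion classes and runs it on placeholder feature-map tensors.

```python
import torch
import torch.nn as nn
from torchvision.models import efficientnet_b0

model = efficientnet_b0(weights=None)
model.classifier[1] = nn.Linear(model.classifier[1].in_features, 5)  # five emotion classes

feature_maps = torch.randn(4, 3, 224, 224)   # batch of spatiotemporal feature maps (placeholder)
logits = model(feature_maps)
print(logits.shape)                          # torch.Size([4, 5])
```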


Subject(s)
Deep Learning , Emotions , Facial Expression , Heart Rate , Humans , Emotions/physiology , Heart Rate/physiology , Video Recording/methods , Image Processing, Computer-Assisted/methods , Face/physiology , Female , Male
13.
Sci Rep ; 14(1): 9481, 2024 04 25.
Article in English | MEDLINE | ID: mdl-38664466

ABSTRACT

In demersal trawl fisheries, the unavailability of catch information until the end of the catching process is a drawback, leading to seabed impacts and bycatch and reducing the economic performance of the fisheries. The emergence of in-trawl cameras to observe catches in real time can provide such information. These data need to be processed in real time to determine catch compositions and rates, ultimately improving the sustainability and economic performance of the fisheries. In this study, a real-time underwater video processing system counting the Nephrops individuals entering the trawl was developed using object detection and tracking methods on an edge device (NVIDIA Jetson AGX Orin). Seven state-of-the-art YOLO models were tested to identify the most appropriate training settings and YOLO model. To achieve real-time processing and accurate counting simultaneously, four frame-skipping strategies were evaluated. We show that the adaptive frame-skipping approach, together with the YOLOv8s model, can increase the processing speed to up to 97.47 FPS while achieving a correct count rate of 82.57% and an F-score of 0.86. In conclusion, this system can improve the sustainability of the Nephrops-directed trawl fishery by providing catch information in real time.
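The frame-skipping idea can be sketched in a hedged way as follows (this is not the published system): the detector runs only on every k-th frame, and k adapts to how far inference has fallen behind real time. The weights path, video path, and target frame rate are assumptions, and the tracker and counting logic are omitted.

```python
import time
import cv2
from ultralytics import YOLO

model = YOLO("nephrops_yolov8s.pt")          # hypothetical trained weights
cap = cv2.VideoCapture("trawl_clip.mp4")     # in-trawl footage (placeholder path)
target_fps, skip, frame_idx = 25.0, 1, 0

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    frame_idx += 1
    if frame_idx % skip:
        continue                             # skip this frame entirely
    t0 = time.time()
    detections = model(frame, verbose=False)[0].boxes
    # Adapt the skip factor: process fewer frames when inference is too slow.
    skip = max(1, int((time.time() - t0) * target_fps))
cap.release()
```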


Subject(s)
Fisheries , Animals , Video Recording/methods , Fishes/physiology , Image Processing, Computer-Assisted/methods , Algorithms , Models, Theoretical
14.
Neural Netw ; 175: 106319, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38640698

ABSTRACT

To enhance deep learning-based automated interictal epileptiform discharge (IED) detection, this study proposes a multimodal method, vEpiNet, that leverages video and electroencephalogram (EEG) data. The datasets comprise 24,931 IED (from 484 patients) and 166,094 non-IED 4-second video-EEG segments. The video data are processed by the proposed patient detection method, with frame difference and Simple Keypoints (SKPS) capturing patients' movements. EEG data are processed with EfficientNetV2. The video and EEG features are fused via a multilayer perceptron. We developed a comparative model, termed nEpiNet, to test the effectiveness of the video feature in vEpiNet. Ten-fold cross-validation was used for testing and showed high areas under the receiver operating characteristic curve (AUROC) in both models, with a slightly higher AUROC for vEpiNet (0.9902) than for nEpiNet (0.9878). Moreover, to test model performance in real-world scenarios, we set up a prospective test dataset containing 215 h of raw video-EEG data from 50 patients. The results show that vEpiNet achieves an area under the precision-recall curve (AUPRC) of 0.8623, surpassing nEpiNet's 0.8316. Incorporating video data raises precision from 70% (95% CI, 69.8%-70.2%) to 76.6% (95% CI, 74.9%-78.2%) at 80% sensitivity and reduces false positives by nearly a third, with vEpiNet processing one hour of video-EEG data in 5.7 min on average. Our findings indicate that video data can significantly improve the performance and precision of IED detection, especially in prospective, real-world clinical testing. This suggests that vEpiNet is a clinically viable and effective tool for IED analysis in real-world applications.
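The fusion step described above (video and EEG features combined by a multilayer perceptron) can be sketched as follows; this is a hedged illustration, not vEpiNet itself, and the feature dimensions and layer sizes are assumptions.

```python
import torch
import torch.nn as nn

class FusionMLP(nn.Module):
    def __init__(self, video_dim=128, eeg_dim=1280):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(video_dim + eeg_dim, 256), nn.ReLU(),
            nn.Linear(256, 2),               # IED vs non-IED
        )

    def forward(self, video_feat, eeg_feat):
        return self.mlp(torch.cat([video_feat, eeg_feat], dim=-1))

model = FusionMLP()
logits = model(torch.randn(8, 128), torch.randn(8, 1280))  # placeholder branch outputs
print(logits.shape)                                        # torch.Size([8, 2])
```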


Subject(s)
Deep Learning , Electroencephalography , Epilepsy , Video Recording , Humans , Electroencephalography/methods , Video Recording/methods , Epilepsy/diagnosis , Epilepsy/physiopathology , Male , Female , Adult , Middle Aged , Adolescent , Neural Networks, Computer , Young Adult , Child
15.
J Biomech ; 168: 112078, 2024 May.
Article in English | MEDLINE | ID: mdl-38663110

ABSTRACT

This study explored the potential of reconstructing the 3D motion of a swimmer's hands with accuracy and consistency using action sport cameras (ASC) distributed in-air and underwater. To record at least two stroke cycles of an athlete performing a front crawl task, the cameras were calibrated to cover an acquisition volume of approximately 3 m in X, 8 m in Y, and 3.5 m in Z. Camera calibration was attained by applying bundle adjustment in both environments. A testing wand carrying two markers was acquired to evaluate the three-dimensional (3D) reconstruction accuracy in-air, underwater, and over the water transition. The global 3D accuracy (mean absolute error) was less than 1.5 mm. The standard error of measurement and the coefficient of variation were smaller than 1 mm and 1%, respectively, revealing that the camera calibration procedure was highly repeatable. No significant correlation was observed between the error magnitude (percentage error during the test and retest sessions: 1.2 to 0.8%) and the transition from in-air to underwater. The feasibility of the hand motion reconstruction was demonstrated by recording five swimmers during the front crawl stroke, in three different tasks performed at increasing efforts. Intra-class correlation confirmed optimal agreement (ICC > 0.90) among repeated stroke cycles of the same swimmer, irrespective of task effort. Skewness, close to 0, and kurtosis, close to 3.5, supported the hypothesis of negligible effects of the calibration and tracking errors on the motion and speed patterns. In conclusion, we may argue that ASCs, equipped with a robust bundle adjustment camera calibration technique, ensure reliable reconstruction of swimming motion in large in-air and underwater volumes.
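The accuracy check described (comparing the reconstructed distance between the two wand markers with the known wand length) can be sketched as below; the 3D coordinates and wand length are placeholders, and the calibration and reconstruction themselves are not shown.

```python
import numpy as np

wand_length_mm = 500.0                                    # assumed known marker separation
marker_a = np.random.randn(300, 3) * 10                   # reconstructed 3D positions (placeholder)
marker_b = marker_a + np.array([wand_length_mm, 0.0, 0.0]) + np.random.randn(300, 3)

measured = np.linalg.norm(marker_a - marker_b, axis=1)
mae = np.mean(np.abs(measured - wand_length_mm))          # mean absolute error
cv = measured.std() / measured.mean() * 100               # coefficient of variation, %
print(f"MAE = {mae:.2f} mm, CV = {cv:.2f}%")
```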


Subject(s)
Swimming , Humans , Swimming/physiology , Biomechanical Phenomena , Male , Imaging, Three-Dimensional/methods , Feasibility Studies , Video Recording/methods , Hand/physiology , Reproducibility of Results , Female , Calibration , Young Adult
16.
Ann Neurol ; 95(6): 1138-1148, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38624073

ABSTRACT

OBJECTIVE: The objective was to analyze seizure semiology in pediatric frontal lobe epilepsy patients, considering age, to localize the seizure onset zone for surgical resection in focal epilepsy. METHODS: Fifty patients were identified retrospectively, who achieved seizure freedom after frontal lobe resective surgery at Great Ormond Street Hospital. Video-electroencephalography recordings of preoperative ictal seizure semiology were analyzed, stratifying the data based on resection region (mesial or lateral frontal lobe) and age at surgery (≤4 vs >4). RESULTS: Pediatric frontal lobe epilepsy is characterized by frequent, short, complex seizures, similar to adult cohorts. Children with mesial onset had higher occurrence of head deviation (either direction: 55.6% vs 17.4%; p = 0.02) and contralateral head deviation (22.2% vs 0.0%; p = 0.03), ictal body-turning (55.6% vs 13.0%; p = 0.006; ipsilateral: 55.6% vs 4.3%; p = 0.0003), and complex motor signs (88.9% vs 56.5%; p = 0.037). Both age groups (≤4 and >4 years) showed hyperkinetic features (21.1% vs 32.1%), contrary to previous reports. The very young group showed more myoclonic (36.8% vs 3.6%; p = 0.005) and hypomotor features (31.6% vs 0.0%; p = 0.003), and fewer behavioral features (36.8% vs 71.4%; p = 0.03) and reduced responsiveness (31.6% vs 78.6%; p = 0.002). INTERPRETATION: This study presents the most extensive semiological analysis of children with confirmed frontal lobe epilepsy. It identifies semiological features that aid in differentiating between mesial and lateral onset. Despite age-dependent differences, typical frontal lobe features, including hyperkinetic seizures, are observed even in very young children. A better understanding of pediatric seizure semiology may enhance the accuracy of onset identification, and enable earlier presurgical evaluation, improving postsurgical outcomes. ANN NEUROL 2024;95:1138-1148.


Subject(s)
Electroencephalography , Epilepsy, Frontal Lobe , Seizures , Humans , Child , Male , Female , Epilepsy, Frontal Lobe/surgery , Epilepsy, Frontal Lobe/physiopathology , Epilepsy, Frontal Lobe/diagnosis , Child, Preschool , Electroencephalography/methods , Retrospective Studies , Adolescent , Seizures/physiopathology , Seizures/surgery , Seizures/diagnosis , Infant , Frontal Lobe/physiopathology , Video Recording/methods
17.
Epilepsia ; 65(5): 1176-1202, 2024 May.
Article in English | MEDLINE | ID: mdl-38426252

ABSTRACT

Computer vision (CV) shows increasing promise as an efficient, low-cost tool for video seizure detection and classification. Here, we provide an overview of the fundamental concepts needed to understand CV and summarize the structure and performance of various model architectures used in video seizure analysis. We conduct a systematic literature review of the PubMed, Embase, and Web of Science databases from January 1, 2000 to September 15, 2023, to identify the strengths and limitations of CV seizure analysis methods and discuss the utility of these models when applied to different clinical seizure phenotypes. Reviews, nonhuman studies, and those with insufficient or poor quality data are excluded from the review. Of the 1942 records identified, 45 meet inclusion criteria and are analyzed. We conclude that the field has shown tremendous growth over the past 2 decades, leading to several model architectures with impressive accuracy and efficiency. The rapid and scalable detection offered by CV models holds the potential to reduce sudden unexpected death in epilepsy and help alleviate resource limitations in epilepsy monitoring units. However, a lack of standardized, thorough validation measures and concerns about patient privacy remain important obstacles for widespread acceptance and adoption. Investigation into the performance of models across varied datasets from clinical and nonclinical environments is an essential area for further research.


Subject(s)
Seizures , Humans , Seizures/diagnosis , Seizures/classification , Electroencephalography/methods , Video Recording/methods
18.
IEEE J Biomed Health Inform ; 28(5): 3015-3028, 2024 May.
Article in English | MEDLINE | ID: mdl-38446652

ABSTRACT

The infant sleep-wake behavior is an essential indicator of physiological and neurological system maturity, the circadian transition of which is important for evaluating the recovery of preterm infants from inadequate physiological function and cognitive disorders. Recently, camera-based infant sleep-wake monitoring has been investigated, but the challenges of generalization caused by variance in infants and clinical environments have not been addressed for this application. In this paper, we conducted a multi-center clinical trial at four hospitals to improve the generalization of camera-based infant sleep-wake monitoring. Using the face videos of 64 term and 39 preterm infants recorded in NICUs, we proposed a novel sleep-wake classification strategy, called consistent deep representation constraint (CDRC), which forces the convolutional neural network (CNN) to make consistent predictions for samples from different conditions but with the same label, to address the variances caused by infants and environments. The clinical validation shows that by using CDRC, all CNN backbones obtain over 85% accuracy, sensitivity, and specificity in both the cross-age and cross-environment experiments, improving on the same backbones trained without CDRC by almost 15% in all metrics. This demonstrates that by improving the consistency of the deep representations of samples with the same state, we can significantly improve the generalization of infant sleep-wake classification.
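The idea behind CDRC can be sketched in a hedged, generic form: alongside the classification loss, penalise the distance between deep representations of two samples that share the same sleep/wake label but come from different infants or environments. The backbone, tensors, and loss weighting below are placeholders, not the published model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 128))  # placeholder CNN stand-in
head = nn.Linear(128, 2)                                             # sleep vs wake

x_a = torch.randn(16, 3, 64, 64)    # samples from condition A (placeholder)
x_b = torch.randn(16, 3, 64, 64)    # same-label samples from condition B (placeholder)
labels = torch.randint(0, 2, (16,))

z_a, z_b = backbone(x_a), backbone(x_b)
cls_loss = F.cross_entropy(head(z_a), labels) + F.cross_entropy(head(z_b), labels)
consistency = F.mse_loss(z_a, z_b)            # consistent deep representation term
loss = cls_loss + 0.1 * consistency           # weighting is an assumption
print(loss.item())
```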


Subject(s)
Intensive Care Units, Neonatal , Sleep , Video Recording , Humans , Infant, Newborn , Video Recording/methods , Sleep/physiology , Monitoring, Physiologic/methods , Male , Female , Infant, Premature/physiology , Neural Networks, Computer , Wakefulness/physiology , Infant , Image Processing, Computer-Assisted/methods
19.
Epilepsy Behav ; 154: 109735, 2024 May.
Article in English | MEDLINE | ID: mdl-38522192

ABSTRACT

Seizure events can manifest as transient disruptions in the control of movements which may be organized in distinct behavioral sequences, accompanied or not by other observable features such as altered facial expressions. The analysis of these clinical signs, referred to as semiology, is subject to observer variations when specialists evaluate video-recorded events in the clinical setting. To enhance the accuracy and consistency of evaluations, computer-aided video analysis of seizures has emerged as a natural avenue. In the field of medical applications, deep learning and computer vision approaches have driven substantial advancements. Historically, these approaches have been used for disease detection, classification, and prediction using diagnostic data; however, there has been limited exploration of their application in evaluating video-based motion detection in the clinical epileptology setting. While vision-based technologies do not aim to replace clinical expertise, they can significantly contribute to medical decision-making and patient care by providing quantitative evidence and decision support. Behavior monitoring tools offer several advantages such as providing objective information, detecting challenging-to-observe events, reducing documentation efforts, and extending assessment capabilities to areas with limited expertise. The main applications of these could be (1) improved seizure detection methods; (2) refined semiology analysis for predicting seizure type and cerebral localization. In this paper, we detail the foundation technologies used in vision-based systems in the analysis of seizure videos, highlighting their success in semiology detection and analysis, focusing on work published in the last 7 years. We systematically present these methods and indicate how the adoption of deep learning for the analysis of video recordings of seizures could be approached. Additionally, we illustrate how existing technologies can be interconnected through an integrated system for video-based semiology analysis. Each module can be customized and improved by adapting more accurate and robust deep learning approaches as these evolve. Finally, we discuss challenges and research directions for future studies.


Subject(s)
Deep Learning , Seizures , Video Recording , Humans , Seizures/diagnosis , Seizures/physiopathology , Video Recording/methods , Electroencephalography/methods
20.
Behav Res Methods ; 56(4): 3300-3314, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38448726

ABSTRACT

Eye movements offer valuable insights for clinical interventions, diagnostics, and understanding visual perception. The process usually involves recording a participant's eye movements and analyzing them in terms of various gaze events. Manual identification of these events is extremely time-consuming. Although the field has seen the development of automatic event detection and classification methods, these methods have primarily focused on distinguishing events when participants remain stationary. With increasing interest in studying gaze behavior in freely moving participants, such as during daily activities like walking, new methods are required to automatically classify events in data collected under unrestricted conditions. Existing methods often rely on additional information from depth cameras or inertial measurement units (IMUs), which are not typically integrated into mobile eye trackers. To address this challenge, we present a framework for classifying gaze events based solely on eye-movement signals and scene video footage. Our approach, the Automatic Classification of gaze Events in Dynamic and Natural Viewing (ACE-DNV), analyzes eye movements in terms of velocity and direction and leverages visual odometry to capture head and body motion. Additionally, ACE-DNV assesses changes in image content surrounding the point of gaze. We evaluate the performance of ACE-DNV using a publicly available dataset and showcase its ability to discriminate between gaze fixation, gaze pursuit, gaze following, and gaze shifting (saccade) events. ACE-DNV exhibited comparable performance to previous methods, while eliminating the necessity for additional devices such as IMUs and depth cameras. In summary, ACE-DNV simplifies the automatic classification of gaze events in natural and dynamic environments. The source code is accessible at https://github.com/arnejad/ACE-DNV.
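One ingredient of ACE-DNV, gaze velocity computed from the eye-movement signal, can be illustrated with a minimal sketch (ours, not the released code); the sampling rate, gaze trace, and velocity threshold are assumptions, and the visual-odometry and scene-content cues are not shown.

```python
import numpy as np

fs = 200.0                                                   # eye-tracker sampling rate, Hz (assumed)
gaze = np.cumsum(np.random.randn(2000, 2) * 0.05, axis=0)    # gaze angles in degrees (placeholder)

velocity = np.linalg.norm(np.diff(gaze, axis=0), axis=1) * fs   # deg/s
saccade_like = velocity > 100.0                              # common velocity threshold (assumption)
print("saccade-like samples:", int(saccade_like.sum()))
```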


Subject(s)
Eye Movements , Eye-Tracking Technology , Fixation, Ocular , Humans , Eye Movements/physiology , Fixation, Ocular/physiology , Visual Perception/physiology , Video Recording/methods , Male , Adult , Female