Search | VHL Regional Portal

Forecasting People Trajectories and Head Poses by Jointly Reasoning on Tracklets and Vislets.

Hasan, Irtiza; Setti, Francesco; Tsesmelis, Theodore; Belagiannis, Vasileios; Amin, Sikandar; Del Bue, Alessio; Cristani, Marco; Galasso, Fabio.

IEEE Trans Pattern Anal Mach Intell ; 43(4): 1267-1278, 2021 04.

Article in English | MEDLINE | ID: mdl-31670663

ABSTRACT

In this article, we explore the correlation between people's trajectories and their head orientations. We argue that trajectory and head pose forecasting can be modelled as a joint problem. Recent approaches to trajectory forecasting leverage short-term trajectories (aka tracklets) of pedestrians to predict their future paths. In addition, sociological cues, such as expected destination or pedestrian interaction, are often combined with tracklets. In this article, we propose MiXing-LSTM (MX-LSTM) to capture the interplay between positions and head orientations (vislets) thanks to a joint unconstrained optimization of full covariance matrices during the LSTM backpropagation. We additionally exploit the head orientations as a proxy for visual attention when modeling social interactions. MX-LSTM predicts future pedestrian locations and head poses, increasing the standard capabilities of current approaches on long-term trajectory forecasting. Compared to the state of the art, our approach shows better performance on an extensive set of public benchmarks. MX-LSTM is particularly effective when people move slowly, i.e., the most challenging scenario for all other models. The proposed approach also allows for accurate predictions on a longer time horizon.
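The abstract mentions an unconstrained optimization of full covariance matrices but does not say how they are parameterized. A minimal sketch of one common option follows: a 2x2 covariance is built from three unconstrained real numbers through a Cholesky-style factor, and a 2D Gaussian negative log-likelihood is evaluated against it. The function names and the exact parameterization are assumptions for illustration, not the authors' implementation.

```python
import math

def cov_from_unconstrained(a, b, c):
    """Build a 2x2 covariance from three unconstrained reals.

    Assumed parameterization: L = [[exp(a), 0], [b, exp(c)]] and
    cov = L @ L.T, which is symmetric positive definite for any a, b, c.
    """
    l11, l21, l22 = math.exp(a), b, math.exp(c)
    return [[l11 * l11, l11 * l21],
            [l11 * l21, l21 * l21 + l22 * l22]]

def gaussian_nll(point, mean, cov):
    """Negative log-likelihood of a 2D point under N(mean, cov)."""
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    sxx, sxy = cov[0]
    syy = cov[1][1]
    det = sxx * syy - sxy * sxy
    # Mahalanobis distance using the closed-form inverse of a 2x2 matrix.
    maha = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return 0.5 * maha + 0.5 * math.log(det) + math.log(2.0 * math.pi)
```

Because the exponentials keep the diagonal of the factor positive, a gradient-based optimizer can update `a`, `b`, and `c` freely without ever producing an invalid covariance.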

Real-time localization of articulated surgical instruments in retinal microsurgery.

Rieke, Nicola; Tan, David Joseph; Amat di San Filippo, Chiara; Tombari, Federico; Alsheakhali, Mohamed; Belagiannis, Vasileios; Eslami, Abouzar; Navab, Nassir.

Med Image Anal ; 34: 82-100, 2016 12.

Article in English | MEDLINE | ID: mdl-27237604

ABSTRACT

Real-time visual tracking of a surgical instrument holds great potential for improving the outcome of retinal microsurgery by enabling new possibilities for computer-aided techniques such as augmented reality and automatic assessment of instrument manipulation. Due to high magnification and illumination variations, retinal microsurgery images usually entail a high level of noise and appearance changes. As a result, real-time tracking of the surgical instrument remains challenging in in-vivo sequences. To overcome these problems, we present a method that builds on random forests and addresses the task by modelling the instrument as an articulated object. A multi-template tracker reduces the region of interest to a rectangular area around the instrument tip by relating the movement of the instrument to the induced changes on the image intensities. Within this bounding box, a gradient-based pose estimation infers the location of the instrument parts from image features. In this way, the algorithm provides not only the location of the instrument but also the positions of the tool tips in real time. Various experiments on a novel dataset comprising 18 in-vivo retinal microsurgery sequences demonstrate the robustness and generalizability of our method. The comparison on two publicly available datasets indicates that the algorithm can outperform the current state of the art.

Subject(s)

Algorithms , Microsurgery/methods , Retina/surgery , Surgery, Computer-Assisted/methods , Surgical Instruments , Humans

AggNet: Deep Learning From Crowds for Mitosis Detection in Breast Cancer Histology Images.

Albarqouni, Shadi; Baur, Christoph; Achilles, Felix; Belagiannis, Vasileios; Demirci, Stefanie; Navab, Nassir.

IEEE Trans Med Imaging ; 35(5): 1313-21, 2016 05.

Article in English | MEDLINE | ID: mdl-26891484

ABSTRACT

The lack of publicly available ground-truth data has been identified as the major challenge for transferring recent developments in deep learning to the biomedical imaging domain. Though crowdsourcing has enabled annotation of large-scale databases for real-world images, its application for biomedical purposes requires a deeper understanding and hence a more precise definition of the actual annotation task. The fact that expert tasks are being outsourced to non-expert users may lead to noisy annotations introducing disagreement between users. Although crowdsourcing is a valuable resource for learning annotation models, conventional machine-learning methods may have difficulties dealing with noisy annotations during training. In this manuscript, we present a new concept for learning from crowds that handles data aggregation directly as part of the learning process of the convolutional neural network (CNN) via an additional crowdsourcing layer (AggNet). Besides, we present an experimental study on learning from crowds designed to answer the following questions. 1) Can a deep CNN be trained with data collected from crowdsourcing? 2) How to adapt the CNN to train on multiple types of annotation datasets (ground truth and crowd-based)? 3) How does the choice of annotation and aggregation affect the accuracy? Our experimental setup involved Annot8, a self-implemented web platform based on the Crowdflower API realizing image annotation tasks for a publicly available biomedical image database. Our results give valuable insights into the functionality of deep CNN learning from crowd annotations and prove the necessity of data aggregation integration.
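For context on what "data aggregation" means here, the sketch below shows plain majority voting, a conventional way to merge noisy crowd labels before training. It is not the AggNet crowdsourcing layer, which the abstract describes as performing aggregation inside the network. The function name and the input layout (item id mapped to a non-empty list of votes) are assumptions for illustration.

```python
from collections import Counter

def majority_vote(votes_per_item):
    """Aggregate crowd labels per item by majority vote.

    votes_per_item maps an item id to a non-empty list of labels.
    Returns a dict mapping item id to (winning_label, agreement_ratio).
    Ties are resolved in favour of the label that was seen first.
    """
    aggregated = {}
    for item_id, votes in votes_per_item.items():
        label, count = Counter(votes).most_common(1)[0]
        aggregated[item_id] = (label, count / len(votes))
    return aggregated

# Hypothetical example: three annotators label two image patches
# (1 = mitosis, 0 = no mitosis).
labels = majority_vote({"patch_a": [1, 1, 0], "patch_b": [0, 0, 0]})
```

The agreement ratio gives a rough signal of annotation noise per item, which is the kind of disagreement between users the abstract refers to.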

Subject(s)

Breast Neoplasms/diagnostic imaging , Crowdsourcing/methods , Histocytochemistry , Image Interpretation, Computer-Assisted/methods , Mitosis/physiology , Neural Networks, Computer , Female , Humans , Internet , Machine Learning , Video Games

Automatic 3D reconstruction of electrophysiology catheters from two-view monoplane C-arm image sequences.

Baur, Christoph; Milletari, Fausto; Belagiannis, Vasileios; Navab, Nassir; Fallavollita, Pascal.

Int J Comput Assist Radiol Surg ; 11(7): 1319-28, 2016 Jul.

Article in English | MEDLINE | ID: mdl-26615429

ABSTRACT

PURPOSE: Catheter guidance is a vital task for the success of electrophysiology interventions. It is usually provided through fluoroscopic images that are taken intra-operatively. The cardiologists, who are typically equipped with C-arm systems, scan the patient from multiple views by rotating the fluoroscope around one of its axes. The resulting sequences allow the cardiologists to build a mental model of the 3D position of the catheters and interest points from the multiple views. METHOD: We describe and compare different 3D catheter reconstruction strategies and ultimately propose a novel and robust method for the automatic reconstruction of 3D catheters in non-synchronized fluoroscopic sequences. This approach does not rely purely on triangulation but incorporates prior knowledge about the catheters. In conjunction with an automatic detection method, we demonstrate the performance of our method compared to ground truth annotations. RESULTS: In our experiments that include 20 biplane datasets, we achieve an average reprojection error of 0.43 mm and an average reconstruction error of 0.67 mm compared to gold standard annotation. CONCLUSIONS: In clinical practice, catheters suffer from complex motion due to the combined effect of heartbeat and respiratory motion. As a result, any 3D reconstruction algorithm via triangulation is imprecise. We have proposed a new method that is fully automatic and highly accurate to reconstruct catheters in three dimensions.
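The reprojection error reported above is, in general terms, the distance between an annotated 2D point and the projection of the reconstructed 3D point into the same view. The sketch below shows that generic computation under a pinhole model with a 3x4 projection matrix. The function names and the matrix layout are assumptions, and the abstract does not state how the authors convert image distances to millimetres.

```python
import math

def project(P, X):
    """Project a 3D point X = (x, y, z) with a 3x4 projection matrix P."""
    Xh = (X[0], X[1], X[2], 1.0)  # homogeneous coordinates
    u, v, w = (sum(P[r][c] * Xh[c] for c in range(4)) for r in range(3))
    return (u / w, v / w)

def mean_reprojection_error(P, points_3d, points_2d):
    """Average Euclidean distance between projections and 2D annotations."""
    errors = []
    for X, observed in zip(points_3d, points_2d):
        px, py = project(P, X)
        errors.append(math.hypot(px - observed[0], py - observed[1]))
    return sum(errors) / len(errors)
```

The result is expressed in whatever units the 2D annotations use, so a pixel spacing would be needed to report it in millimetres.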

Subject(s)

Algorithms , Arrhythmias, Cardiac/diagnosis , Catheters , Electrophysiologic Techniques, Cardiac/methods , Fluoroscopy/methods , Heart/diagnostic imaging , Imaging, Three-Dimensional/methods , Electrophysiology , Humans , Motion

3D Pictorial Structures Revisited: Multiple Human Pose Estimation.

Belagiannis, Vasileios; Amin, Sikandar; Andriluka, Mykhaylo; Schiele, Bernt; Navab, Nassir; Ilic, Slobodan.

IEEE Trans Pattern Anal Mach Intell ; 38(10): 1929-42, 2016 10.

Article in English | MEDLINE | ID: mdl-26700970

ABSTRACT

We address the problem of 3D pose estimation of multiple humans from multiple views. The transition from single to multiple human pose estimation and from the 2D to 3D space is challenging due to a much larger state space, occlusions and across-view ambiguities when not knowing the identity of the humans in advance. To address these problems, we first create a reduced state space by triangulation of corresponding pairs of body parts obtained by part detectors for each camera view. In order to resolve ambiguities of wrong and mixed parts of multiple humans after triangulation and also those coming from false positive detections, we introduce a 3D pictorial structures (3DPS) model. Our model builds on multi-view unary potentials, while a prior model is integrated into pairwise and ternary potential functions. To balance the potentials' influence, the model parameters are learnt using a Structured SVM (SSVM). The model is generic and applicable to both single and multiple human pose estimation. To evaluate our model on single and multiple human pose estimation, we rely on four different datasets. We first analyse the contribution of the potentials and then compare our results with related work where we demonstrate superior performance.
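The reduced state space described above comes from triangulating pairs of corresponding 2D part detections. The abstract does not say which triangulation method is used, so the sketch below shows one simple option, the midpoint method: each detection is assumed to have been back-projected to a viewing ray (camera centre plus direction), and the 3D candidate is the midpoint of the shortest segment between the two rays. All names are hypothetical.

```python
def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def triangulate_midpoint(c1, d1, c2, d2, eps=1e-12):
    """Midpoint triangulation of two viewing rays.

    c1, c2: camera centres; d1, d2: ray directions (need not be unit length).
    Returns the midpoint of the closest points on the two rays.
    """
    w = [p - q for p, q in zip(c1, c2)]
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < eps:
        raise ValueError("rays are (nearly) parallel; no unique closest point")
    s = (b * e - c * d) / denom  # parameter along the first ray
    t = (a * e - b * d) / denom  # parameter along the second ray
    p1 = [ci + s * di for ci, di in zip(c1, d1)]
    p2 = [ci + t * di for ci, di in zip(c2, d2)]
    return tuple((u + v) / 2.0 for u, v in zip(p1, p2))
```

Triangulating every cross-view pair of detections this way also produces wrong candidates from mismatched people and false positives, which is the ambiguity the 3DPS model is introduced to resolve.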

Subject(s)

Algorithms , Imaging, Three-Dimensional , Posture , Humans

Fully automatic catheter localization in C-arm images using l1-sparse coding.

Milletari, Fausto; Belagiannis, Vasileios; Navab, Nassir; Fallavollita, Pascal.

Med Image Comput Comput Assist Interv ; 17(Pt 2): 570-7, 2014.

Article in English | MEDLINE | ID: mdl-25485425

ABSTRACT

We propose a method to perform automatic detection and tracking of electrophysiology (EP) catheters in C-arm fluoroscopy sequences. Our approach does not require any initialization, is completely automatic, and can concurrently track an arbitrary number of overlapping catheters. After a pre-processing step, we employ sparse coding to first detect candidate catheter tips, and subsequently detect and track the catheters. The proposed technique is validated on 2835 C-arm images, which include 39,690 manually selected ground-truth catheter electrodes. Results demonstrated sub-millimeter detection accuracy and real-time tracking performance.
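As background on l1-sparse coding, the sketch below solves the standard problem min over a of 0.5 * ||x - D a||^2 + lam * ||a||_1 with ISTA (iterative shrinkage-thresholding), a textbook solver. The abstract does not state which solver or dictionary the authors use, so the function names, the list-of-rows dictionary layout, and the step-size choice are all assumptions.

```python
def soft_threshold(v, t):
    """Shrink v towards zero by t (proximal operator of the l1 norm)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def ista(D, x, lam, n_iter=200):
    """Sparse code of signal x over dictionary D (rows = signal dimensions,
    columns = atoms). D must contain at least one non-zero entry."""
    n_rows, n_atoms = len(D), len(D[0])
    # The squared Frobenius norm upper-bounds the largest eigenvalue of
    # D^T D, so 1 / L is a safe (if conservative) step size.
    L = sum(D[i][j] ** 2 for i in range(n_rows) for j in range(n_atoms))
    a = [0.0] * n_atoms
    for _ in range(n_iter):
        residual = [x[i] - sum(D[i][j] * a[j] for j in range(n_atoms))
                    for i in range(n_rows)]
        step = [sum(D[i][j] * residual[i] for i in range(n_rows)) / L
                for j in range(n_atoms)]
        a = [soft_threshold(a[j] + step[j], lam / L) for j in range(n_atoms)]
    return a
```

Larger values of `lam` drive more coefficients exactly to zero, so a patch ends up explained by only a few dictionary atoms.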

Subject(s)

Algorithms , Cardiac Catheterization/methods , Catheter Ablation/methods , Fluoroscopy/methods , Pattern Recognition, Automated/methods , Radiographic Image Interpretation, Computer-Assisted/methods , Surgery, Computer-Assisted/methods , Artificial Intelligence , Cardiac Catheterization/instrumentation , Cardiac Catheters , Catheter Ablation/instrumentation , Humans , Radiographic Image Enhancement/methods , Reproducibility of Results , Sensitivity and Specificity
