1.
Med Image Anal ; 72: 102100, 2021 Aug.
Article in English | MEDLINE | ID: mdl-34102478

ABSTRACT

Colonoscopy is the gold standard for pre-cancerous polyps screening and treatment. The polyp detection rate is highly tied to the percentage of surveyed colonic surface. However, current colonoscopy technique cannot guarantee that all the colonic surface is well examined because of incomplete camera orientations and of occlusions. The missing regions can hardly be noticed in a continuous first-person perspective. Therefore, a useful contribution would be an automatic system that can compute missing regions from an endoscopic video in real-time and alert the endoscopists when a large missing region is detected. We present a novel method that reconstructs dense chunks of a 3D colon in real time, leaving the unsurveyed part unreconstructed. The method combines a standard SLAM system with a depth and pose prediction network to achieve much more robust tracking and less drift. It addresses the difficulties for colonoscopic images of existing simultaneous localization and mapping (SLAM) systems and end-to-end deep learning methods.


Subjects
Colonic Polyps , Colonoscopy , Colon/diagnostic imaging , Colonic Polyps/diagnostic imaging , Humans
3.
Laryngoscope ; 129 Suppl 3: S1-S11, 2019 Oct.
Article in English | MEDLINE | ID: mdl-31260127

ABSTRACT

OBJECTIVES/HYPOTHESIS: Augmented reality (AR) allows for the addition of transparent virtual images and video to one's view of a physical environment. Our objective was to develop a head-worn, AR system for accurate, intraoperative localization of pathology and normal anatomic landmarks during open head and neck surgery. STUDY DESIGN: Face validity and case study. METHODS: A protocol was developed for the creation of three-dimensional (3D) virtual models based on computed tomography scans. Using the HoloLens AR platform, a novel system of registration and tracking was developed. Accuracy was determined in relation to actual physical landmarks. A face validity study was then performed in which otolaryngologists were asked to evaluate the technology and perform a simulated surgical task using AR image guidance. A case study highlighting the potential usefulness of the technology is also presented. RESULTS: An AR system was developed for intraoperative 3D visualization and localization. The average error in measurement of accuracy was 2.47 ± 0.46 millimeters (1.99, 3.30). The face validity study supports the potential of this system to improve safety and efficiency in open head and neck surgical procedures. CONCLUSIONS: An AR system for accurate localization of pathology and normal anatomic landmarks of the head and neck is feasible with current technology. A face validity study reveals the potential value of the system in intraoperative image guidance. This application of AR, among others in the field of otolaryngology-head and neck surgery, promises to improve surgical efficiency and patient safety in the operating room. LEVEL OF EVIDENCE: 2b Laryngoscope, 129:S1-S11, 2019.


Subjects
Three-Dimensional Imaging/methods , Otolaryngology/methods , Otorhinolaryngologic Surgical Procedures/methods , X-Ray Computed Tomography/methods , Virtual Reality , Anatomic Landmarks/surgery , Computer Simulation , Feasibility Studies , Humans
4.
IEEE Trans Vis Comput Graph ; 24(11): 2993-3004, 2018 Nov.
Article in English | MEDLINE | ID: mdl-30207957

ABSTRACT

We propose a new approach for 3D reconstruction of dynamic indoor and outdoor scenes in everyday environments, leveraging only cameras worn by a user. This approach allows 3D reconstruction of experiences at any location and virtual tours from anywhere. The key innovation of the proposed ego-centric reconstruction system is to capture the wearer's body pose and facial expression from near-body views, e.g. cameras on the user's glasses, and to capture the surrounding environment using outward-facing views. The main challenge of the ego-centric reconstruction, however, is the poor coverage of the near-body views - that is, the user's body and face are observed from vantage points that are convenient for wear but inconvenient for capture. To overcome these challenges, we propose a parametric-model-based approach to user motion estimation. This approach utilizes convolutional neural networks (CNNs) for near-view body pose estimation, and we introduce a CNN-based approach for facial expression estimation that combines audio and video. For each time-point during capture, the intermediate model-based reconstructions from these systems are used to re-target a high-fidelity pre-scanned model of the user. We demonstrate that the proposed self-sufficient, head-worn capture system is capable of reconstructing the wearer's movements and their surrounding environment in both indoor and outdoor situations without any additional views. As a proof of concept, we show how the resulting 3D-plus-time reconstruction can be immersively experienced within a virtual reality system (e.g., the HTC Vive). We expect that the size of the proposed egocentric capture-and-reconstruction system will eventually be reduced to fit within future AR glasses, and will be widely useful for immersive 3D telepresence, virtual tours, and general use-anywhere 3D content creation.


Subjects
Facial Expression , Three-Dimensional Imaging/methods , Posture/physiology , User-Computer Interface , Video Recording/methods , Humans , Internet , Neural Networks (Computer)
5.
IEEE Trans Pattern Anal Mach Intell ; 40(9): 2223-2237, 2018 Sep.
Article in English | MEDLINE | ID: mdl-28841551

ABSTRACT

We target the problem of sparse 3D reconstruction of dynamic objects observed by multiple unsynchronized video cameras with unknown temporal overlap. To this end, we develop a framework to recover the unknown structure without sequencing information across video sequences. Our proposed compressed sensing framework poses the estimation of 3D structure as the problem of dictionary learning, where the dictionary is defined as an aggregation of the temporally varying 3D structures. Given the smooth motion of dynamic objects, we observe any element in the dictionary can be well approximated by a sparse linear combination of other elements in the same dictionary (i.e., self-expression). Our formulation optimizes a biconvex cost function that leverages a compressed sensing formulation and enforces both structural dependency coherence across video streams, as well as motion smoothness across estimates from common video sources. We further analyze the reconstructability of our approach under different capture scenarios, and its comparison and relation to existing methods. Experimental results on large amounts of synthetic data as well as real imagery demonstrate the effectiveness of our approach.
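The self-expression property mentioned in this abstract can be written compactly. The block below is only a generic sketch of sparse self-expression, not the paper's actual biconvex cost function; the symbols $D$, $d_i$, $W$, $w_{ji}$ and $\lambda$ are assumptions introduced here for illustration.

```latex
% Assumed notation: D = [d_1, ..., d_n] is the dictionary whose columns are
% the 3D structures at n time instants; W holds the combination weights.
\begin{align}
  d_i &\approx \sum_{j \neq i} w_{ji}\, d_j
      && \text{(each element is a sparse combination of the others)} \\
  \min_{W}\; & \lVert D - D W \rVert_F^2 + \lambda \lVert W \rVert_1
      \quad \text{s.t.} \quad \operatorname{diag}(W) = 0
      && \text{(generic sparse self-expression objective)}
\end{align}
```

The zero-diagonal constraint rules out the trivial solution in which every element represents itself, and the $\ell_1$ term encourages each element to be explained by only a few others, which is plausible when motion is smooth and neighbouring time instants are similar.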

6.
IEEE Trans Vis Comput Graph ; 22(11): 2358-67, 2016 Nov.
Article in English | MEDLINE | ID: mdl-27479967

ABSTRACT

To maintain a reliable registration of the virtual world with the real world, augmented reality (AR) applications require highly accurate, low-latency tracking of the device. In this paper, we propose a novel method for performing this fast 6-DOF head pose tracking using a cluster of rolling shutter cameras. The key idea is that a rolling shutter camera works by capturing the rows of an image in rapid succession, essentially acting as a high-frequency 1D image sensor. By integrating multiple rolling shutter cameras on the AR device, our tracker is able to perform 6-DOF markerless tracking in a static indoor environment with minimal latency. Compared to state-of-the-art tracking systems, this tracking approach performs at significantly higher frequency, and it works in generalized environments. To demonstrate the feasibility of our system, we present thorough evaluations on synthetically generated data with tracking frequencies reaching 56.7 kHz. We further validate the method's accuracy on real-world images collected from a prototype of our tracking system against ground truth data using standard commodity GoPro cameras capturing at 120 Hz frame rate.
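The idea that a rolling shutter camera behaves like a high-frequency 1D sensor comes down to simple arithmetic, sketched below. This is an illustration only: it assumes rows are read out evenly across the whole frame period (no blanking interval), and the 720-row sensor height is a hypothetical value, not one taken from the abstract. Only the 120 Hz frame rate comes from the text.

```python
def row_sampling_rate_hz(frame_rate_hz: float, rows_per_frame: int) -> float:
    """Rows captured per second, assuming evenly spaced row readout."""
    return frame_rate_hz * rows_per_frame


def row_interval_us(frame_rate_hz: float, rows_per_frame: int) -> float:
    """Time between consecutive row captures, in microseconds."""
    return 1e6 / row_sampling_rate_hz(frame_rate_hz, rows_per_frame)


# 120 Hz is the frame rate quoted in the abstract; 720 rows is hypothetical.
rate = row_sampling_rate_hz(120.0, 720)
interval = row_interval_us(120.0, 720)
print(f"{rate:.0f} rows/s, one row every {interval:.2f} microseconds")
```

Under these assumptions each row is an independent 1D measurement arriving tens of thousands of times per second, which is why per-row tracking can run far faster than the nominal frame rate.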

7.
IEEE Trans Vis Comput Graph ; 20(2): 262-75, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24356368

ABSTRACT

Given the growth of Internet photo collections, we now have a visual index of all major cities and tourist sites in the world. However, it is still a difficult task to capture that perfect shot with your own camera when visiting these places, especially when your camera itself has limitations, such as a limited field of view. In this paper, we propose a framework to overcome the imperfections of personal photographs of tourist sites using the rich information provided by large-scale Internet photo collections. Our method deploys state-of-the-art techniques for constructing initial 3D models from photo collections. The same techniques are then used to register personal photographs to these models, allowing us to augment personal 2D images with 3D information. This strong available scene prior allows us to address a number of traditionally challenging image enhancement techniques and achieve high-quality results using simple and robust algorithms. Specifically, we demonstrate automatic foreground segmentation, mono-to-stereo conversion, field-of-view expansion, photometric enhancement, and additionally automatic annotation with geolocation and tags. Our method clearly demonstrates some possible benefits of employing the rich information contained in online photo databases to efficiently enhance and augment one's own personal photographs.

8.
IEEE Trans Pattern Anal Mach Intell ; 35(8): 2022-38, 2013 Aug.
Article in English | MEDLINE | ID: mdl-23787350

ABSTRACT

A computational problem that arises frequently in computer vision is that of estimating the parameters of a model from data that have been contaminated by noise and outliers. More generally, any practical system that seeks to estimate quantities from noisy data measurements must have at its core some means of dealing with data contamination. The random sample consensus (RANSAC) algorithm is one of the most popular tools for robust estimation. Recent years have seen an explosion of activity in this area, leading to the development of a number of techniques that improve upon the efficiency and robustness of the basic RANSAC algorithm. In this paper, we present a comprehensive overview of recent research in RANSAC-based robust estimation by analyzing and comparing various approaches that have been explored over the years. We provide a common context for this analysis by introducing a new framework for robust estimation, which we call Universal RANSAC (USAC). USAC extends the simple hypothesize-and-verify structure of standard RANSAC to incorporate a number of important practical and computational considerations. In addition, we provide a general-purpose C++ software library that implements the USAC framework by leveraging state-of-the-art algorithms for the various modules. This implementation thus addresses many of the limitations of standard RANSAC within a single unified package. We benchmark the performance of the algorithm on a large collection of estimation problems. The implementation we provide can be used by researchers either as a stand-alone tool for robust estimation or as a benchmark for evaluating new techniques.
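The hypothesize-and-verify structure of standard RANSAC that this abstract refers to can be sketched in a few lines. This is a minimal illustration of basic RANSAC for 2D line fitting, not the USAC framework or its C++ library. The function names, the toy data, and the parameter values (`n_iters`, `threshold`, `seed`) are assumptions chosen for the example.

```python
import random


def fit_line(p, q):
    """Return (slope, intercept) through two points, or None if vertical."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        return None
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1


def ransac_line(points, n_iters=200, threshold=0.5, seed=0):
    """Basic RANSAC: hypothesize from a minimal sample, verify by inlier count."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(n_iters):
        # Hypothesize: fit a line to a random minimal sample of two points.
        model = fit_line(*rng.sample(points, 2))
        if model is None:
            continue
        slope, intercept = model
        # Verify: count points whose vertical residual is within the threshold.
        inliers = [(x, y) for x, y in points
                   if abs(y - (slope * x + intercept)) <= threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers


# Toy data: 20 points on y = 2x + 1 plus 4 gross outliers.
points = [(float(x), 2.0 * x + 1.0) for x in range(20)]
points += [(3.0, 40.0), (7.0, -25.0), (12.0, 90.0), (15.0, -60.0)]

model, inliers = ransac_line(points)
print(model, len(inliers))
```

A fixed iteration count and a plain inlier count are the simplest possible choices. The refinements the abstract alludes to typically replace exactly these pieces, for example with adaptive stopping criteria or smarter sampling.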

9.
Med Image Anal ; 16(1): 160-76, 2012 Jan.
Article in English | MEDLINE | ID: mdl-21920798

ABSTRACT

Specialists often need to browse through libraries containing many diagnostic hysteroscopy videos searching for similar cases, or even to review the video of one particular case. Video searching and browsing can be used in many situations, like in case-based diagnosis when videos of previously diagnosed cases are compared, in case referrals, in reviewing the patient records, as well as for supporting medical research (e.g. in human reproduction). However, in terms of visual content, diagnostic hysteroscopy videos contain lots of information, but only a reduced number of frames are actually useful for diagnosis/prognosis purposes. In order to facilitate the browsing task, we propose in this paper a technique for estimating the clinical relevance of video segments in diagnostic hysteroscopies. Basically, the proposed technique associates clinical relevance with the attention attracted by a diagnostic hysteroscopy video segment during the video acquisition (i.e. during the diagnostic hysteroscopy conducted by a specialist). We show that the resulting video summary allows specialists to browse the video contents nonlinearly, while avoiding spending time on spurious visual information. In this work, we review state-of-art methods for summarizing general videos and how they apply to diagnostic hysteroscopy videos (considering their specific characteristics), and conclude that our proposed method contributes to the field with a summarization and representation method specific for video hysteroscopies. The experimental results indicate that our method tends to produce compact video summaries without discarding clinically relevant information.


Subjects
Attention , Data Mining/methods , Hysteroscopy/methods , Computer-Assisted Image Interpretation/methods , Radiology Information Systems , User-Computer Interface , Video Recording/methods , Database Management Systems , Female , Humans
10.
Article in English | MEDLINE | ID: mdl-25729263

ABSTRACT

Convex and continuous energy formulations for low level vision problems enable efficient search procedures for the corresponding globally optimal solutions. In this work we extend the well-established continuous, isotropic capacity-based maximal flow framework to the anisotropic setting. By using powerful results from convex analysis, a very simple and efficient minimization procedure is derived. Further, we show that many important properties carry over to the new anisotropic framework, e.g. globally optimal binary results can be achieved simply by thresholding the continuous solution. In addition, we unify the anisotropic continuous maximal flow approach with a recently proposed convex and continuous formulation for Markov random fields, thereby allowing more general smoothness priors to be incorporated. Dense stereo results are included to illustrate the capabilities of the proposed approach.
