Results 1 - 20 of 29
1.
IEEE Trans Pattern Anal Mach Intell ; 46(5): 3321-3333, 2024 May.
Article in English | MEDLINE | ID: mdl-38096092

ABSTRACT

Uncertainty quantification for inverse problems in imaging has drawn much attention lately. Existing approaches to this task define uncertainty regions based on probable values per pixel while ignoring spatial correlations within the image, resulting in an exaggerated volume of uncertainty. In this paper, we propose PUQ (Principal Uncertainty Quantification) - a novel definition and corresponding analysis of uncertainty regions that takes spatial relationships within the image into account, thus providing reduced-volume regions. Using recent advancements in generative models, we derive uncertainty intervals around principal components of the empirical posterior distribution, forming an ambiguity region that guarantees the inclusion of true unseen values with a user-defined confidence probability. To improve computational efficiency and interpretability, we also guarantee the recovery of true unseen values using only a few principal directions, resulting in more informative uncertainty regions. Our approach is verified through experiments on image colorization, super-resolution, and inpainting; its effectiveness is shown through comparison to baseline methods, demonstrating significantly tighter uncertainty regions.
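The core idea of the abstract - intervals along principal components of the posterior are much tighter than per-pixel intervals when pixels are correlated - can be sketched as follows. This is an illustrative toy, not the authors' implementation; the function name, the sample counts, and the confidence level are invented for the example.

```python
import numpy as np

def principal_uncertainty_intervals(samples, k=2, alpha=0.1):
    """samples: (n, d) array of posterior draws. Returns the sample mean,
    the top-k principal directions, and per-direction (lo, hi) quantile
    intervals covering 1 - alpha of the projected mass."""
    mean = samples.mean(axis=0)
    centered = samples - mean
    # Principal directions via SVD of the centered sample matrix.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    dirs = vt[:k]                      # (k, d) orthonormal directions
    proj = centered @ dirs.T           # (n, k) projection coefficients
    lo = np.quantile(proj, alpha / 2, axis=0)
    hi = np.quantile(proj, 1 - alpha / 2, axis=0)
    return mean, dirs, lo, hi

rng = np.random.default_rng(0)
# Toy correlated 2-pixel "posterior": mass concentrated along one direction.
samples = rng.normal(size=(5000, 1)) @ np.array([[1.0, 0.95]]) \
          + 0.05 * rng.normal(size=(5000, 2))
mean, dirs, lo, hi = principal_uncertainty_intervals(samples, k=2)
pc_volume = np.prod(hi - lo)
# Naive per-pixel box at the same coverage, for comparison.
pixel_volume = np.prod(np.quantile(samples, 0.95, axis=0)
                       - np.quantile(samples, 0.05, axis=0))
```

For this correlated toy posterior, the box aligned with the principal directions has a far smaller volume than the axis-aligned per-pixel box, which is the effect the abstract describes.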

2.
Sensors (Basel) ; 22(24)2022 Dec 14.
Article in English | MEDLINE | ID: mdl-36560175

ABSTRACT

This paper considers the problem of finding a landing spot for a drone in a dense urban environment. The conflicting requirements of fast exploration and high resolution are solved using a multi-resolution approach, by which visual information is collected by the drone at decreasing altitudes so that the spatial resolution of the acquired images increases monotonically. A probability distribution is used to capture the uncertainty of the decision process for each terrain patch. The distributions are updated as information from different altitudes is collected. When the confidence level for one of the patches becomes larger than a prespecified threshold, suitability for landing is declared. One of the main building blocks of the approach is a semantic segmentation algorithm that attaches probabilities to each pixel of a single view. The decision algorithm combines these probabilities with a priori data and previous measurements to obtain the best estimates. Feasibility is illustrated by presenting several examples generated by a realistic closed-loop simulator.


Subject(s)
Algorithms , Altitude , Uncertainty , Probability
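The per-patch decision process described in this abstract - a probability updated with evidence from decreasing altitudes until a confidence threshold is crossed - can be sketched as a binary Bayes update. This is an illustrative sketch, not the authors' system; the vote accuracies and the threshold are made-up numbers.

```python
def update_patch(p_safe, vote_safe, accuracy):
    """Binary Bayes update for one terrain patch: `vote_safe` is the
    segmentation verdict for the patch at this altitude, `accuracy` the
    assumed probability that the verdict is correct."""
    like_safe = accuracy if vote_safe else 1 - accuracy
    like_unsafe = 1 - accuracy if vote_safe else accuracy
    num = like_safe * p_safe
    return num / (num + like_unsafe * (1 - p_safe))

THRESHOLD = 0.99
p = 0.5  # uninformative prior for one patch
# Votes collected while descending; accuracy grows as resolution increases.
for vote, acc in [(True, 0.7), (True, 0.8), (True, 0.9), (True, 0.95)]:
    p = update_patch(p, vote, acc)
    if p > THRESHOLD:
        break  # confidence reached: declare the patch suitable for landing
```

Four consistent "safe" votes of increasing reliability push the posterior past the 0.99 threshold, at which point landing suitability would be declared for the patch.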
3.
Surg Endosc ; 36(12): 9215-9223, 2022 12.
Article in English | MEDLINE | ID: mdl-35941306

ABSTRACT

BACKGROUND: The potential role and benefits of artificial intelligence (AI) in surgery have yet to be determined. This study is a first step in developing an AI system for minimizing adverse events and improving patient safety. We developed an AI algorithm and evaluated its performance in recognizing surgical phases of laparoscopic cholecystectomy (LC) videos spanning a range of complexities. METHODS: A set of 371 LC videos with various complexity levels and containing adverse events was collected from five hospitals. Two expert surgeons segmented each video into 10 phases, including Calot's triangle dissection and clipping and cutting. For each video, adverse events (major bleeding; gallbladder perforation; major bile leakage; and incidental finding) were annotated when present, and the complexity level (on a scale of 1-5) was recorded. The dataset was then split in an 80:20 ratio (294 and 77 videos), stratified by complexity, hospital, and adverse events, to train and test the AI model, respectively. The AI-surgeon agreement was then compared to the agreement between surgeons. RESULTS: The mean accuracy of the AI model for surgical phase recognition was 89% [95% CI 87.1%, 90.6%], comparable to the mean inter-annotator agreement of 90% [95% CI 89.4%, 90.5%]. The model's accuracy was inversely associated with procedure complexity, decreasing from 92% (complexity level 1) to 88% (complexity level 3) to 81% (complexity level 5). CONCLUSION: The AI model successfully identified surgical phases in both simple and complex LC procedures. Further validation and system training are warranted to evaluate its potential applications, such as increasing patient safety during surgery.


Subject(s)
Cholecystectomy, Laparoscopic , Gallbladder Diseases , Humans , Cholecystectomy, Laparoscopic/methods , Artificial Intelligence , Gallbladder Diseases/surgery , Dissection
4.
Gastrointest Endosc ; 94(6): 1099-1109.e10, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34216598

ABSTRACT

BACKGROUND AND AIMS: Colorectal cancer is a leading cause of death. Colonoscopy is the criterion standard for detection and removal of precancerous lesions and has been shown to reduce mortality. The polyp miss rate during colonoscopies is 22% to 28%. DEEP DEtection of Elusive Polyps (DEEP2) is a new polyp detection system based on deep learning that alerts the operator in real time to the presence and location of polyps. The primary outcome was the performance of DEEP2 on the detection of elusive polyps. METHODS: The DEEP2 system was trained on 3611 hours of colonoscopy videos derived from 2 sources and was validated on a set comprising 1393 hours from a third unrelated source. Ground truth labeling was provided by offline gastroenterologist annotators who were able to watch the video in slow motion and pause and rewind as required. To assess applicability, stability, and user experience and to obtain some preliminary data on performance in a real-life scenario, a preliminary prospective clinical validation study was performed comprising 100 procedures. RESULTS: DEEP2 achieved a sensitivity of 97.1% at 4.6 false alarms per video for all polyps and of 88.5% and 84.9% for polyps in the field of view for less than 5 and 2 seconds, respectively. DEEP2 was able to detect polyps not seen by live real-time endoscopists or offline annotators in an average of 0.22 polyps per sequence. In the clinical validation study, the system detected an average of 0.89 additional polyps per procedure. No adverse events occurred. CONCLUSIONS: DEEP2 has a high sensitivity for polyp detection and was effective in increasing the detection of polyps both in colonoscopy videos and in real procedures with a low number of false alarms. (Clinical trial registration number: NCT04693078.)


Subject(s)
Adenomatous Polyps , Colonic Polyps , Colorectal Neoplasms , Artificial Intelligence , Colonic Polyps/diagnosis , Colonoscopy , Colorectal Neoplasms/diagnosis , Humans , Prospective Studies
5.
IEEE Trans Med Imaging ; 39(11): 3451-3462, 2020 11.
Article in English | MEDLINE | ID: mdl-32746092

ABSTRACT

Colonoscopy is the tool of choice for preventing colorectal cancer, by detecting and removing polyps before they become cancerous. However, colonoscopy is hampered by the fact that endoscopists routinely miss 22-28% of polyps. While some of these missed polyps appear in the endoscopist's field of view, others are missed simply because of substandard coverage of the procedure, i.e., not all of the colon is seen. This paper attempts to rectify the problem of substandard coverage in colonoscopy through the introduction of the C2D2 (Colonoscopy Coverage Deficiency via Depth) algorithm, which detects deficient coverage and can thereby alert the endoscopist to revisit a given area. More specifically, C2D2 consists of two separate algorithms: the first performs depth estimation of the colon given an ordinary RGB video stream, while the second computes coverage given these depth estimates. Rather than compute coverage for the entire colon, our algorithm computes coverage locally, on a segment-by-segment basis; C2D2 can then indicate in real time whether a particular area of the colon has suffered from deficient coverage, and if so the endoscopist can return to that area. Our coverage algorithm is the first such algorithm to be evaluated in a large-scale way, while our depth estimation technique is the first calibration-free unsupervised method applied to colonoscopies. The C2D2 algorithm achieves state-of-the-art results in the detection of deficient coverage. On synthetic sequences with ground truth, it is 2.4 times more accurate than human experts; on real sequences, C2D2 achieves 93.0% agreement with experts.


Subject(s)
Colonic Neoplasms , Colonic Polyps , Algorithms , Colonic Polyps/diagnostic imaging , Colonoscopy , Humans
6.
Sensors (Basel) ; 19(3)2019 Feb 02.
Article in English | MEDLINE | ID: mdl-30717361

ABSTRACT

This paper presents a global monocular indoor positioning system for a robotic vehicle starting from a known pose. The proposed system does not depend on a dense 3D map, require prior environment exploration or installation, or rely on the scene remaining the same, photometrically or geometrically. The approach presents a new way of providing global positioning relying on the sparse knowledge of the building floorplan by utilizing special algorithms to resolve the unknown scale through wall-plane association. The proposed Wall Plane Fusion algorithm finds correspondences between walls of the floorplan and planar structures present in the 3D point cloud. In order to extract planes from point clouds that contain scale ambiguity, the Scale Invariant Planar RANSAC (SIPR) algorithm was developed. The best wall-plane correspondence is used as an external constraint to a custom Bundle Adjustment optimization which refines the motion estimation solution and enforces a global scale solution. A necessary condition is that only one wall needs to be in view. The feasibility of using the algorithms is tested with synthetic and real-world data; extensive testing is performed in an indoor simulation environment using the Unreal Engine and Microsoft AirSim. The system performs consistently across all three types of data. The tests presented in this paper show that the standard deviation of the error did not exceed 6 cm.
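The plane-extraction primitive this abstract relies on can be sketched as a RANSAC plane fit whose inlier residual is normalized by point depth, so the test is unchanged if the whole cloud is multiplied by an unknown scale factor. This is a minimal illustrative sketch only; the paper's actual SIPR algorithm and its scale handling are more involved, and the tolerance and iteration count below are invented.

```python
import numpy as np

def ransac_plane(points, iters=200, rel_tol=0.01, rng=None):
    """Fit a plane n.x + d = 0 to a cloud with a relative (depth-normalized,
    hence scale-invariant) inlier test. Returns (n, d) and the inlier mask."""
    rng = rng if rng is not None else np.random.default_rng(0)
    depths = np.linalg.norm(points, axis=1)
    best_inliers, best_model = None, None
    for _ in range(iters):
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        nn = np.linalg.norm(n)
        if nn < 1e-12:
            continue  # degenerate (collinear) sample
        n = n / nn
        d = -n @ p0
        dist = np.abs(points @ n + d)
        inliers = dist / depths < rel_tol   # residual relative to depth
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers, best_model = inliers, (n, d)
    return best_model, best_inliers

rng = np.random.default_rng(1)
# Synthetic wall: the plane z = 2 plus small noise, with 20% outliers.
xy = rng.uniform(-1, 1, size=(400, 2))
wall = np.column_stack([xy, 2.0 + 0.002 * rng.normal(size=400)])
cloud = np.vstack([wall, rng.uniform(-1, 3, size=(100, 3))])
(n, d), inliers = ransac_plane(cloud)
```

The recovered normal is (up to sign) the z-axis, and the inlier set is dominated by the wall points; scaling `cloud` by any positive constant leaves the inlier mask unchanged, which is the property the scale-ambiguous setting needs.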

7.
IEEE Trans Pattern Anal Mach Intell ; 39(2): 411-416, 2017 02.
Article in English | MEDLINE | ID: mdl-27019475

ABSTRACT

Sparse and redundant representations, in which signals are modeled as a combination of a few atoms from an overcomplete dictionary, are increasingly used in many image processing applications, such as denoising, super-resolution, and classification. One common problem is learning a "good" dictionary for different tasks. In the classification task the aim is to learn a dictionary that also takes training labels into account, and indeed there exist several approaches to this problem. One well-known technique is D-KSVD, which jointly learns a dictionary and a linear classifier using the K-SVD algorithm. LC-KSVD is a recent variation intended to further improve on this idea by adding an explicit label consistency term to the optimization problem, so that different classes are represented by different dictionary atoms. In this work we prove that, under identical initialization conditions, LC-KSVD with uniform atom allocation is in fact a reformulation of D-KSVD: given the regularization parameters of LC-KSVD, we give a closed-form expression for the equivalent D-KSVD regularization parameter, assuming the LC-KSVD's initialization scheme is used. We confirm this by reproducing several of the original LC-KSVD experiments.

8.
J Exp Biol ; 218(Pt 13): 2097-105, 2015 Jul.
Article in English | MEDLINE | ID: mdl-26157161

ABSTRACT

Chameleons perform large-amplitude eye movements that are frequently referred to as independent, or disconjugate. When prey (an insect) is detected, the chameleon's eyes converge to view it binocularly and 'lock' in their sockets so that subsequent visual tracking is by head movements. However, the extent of the eyes' independence is unclear. For example, can a chameleon visually track two small targets simultaneously and monocularly, i.e. one with each eye? This is of special interest because eye movements in ectotherms and birds are frequently independent, with optic nerves that are fully decussated and intertectal connections that are not as developed as in mammals. Here, we demonstrate that chameleons presented with two small targets moving in opposite directions can perform simultaneous, smooth, monocular, visual tracking. To our knowledge, this is the first demonstration of such a capacity. The fine patterns of the eye movements in monocular tracking were composed of alternating, longer, 'smooth' phases and abrupt 'step' events, similar to smooth pursuits and saccades. Monocular tracking differed significantly from binocular tracking with respect to both 'smooth' phases and 'step' events. We suggest that in chameleons, eye movements are not simply 'independent'. Rather, at the gross level, eye movements are (i) disconjugate during scanning, (ii) conjugate during binocular tracking and (iii) disconjugate, but coordinated, during monocular tracking. At the fine level, eye movements are disconjugate in all cases. These results support the view that in vertebrates, basic monocular control is under a higher level of regulation that dictates the eyes' level of coordination according to context.


Subject(s)
Eye Movements/physiology , Lizards/physiology , Vision, Monocular , Animals , Predatory Behavior/physiology , Psychomotor Performance , Pursuit, Smooth/physiology , Saccades/physiology
9.
Comput Med Imaging Graph ; 43: 150-64, 2015 Jul.
Article in English | MEDLINE | ID: mdl-25804442

ABSTRACT

In this paper, we introduce a novel method for detection and segmentation of crypts in colon biopsies. Most of the approaches proposed in the literature try to segment the crypts using only the biopsy image without understanding the meaning of each pixel. The proposed method differs in that we segment the crypts using an automatically generated pixel-level classification image of the original biopsy image and handle the artifacts due to the sectioning process and variance in color, shape and size of the crypts. The biopsy image pixels are classified to nuclei, immune system, lumen, cytoplasm, stroma and goblet cells. The crypts are then segmented using a novel active contour approach, where the external force is determined by the semantics of each pixel and the model of the crypt. The active contour is applied for every lumen candidate detected using the pixel-level classification. Finally, a false positive crypt elimination process is performed to remove segmentation errors. This is done by measuring their adherence to the crypt model using the pixel level classification results. The method was tested on 54 biopsy images containing 4944 healthy and 2236 cancerous crypts, resulting in 87% detection of the crypts with 9% of false positive segments (segments that do not represent a crypt). The segmentation accuracy of the true positive segments is 96%.


Subject(s)
Colonic Neoplasms/pathology , Histological Techniques , Image Processing, Computer-Assisted/methods , Pattern Recognition, Automated/methods , Biopsy , Color , Humans , Predictive Value of Tests , Staining and Labeling
10.
IEEE Trans Pattern Anal Mach Intell ; 36(3): 620-1, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24605376

ABSTRACT

A methodology for integrating multiple cues for tracking was proposed in several papers. These papers claim that, unlike other methodologies, conditional independence of the cues is not assumed. This brief communication 1) refutes this claim and 2) points out other major problems in the methodology.

11.
IEEE Trans Pattern Anal Mach Intell ; 35(7): 1622-34, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23681991

ABSTRACT

We revisit the problem of specific object recognition using color distributions. In some applications--such as specific person identification--it is highly likely that the color distributions will be multimodal and hence contain a special structure. Although the color distribution changes under different lighting conditions, some aspects of its structure turn out to be invariants. We refer to this structure as an intradistribution structure, and show that it is invariant under a wide range of imaging conditions while being discriminative enough to be practical. Our signature uses shape context descriptors to represent the intradistribution structure. Assuming the widely used diagonal model, we validate that our signature is invariant under certain illumination changes. Experimentally, we use color information as the only cue to obtain good recognition performance on publicly available databases covering both indoor and outdoor conditions. Combining our approach with the complementary covariance descriptor, we demonstrate results exceeding the state-of-the-art performance on the challenging VIPeR and CAVIAR4REID databases.

12.
IEEE Trans Pattern Anal Mach Intell ; 34(12): 2327-40, 2012 Dec.
Article in English | MEDLINE | ID: mdl-22331857

ABSTRACT

It is quite common that multiple human observers attend to a single static interest point. This is known as a mutual awareness event (MAWE). A preferred way to monitor these situations is with a camera that captures the human observers while using existing face detection and head pose estimation algorithms. The current work studies the underlying geometric constraints of MAWEs and reformulates them in terms of image measurements. The constraints are then used in a method that 1) detects whether such an interest point does exist, 2) determines where it is located, 3) identifies who was attending to it, and 4) reports where and when each observer was while attending to it. The method is also applied to another interesting event, in which a single moving human observer fixates on a single static interest point. The method can deal with the general case of an uncalibrated camera in a general environment. This is in contrast to other work on similar problems that inherently assumes a known environment or a calibrated camera. The method was tested on about 75 images from various scenes and robustly detects MAWEs and estimates their related attributes. Most of the images were found by searching the Internet.


Subject(s)
Algorithms , Awareness/physiology , Image Processing, Computer-Assisted/methods , Posture/physiology , Signal Processing, Computer-Assisted , Social Behavior , Bayes Theorem , Databases, Factual , Face/anatomy & histology , Humans , Imaging, Three-Dimensional/methods , Mass Behavior
13.
IEEE Trans Pattern Anal Mach Intell ; 33(2): 406-11, 2011 Feb.
Article in English | MEDLINE | ID: mdl-20820078

ABSTRACT

A novel vision-based navigation algorithm is proposed. The gray levels of two images, together with a Digital Terrain Map (DTM), are directly utilized to define constraints on the navigation parameters. The feasibility of the algorithm is examined both under a simulated environment and using real flight data.

14.
Article in English | MEDLINE | ID: mdl-19964161

ABSTRACT

Spine curvature and posture are important for sustaining a healthy back. An incorrect spine configuration can add strain to muscles and put stress on the spine, leading to low back pain (LBP). We propose a new method for analyzing spine curvature in 3D using CT imaging. The proposed method is based on two novel concepts: the spine curvature is derived from the spinal canal centerline, and the curve is evaluated against a model based on healthy individuals. We show results of curvature analysis for a healthy population, pathological (scoliosis) patients, and patients with nonspecific chronic LBP.


Subject(s)
Models, Anatomic , Radiographic Image Interpretation, Computer-Assisted/methods , Spinal Curvatures/diagnostic imaging , Spinal Curvatures/pathology , Spine/diagnostic imaging , Spine/pathology , Tomography, X-Ray Computed/methods , Algorithms , Computer Simulation , Female , Humans , Imaging, Three-Dimensional/methods , Male , Radiographic Image Enhancement/methods , Reproducibility of Results , Sensitivity and Specificity
15.
IEEE Trans Pattern Anal Mach Intell ; 31(9): 1708-14, 2009 Sep.
Article in English | MEDLINE | ID: mdl-19574629

ABSTRACT

We consider curve evolution based on comparing distributions of features, and its applications for scene segmentation. In the first part, we promote using cross-bin metrics such as the Earth Mover's Distance (EMD), instead of standard bin-wise metrics such as the Bhattacharyya or Kullback-Leibler metrics. To derive flow equations for minimizing functionals involving the EMD, we employ a tractable expression for calculating EMD between one-dimensional distributions. We then apply the derived flows to various examples of single image segmentation, and to scene analysis using video data. In the latter, we consider the problem of segmenting a scene into spatial regions in which different activities occur. We use a nonparametric local representation of the regions by considering multiple one-dimensional histograms of normalized spatiotemporal derivatives. We then obtain semisupervised segmentation of regions using the flows derived in the first part of the paper. Our results are demonstrated on challenging surveillance scenes, and compare favorably with state-of-the-art results using parametric representations by dynamic systems or mixtures of them.


Subject(s)
Algorithms , Artificial Intelligence , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Pattern Recognition, Automated/methods , Reproducibility of Results , Sensitivity and Specificity
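The tractable one-dimensional EMD expression mentioned in this abstract is the L1 distance between cumulative histograms; a quick illustrative sketch (not the paper's code):

```python
import numpy as np

def emd_1d(p, q):
    """Earth Mover's Distance between two 1-D histograms on the same bins:
    the L1 distance between their (normalized) cumulative sums."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p, q = p / p.sum(), q / q.sum()   # normalize to probability distributions
    return np.abs(np.cumsum(p) - np.cumsum(q)).sum()

# Unlike bin-wise metrics, EMD grows with how far the mass must travel:
near = emd_1d([1, 0, 0, 0], [0, 1, 0, 0])   # mass moves one bin
far  = emd_1d([1, 0, 0, 0], [0, 0, 0, 1])   # mass moves three bins
```

Here `near` is 1 and `far` is 3, even though a bin-wise metric (e.g. L1 between the raw histograms) would score the two pairs identically; this cross-bin sensitivity is the property the paper exploits.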
16.
IEEE Trans Med Imaging ; 28(8): 1317-24, 2009 Aug.
Article in English | MEDLINE | ID: mdl-19258197

ABSTRACT

In this paper, we address the problem of fully automated decomposition of hyperspectral images for transmission light microscopy. The hyperspectral images are decomposed into spectrally homogeneous compounds. The resulting compounds are described by their spectral characteristics and optical density. We present the multiplicative physical model of image formation in transmission light microscopy, justify the reduction of the hyperspectral image decomposition problem to a blind source separation problem, and provide a method for hyperspectral restoration of the separated compounds. In our approach, dimensionality reduction using principal component analysis (PCA) is followed by a blind source separation (BSS) algorithm. The BSS method is based on a sparsifying transformation of the observed images and a relative Newton optimization procedure. The presented method was verified on hyperspectral images of biological tissues. The method was compared to the existing approach based on nonnegative matrix factorization. Experiments showed that the presented method is faster and better separates the biological compounds from imaging artifacts. The results obtained in this work may be used for improving automatic microscope hardware calibration and computer-aided diagnostics.


Subject(s)
Image Processing, Computer-Assisted/methods , Microscopy/methods , Algorithms , Animals , Arabinose/chemistry , Hematoxylin/chemistry , Imino Furanoses/chemistry , Light , Mice , Myocardium/cytology , Principal Component Analysis , Sugar Alcohols/chemistry
17.
IEEE Trans Pattern Anal Mach Intell ; 31(2): 193-209, 2009 Feb.
Article in English | MEDLINE | ID: mdl-19110488

ABSTRACT

Resolution of different types of loops in handwritten script presents a difficult task and is an important step in many classic word recognition systems, writer modeling, and signature verification. When processing a handwritten script, a great deal of ambiguity occurs when strokes overlap, merge, or intersect. This paper presents a novel loop modeling and contour-based handwriting analysis that improves loop investigation. We show excellent results on various loop resolution scenarios, including axial loop understanding and collapsed loop recovery. We demonstrate our approach for loop investigation on several realistic data sets of static binary images and compare with the ground truth of the genuine online signal.


Subject(s)
Algorithms , Artificial Intelligence , Electronic Data Processing/methods , Handwriting , Image Interpretation, Computer-Assisted/methods , Information Storage and Retrieval/methods , Pattern Recognition, Automated/methods , Computer Graphics , Documentation , Humans , Image Enhancement/methods , Numerical Analysis, Computer-Assisted , Online Systems , Reproducibility of Results , Sensitivity and Specificity , Signal Processing, Computer-Assisted , Subtraction Technique , User-Computer Interface
18.
IEEE Trans Pattern Anal Mach Intell ; 31(1): 164-71, 2009 Jan.
Article in English | MEDLINE | ID: mdl-19029554

ABSTRACT

Kernel-based trackers aggregate image features within the support of a kernel (a mask) regardless of their spatial structure. These trackers spatially fit the kernel (usually in location and in scale) such that a function of the aggregate is optimized. We propose a kernel-based visual tracker that exploits the constancy of color and the presence of color edges along the target boundary. The tracker estimates the best affinity of a spatially aligned pair of kernels, one of which is color-related and the other of which is object boundary-related. In a sense, this work extends previous kernel-based trackers by incorporating the object boundary cue into the tracking process and by allowing the kernels to be affinely transformed instead of only translated and isotropically scaled. These two extensions make for more precise target localization. A more accurately localized target also facilitates safer updating of its reference color model, further enhancing the tracker's robustness. The improved tracking is demonstrated for several challenging image sequences.


Subject(s)
Algorithms , Artificial Intelligence , Color , Colorimetry/methods , Image Interpretation, Computer-Assisted/methods , Models, Theoretical , Pattern Recognition, Automated/methods , Computer Simulation , Cues
19.
IEEE Trans Pattern Anal Mach Intell ; 30(9): 1572-88, 2008 Sep.
Article in English | MEDLINE | ID: mdl-18617716

ABSTRACT

This paper addresses the problem of visual tracking under very general conditions: a possibly non-rigid target whose appearance may drastically change over time; general camera motion; a 3D scene; and no a priori information except initialization. This is in contrast to the vast majority of trackers which rely on some limited model in which, for example, the target's appearance is known a priori or restricted, the scene is planar, or a pan tilt zoom camera is used. Their goal is to achieve speed and robustness, but their limited context may cause them to fail in the more general case. The proposed tracker works by approximating, in each frame, a PDF (probability distribution function) of the target's bitmap and then estimating the maximum a posteriori bitmap. The PDF is marginalized over all possible motions per pixel, thus avoiding the stage in which optical flow is determined. This is an advantage over other general-context trackers that do not use the motion cue at all or rely on the error-prone calculation of optical flow. Using a Gibbs distribution with respect to the first-order neighborhood system yields a bitmap PDF whose maximization may be transformed into that of a quadratic pseudo-Boolean function, the maximum of which is approximated via a reduction to a maximum-flow problem. Many experiments were conducted to demonstrate that the tracker is able to track under the aforementioned general context.


Subject(s)
Algorithms , Artificial Intelligence , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Imaging, Three-Dimensional/methods , Pattern Recognition, Automated/methods , Motion , Reproducibility of Results , Sensitivity and Specificity , Signal Processing, Computer-Assisted
20.
IEEE Trans Pattern Anal Mach Intell ; 30(3): 555-60, 2008 Mar.
Article in English | MEDLINE | ID: mdl-18195449

ABSTRACT

We present a novel algorithm for detection of certain types of unusual events. The algorithm is based on multiple local monitors which collect low-level statistics. Each local monitor produces an alert if its current measurement is unusual, and these alerts are integrated to a final decision regarding the existence of an unusual event. Our algorithm satisfies a set of requirements that are critical for successful deployment of any large-scale surveillance system. In particular it requires a minimal setup (taking only a few minutes) and is fully automatic afterwards. Since it is not based on objects' tracks, it is robust and works well in crowded scenes where tracking-based algorithms are likely to fail. The algorithm is effective as soon as sufficient low-level observations representing the routine activity have been collected, which usually happens after a few minutes. Our algorithm runs in realtime. It was tested on a variety of real-life crowded scenes. A ground-truth was extracted for these scenes, with respect to which detection and false-alarm rates are reported.


Subject(s)
Artificial Intelligence , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Pattern Recognition, Automated/methods , Photography/methods , Security Measures , Video Recording/methods , Algorithms , Computer Systems , Photography/instrumentation , Reproducibility of Results , Sensitivity and Specificity , Video Recording/instrumentation
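The monitor-and-integrate scheme this abstract describes can be sketched in a few lines: each cell of a grid learns the routine distribution of a low-level measurement, flags measurements that are unusual for that cell, and the per-cell alerts are integrated into a global decision. This is an illustrative toy, not the authors' system; the grid size, z-score threshold, and integration rule are invented.

```python
import numpy as np

class LocalMonitor:
    """One grid cell: keeps a history of a low-level measurement and flags
    values far from the routine statistics."""
    def __init__(self, z_thresh=3.0):
        self.vals, self.z_thresh = [], z_thresh

    def observe(self, x):
        """Return True (alert) if x is unusual w.r.t. history, then learn x."""
        alert = False
        if len(self.vals) >= 10:   # need a minimal history before alerting
            mu, sd = np.mean(self.vals), np.std(self.vals) + 1e-9
            alert = abs(x - mu) / sd > self.z_thresh
        self.vals.append(x)
        return alert

monitors = [LocalMonitor() for _ in range(16)]   # 4x4 grid of cells
rng = np.random.default_rng(0)
# Routine phase: all cells see similar low-level activity for a while.
for _ in range(100):
    for m, x in zip(monitors, rng.normal(1.0, 0.1, 16)):
        m.observe(x)
# Unusual frame: a burst of activity in half of the cells.
frame = rng.normal(1.0, 0.1, 16)
frame[:8] += 2.0
alerts = sum(m.observe(x) for m, x in zip(monitors, frame))
event = alerts >= 4   # integrate local alerts into a global decision
```

Because no tracking is involved, each monitor's decision depends only on its own measurement history, which is what makes this style of detector usable in crowded scenes.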