Results 1 - 12 of 12
1.
Article in English | MEDLINE | ID: mdl-38502630

ABSTRACT

Humans perceive and construct the world as an arrangement of simple parametric models. In particular, we can often describe man-made environments using volumetric primitives such as cuboids or cylinders. Inferring these primitives is important for attaining high-level, abstract scene descriptions. Previous approaches for primitive-based abstraction estimate shape parameters directly and are only able to reproduce simple objects. In contrast, we propose a robust estimator for primitive fitting, which meaningfully abstracts complex real-world environments using cuboids. A RANSAC estimator guided by a neural network fits these primitives to a depth map. We condition the network on previously detected parts of the scene, parsing it one-by-one. To obtain cuboids from single RGB images, we additionally optimise a depth estimation CNN end-to-end. Naively minimising point-to-primitive distances leads to large or spurious cuboids occluding parts of the scene. We thus propose an improved occlusion-aware distance metric correctly handling opaque scenes. Furthermore, we present a neural network based cuboid solver which provides more parsimonious scene abstractions while also reducing inference time. The proposed algorithm does not require labour-intensive labels, such as cuboid annotations, for training. Results on the NYU Depth v2 dataset demonstrate that the proposed algorithm successfully abstracts cluttered real-world 3D scene layouts.
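As a rough illustration of the sample-and-score loop behind a RANSAC estimator, the sketch below fits a 2D line rather than cuboids, and it samples uniformly rather than under the guidance of a neural network. The function name `ransac_line`, the residual threshold, and the toy data are assumptions made for illustration; none of them come from the article.

```python
import random


def ransac_line(points, n_iters=200, threshold=0.1, seed=0):
    """Fit y = slope * x + intercept to 2D points with plain RANSAC.

    Illustrative only: uniform sampling, a line model, hypothetical defaults.
    """
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(n_iters):
        # Minimal sample: two points define one line hypothesis.
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # a vertical line cannot be written as y = m*x + b
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        # Score the hypothesis by counting points with a small vertical residual.
        inliers = [
            (x, y) for x, y in points
            if abs(y - (slope * x + intercept)) <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (slope, intercept), inliers
    return best_model, best_inliers


# Toy data: eight points on y = 2x + 1 plus two gross outliers.
points = [(float(x), 2.0 * x + 1.0) for x in range(8)] + [(1.0, 10.0), (5.0, -7.0)]
model, inliers = ransac_line(points)
```

The hypothesis with the most inliers wins, so the two outliers do not pull the fit the way they would in a least-squares fit over all points.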

2.
IEEE Trans Pattern Anal Mach Intell ; 45(9): 11169-11183, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37074895

ABSTRACT

Different objects in the same scene are more or less related to each other, but only a limited number of these relationships are noteworthy. Inspired by Detection Transformer, which excels in object detection, we view scene graph generation as a set prediction problem. In this article, we propose an end-to-end scene graph generation model Relation Transformer (RelTR), which has an encoder-decoder architecture. The encoder reasons about the visual feature context while the decoder infers a fixed-size set of triplets subject-predicate-object using different types of attention mechanisms with coupled subject and object queries. We design a set prediction loss performing the matching between the ground truth and predicted triplets for the end-to-end training. In contrast to most existing scene graph generation methods, RelTR is a one-stage method that predicts sparse scene graphs directly only using visual appearance without combining entities and labeling all possible predicates. Extensive experiments on the Visual Genome, Open Images V6, and VRD datasets demonstrate the superior performance and fast inference of our model.
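One way to picture the matching step of a set prediction loss: each ground-truth triplet is paired with a distinct predicted triplet so that the total pairing cost is minimal, and the training loss is then computed over the matched pairs. The abstract does not state the matching algorithm or the cost, so the brute-force search, the hypothetical `match_cost` (a count of mismatched triplet slots), and the toy triplets below are assumptions for illustration only.

```python
from itertools import permutations


def match_cost(pred, truth):
    """Hypothetical cost: number of differing slots in (subject, predicate, object)."""
    return sum(p != t for p, t in zip(pred, truth))


def best_matching(preds, truths):
    """Return (total_cost, assignment) minimising cost over one-to-one matchings.

    assignment[i] is the index of the prediction matched to truths[i].
    Brute force is O(n!), so this is only sensible for tiny examples.
    """
    best = None
    for perm in permutations(range(len(preds)), len(truths)):
        cost = sum(match_cost(preds[j], truths[i]) for i, j in enumerate(perm))
        if best is None or cost < best[0]:
            best = (cost, perm)
    return best


truths = [("man", "riding", "horse"), ("horse", "on", "grass")]
preds = [
    ("horse", "on", "grass"),
    ("dog", "near", "tree"),
    ("man", "riding", "bike"),
]
total_cost, assignment = best_matching(preds, truths)
```

Here the second prediction is left unmatched, which mirrors how a fixed-size set of predictions can exceed the number of annotated triplets.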

3.
Ultrasonography ; : 214-226, 2023.
Article in English | WPRIM (Western Pacific) | ID: wpr-969237

ABSTRACT

PURPOSE: Carotid vessel wall volume (VWV) measurement on three-dimensional ultrasonography (3DUS) outperforms conventional two-dimensional ultrasonography for carotid atherosclerosis evaluation. Although time-saving semi-automated algorithms have been introduced, their clinical availability remains limited due to a lack of validation, particularly an extensive reliability analysis. This study compared inter-observer and intra-observer reliability between manual segmentation and semi-automated segmentation for carotid VWV measurements on 3DUS. METHODS: Thirty-one 3DUS volume datasets were prospectively acquired from 20 healthy subjects, aged >18 years, without previous stroke, transient ischemic attack, or cardiovascular disease. Five observers segmented all volume datasets both manually and semi-automatically. The process was repeated five times. Reliability was expressed by the intraclass correlation coefficient, supplemented by the coefficient of variation. RESULTS: Carotid VWV measurements using the common carotid artery (CCA) were more reliable than those using the internal carotid artery (ICA) or external carotid artery (ECA) for both manual and semi-automated segmentation (manual segmentation, CCA: inter-observer, 0.935; intra-observer, 0.934 to 0.966; ICA: inter-observer, 0.784; intra-observer, 0.756 to 0.878; ECA: inter-observer, 0.732; intra-observer, 0.919 to 0.962; semi-automated segmentation, CCA: inter-observer, 0.986; intra-observer, 0.954 to 0.993; ICA: inter-observer, 0.977; intra-observer, 0.958 to 0.978; ECA: inter-observer, 0.966; intra-observer, 0.884 to 0.937). Total carotid VWV measurements by manual (inter-observer, 0.922; intra-observer, 0.927 to 0.961) and semi-automated segmentation (inter-observer, 0.987; intra-observer, 0.968 to 0.989) were highly reliable. Semi-automated segmentation showed higher reliability than manual segmentation for both individual and total carotid VWV measurements. CONCLUSION: 3DUS carotid VWV measurements of the CCA are more reliable than measurements of the ICA and ECA. Total carotid VWV measurements are highly reliable. Semi-automated segmentation has higher reliability than manual segmentation.
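The coefficient of variation mentioned in the methods is the standard deviation of repeated measurements divided by their mean. The sketch below shows that calculation under the assumption that the sample standard deviation is used and the result is reported as a percentage; the readings are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def coefficient_of_variation(measurements):
    """Return the sample standard deviation divided by the mean, as a percentage."""
    avg = mean(measurements)
    if avg == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return 100.0 * stdev(measurements) / avg


# Hypothetical repeated VWV readings (arbitrary units) by one observer.
repeated_vwv = [98.0, 100.0, 102.0, 100.0, 100.0]
cv_percent = coefficient_of_variation(repeated_vwv)
```

A lower percentage means the repeated readings cluster more tightly around their mean, which is why the measure can supplement the intraclass correlation coefficient.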

4.
IEEE Trans Pattern Anal Mach Intell ; 44(11): 7778-7796, 2022 11.
Article in English | MEDLINE | ID: mdl-34613910

ABSTRACT

In the past decade, object detection has achieved significant progress in natural images but not in aerial images, due to the massive variations in the scale and orientation of objects caused by the bird's-eye view of aerial images. More importantly, the lack of large-scale benchmarks has become a major obstacle to the development of object detection in aerial images (ODAI). In this paper, we present a large-scale Dataset of Object deTection in Aerial images (DOTA) and comprehensive baselines for ODAI. The proposed DOTA dataset contains 1,793,658 object instances of 18 categories of oriented-bounding-box annotations collected from 11,268 aerial images. Based on this large-scale and well-annotated dataset, we build baselines covering 10 state-of-the-art algorithms with over 70 configurations, where the speed and accuracy performances of each model have been evaluated. Furthermore, we provide a code library for ODAI and build a website for evaluating different algorithms. Previous challenges run on DOTA have attracted more than 1300 teams worldwide. We believe that the expanded large-scale DOTA dataset, the extensive baselines, the code library and the challenges can facilitate the designs of robust algorithms and reproducible research on the problem of object detection in aerial images.


Subject(s)
Algorithms , Benchmarking
5.
Front Psychol ; 13: 1054249, 2022.
Article in English | MEDLINE | ID: mdl-36619026

ABSTRACT

Well-defined key competencies for students with autism spectrum disorders (ASD) help develop curriculum and pedagogies that emphasize what students with ASD are expected to learn, to know and to do. Most of the current research on the key competencies of ASD is theoretical and based on the social and cultural backgrounds of western countries. The key competencies defined by most of the research lack the support of empirical evidence. This study sought to identify the key competencies of school-age students with ASD from the perspectives of teachers and parents. Based on a review of existing key competencies frameworks, a key competencies instrument that consisted of 76 learning outcome items in eight domain areas was developed. An online survey to explore the teachers' and parents' views of the key competencies was conducted with 1,618 teachers and 2,430 parents of students with ASD across China. The results showed that teachers believed that the key competencies should consist of eight domain areas including social-communication, learning skills, healthy living, play, motor, emotion, sensory processing, and cognition, while the cognition-related competencies were not recognized by parents. The competencies in social-communication, learning skills, and healthy living had higher variance contributions. From the perspective of teachers, the variance contribution of social-communication was the highest, while from the perspective of parents, the variance contribution of learning skills was the largest. Taken together, the key competencies framework for students with ASD should include eight dimensions and 75 learning outcome items. The similarities and differences between the perspectives of the two groups are discussed. The findings could provide empirical data to assist in developing educational guidelines and guide the development of models of support for students with ASD.

6.
Opt Express ; 28(2): 2263-2275, 2020 Jan 20.
Article in English | MEDLINE | ID: mdl-32121920

ABSTRACT

Digital projectors have been increasingly utilized in various commercial and scientific applications. However, they are prone to the out-of-focus blurring problem since their depth-of-fields are typically limited. In this paper, we explore the feasibility of utilizing a deep learning-based approach to analyze the spatially-varying and depth-dependent defocus properties of digital projectors. A multimodal displaying/imaging system is built for capturing images projected at various depths. Based on the constructed dataset containing well-aligned in-focus, out-of-focus, and depth images, we propose a novel multi-channel residual deep network model to learn the end-to-end mapping function between the in-focus and out-of-focus image patches captured at different spatial locations and depths. To the best of our knowledge, it is the first research work revealing that the complex spatially-varying and depth-dependent blurring effects can be accurately learned from a number of real-captured image pairs instead of being hand-crafted as before. Experimental results demonstrate that our proposed deep learning-based method significantly outperforms the state-of-the-art defocus kernel estimation techniques and thus leads to better out-of-focus compensation for extending the dynamic ranges of digital projectors.

7.
Clin Ophthalmol ; 12: 1877-1885, 2018.
Article in English | MEDLINE | ID: mdl-30310267

ABSTRACT

PURPOSE: To determine the levels of interleukin (IL)-6, vascular endothelial growth factor-A, platelet-derived growth factor, placental growth factor (PLGF), and other cytokines in the aqueous fluid of patients with neovascular age-related macular degeneration who respond poorly to ranibizumab. PATIENTS AND METHODS: This is an observational, prospective study. Thirty-two eyes from 30 patients were included in the study: 11 patients who responded poorly to ranibizumab and were switched to aflibercept (AF group), 8 patients who received ranibizumab and photodynamic therapy (PDT group), and 13 patients who responded to ranibizumab (control group). Aqueous fluid samples were collected for analysis of cytokine levels at baseline and after 1, 2, and 3 months of treatment. The effect of treatment on cytokine levels was compared between the study groups and between different time points using a linear mixed-effect regression model. RESULTS: In the AF group, there was an increase in vascular endothelial growth factor-C, IL-7, and angiopoietin-2 levels (P=0.01) and a decrease in intercellular adhesion molecule and IL-17 levels (P=0.01) between baseline and 3 months. After adjustment for age, sex, race, and type of lesion at baseline, the PLGF level was higher (P=0.02) and the IL-7 level was lower (P=0.04) in the ranibizumab non-responder group than in the ranibizumab responder group. CONCLUSION: Switching from ranibizumab to aflibercept did not reduce intraocular levels of angiogenesis cytokines, but resulted in improvement of central subfield thickness. PLGF levels were higher in poor responders to ranibizumab. The response of lesions to medication might be related to the stage of choroidal neovascularization. TRIAL REGISTRATION: www.ClinicalTrial.gov (NCT02218177c).

9.
Biomed Tech (Berl) ; 61(4): 413-29, 2016 Aug 01.
Article in English | MEDLINE | ID: mdl-26351901

ABSTRACT

This paper presents a novel fully automatic framework for multi-class brain tumor classification and segmentation using a sparse coding and dictionary learning method. The proposed framework consists of two steps: classification and segmentation. The classification of the brain tumors is based on brain topology and texture. The segmentation is based on voxel values of the image data. Using K-SVD, two types of dictionaries are learned from the training data and their associated ground truth segmentation: feature dictionary and voxel-wise coupled dictionaries. The feature dictionary consists of global image features (topological and texture features). The coupled dictionaries consist of coupled information: gray scale voxel values of the training image data and their associated label voxel values of the ground truth segmentation of the training data. For quantitative evaluation, the proposed framework is evaluated using different metrics. The segmentation results of the brain tumor segmentation (MICCAI-BraTS-2013) database are evaluated using five different metric scores, which are computed using the online evaluation tool provided by the BraTS-2013 challenge organizers. Experimental results demonstrate that the proposed approach achieves an accurate brain tumor classification and segmentation and outperforms the state-of-the-art methods.


Subject(s)
Brain Neoplasms/diagnostic imaging , Image Interpretation, Computer-Assisted/methods , Algorithms , Databases, Factual , Humans
10.
Biomed Tech (Berl) ; 61(4): 401-12, 2016 Aug 01.
Article in English | MEDLINE | ID: mdl-26501155

ABSTRACT

Automatic 3D liver segmentation is a fundamental step in the liver disease diagnosis and surgery planning. This paper presents a novel fully automatic algorithm for 3D liver segmentation in clinical 3D computed tomography (CT) images. Based on image features, we propose a new Mahalanobis distance cost function using an active shape model (ASM). We call our method MD-ASM. Unlike the standard active shape model (ST-ASM), the proposed method introduces a new feature-constrained Mahalanobis distance cost function to measure the distance between the generated shape during the iterative step and the mean shape model. The proposed Mahalanobis distance function is learned from a public database of liver segmentation challenge (MICCAI-SLiver07). As a refinement step, we propose the use of a 3D graph-cut segmentation. Foreground and background labels are automatically selected using texture features of the learned Mahalanobis distance. Quantitatively, the proposed method is evaluated using two clinical 3D CT scan databases (MICCAI-SLiver07 and MIDAS). The evaluation of the MICCAI-SLiver07 database is obtained by the challenge organizers using five different metric scores. The experimental results demonstrate the availability of the proposed method by achieving an accurate liver segmentation compared to the state-of-the-art methods.
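For readers unfamiliar with the underlying measure, the Mahalanobis distance of a vector x from a mean μ under covariance Σ is sqrt((x − μ)ᵀ Σ⁻¹ (x − μ)). The sketch below computes only this textbook quantity for a 2D point with a closed-form 2x2 inverse. It is not the feature-constrained, learned cost function of MD-ASM; the function name `mahalanobis_2d` and the example values are assumptions for illustration.

```python
import math


def mahalanobis_2d(x, mean, cov):
    """Mahalanobis distance of a 2D point x from mean under a 2x2 covariance.

    cov is ((a, b), (b, c)) and must be symmetric positive definite.
    """
    (a, b), (b2, c) = cov
    det = a * c - b * b2
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Closed-form inverse of the 2x2 matrix applied to the difference vector.
    squared = (c * dx * dx - (b + b2) * dx * dy + a * dy * dy) / det
    return math.sqrt(squared)


# With an identity covariance the result equals the Euclidean distance.
d_identity = mahalanobis_2d((3.0, 4.0), (0.0, 0.0), ((1.0, 0.0), (0.0, 1.0)))
# A larger variance along x shrinks the contribution of the x offset.
d_scaled = mahalanobis_2d((2.0, 1.0), (0.0, 0.0), ((4.0, 0.0), (0.0, 1.0)))
```

Scaling each direction by its variance is what lets such a distance compare a candidate shape with a mean shape without favouring directions in which the training shapes naturally vary more.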


Subject(s)
Imaging, Three-Dimensional/methods , Liver/diagnostic imaging , Tomography, X-Ray Computed , Algorithms , Databases, Factual , Humans , Models, Theoretical , Tomography, X-Ray Computed/methods
11.
Comput Methods Programs Biomed ; 137: 329-339, 2016 Dec.
Article in English | MEDLINE | ID: mdl-28110736

ABSTRACT

BACKGROUND AND OBJECTIVE: This paper presents a novel method for Alzheimer's disease classification via an automatic 3D caudate nucleus segmentation. METHODS: The proposed method consists of segmentation and classification steps. In the segmentation step, we propose a novel level set cost function. The proposed cost function is constrained by a sparse representation of local image features using a dictionary learning method. We present coupled dictionaries: a feature dictionary of a grayscale brain image and a label dictionary of a caudate nucleus label image. Using online dictionary learning, the coupled dictionaries are learned from the training data. The learned coupled dictionaries are embedded into a level set function. In the classification step, a region-based feature dictionary is built. The region-based feature dictionary is learned from shape features of the caudate nucleus in the training data. The classification is based on the measure of the similarity between the sparse representation of region-based shape features of the segmented caudate in the test image and the region-based feature dictionary. RESULTS: The experimental results demonstrate the superiority of our method over the state-of-the-art methods by achieving a high segmentation (91.5%) and classification (92.5%) accuracy. CONCLUSIONS: In this paper, we find that the study of the caudate nucleus atrophy gives an advantage over the study of whole brain structure atrophy to detect Alzheimer's disease.


Subject(s)
Alzheimer Disease/diagnosis , Automation , Caudate Nucleus/diagnostic imaging , Imaging, Three-Dimensional , Learning , Alzheimer Disease/classification , Humans
12.
Comput Med Imaging Graph ; 38(8): 725-34, 2014 Dec.
Article in English | MEDLINE | ID: mdl-24998760

ABSTRACT

Medical image segmentation and anatomical structure labeling according to tissue type are important for accurate diagnosis and therapy. In this paper, we propose a novel approach for multi-region labeling and segmentation, which is based on a topological graph prior and the topological information of an atlas, using a modified multi-level set energy minimization method in brain images. We consider a topological graph prior and atlas information to evolve the contour based on a topological relationship presented via a graph relation. This novel method is capable of segmenting adjacent objects with very close gray levels in low-resolution brain images that would be difficult to segment correctly using standard methods. The topological information of an atlas is transformed to the topological graph of a low-resolution (noisy) brain image to obtain region labeling. We explain our algorithm and show the topological graph prior and label transformation techniques to explain how it gives precise multi-region segmentation and labeling. The proposed algorithm is capable of segmenting and labeling different regions in noisy or low-resolution MRI brain images of different modalities. We compare our approach with other state-of-the-art approaches for multi-region labeling and segmentation.


Subject(s)
Brain/anatomy & histology , Documentation/methods , Image Interpretation, Computer-Assisted/methods , Magnetic Resonance Imaging/methods , Models, Anatomic , Pattern Recognition, Automated/methods , Subtraction Technique , Humans , Image Enhancement/methods , Sensitivity and Specificity , Terminology as Topic