Results 1 - 6 of 6
1.
IEEE Trans Vis Comput Graph ; 30(1): 694-704, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37871071

ABSTRACT

Open-world object detection (OWOD) is an emerging computer vision problem that involves not only identifying predefined object classes, as general object detectors do, but also detecting new, unknown objects simultaneously. Recently, several end-to-end deep learning models have been proposed to address the OWOD problem. However, these approaches face several challenges: a) significant changes to both the network architecture and the training procedure are required; b) they are trained from scratch and therefore cannot leverage existing pre-trained general detectors; c) costly annotations for all unknown classes are needed. To overcome these challenges, we present a visual analytics framework called OW-Adapter. It acts as an adapter that enables pre-trained general object detectors to handle the OWOD problem. Specifically, OW-Adapter is designed to identify, summarize, and annotate unknown examples with minimal human effort. Moreover, we introduce a lightweight classifier to learn newly annotated unknown classes and plug the classifier into pre-trained general detectors to detect unknown objects. We demonstrate the effectiveness of our framework through two case studies in different domains: common object recognition and autonomous driving. The studies show that a simple yet powerful adapter can extend the capability of pre-trained general detectors to detect unknown objects while simultaneously improving performance on known classes.
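The plug-in classifier idea from this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the "region features" below are synthetic stand-ins for features a frozen, pre-trained detector would produce for object proposals, and the head is a plain softmax classifier trained with gradient descent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for region features produced by a frozen,
# pre-trained detector (64-d pooled feature per proposal), with three
# newly annotated "unknown" classes to learn.
n_per_class, dim, n_new_classes = 50, 64, 3
centers = rng.normal(0, 5, size=(n_new_classes, dim))
X = np.concatenate([centers[c] + rng.normal(size=(n_per_class, dim))
                    for c in range(n_new_classes)])
y = np.repeat(np.arange(n_new_classes), n_per_class)

# Lightweight classifier head: one linear layer + softmax, trained with
# plain gradient descent; the detector backbone is never modified.
W = np.zeros((dim, n_new_classes))
b = np.zeros(n_new_classes)
for _ in range(200):
    logits = X @ W + b
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    p[np.arange(len(y)), y] -= 1.0        # softmax cross-entropy gradient
    W -= 0.01 * X.T @ p / len(y)
    b -= 0.01 * p.mean(axis=0)

pred = (X @ W + b).argmax(axis=1)
accuracy = (pred == y).mean()
print(accuracy)
```

At inference time, such a head would score each proposal's features alongside the detector's original known-class scores, which is how the adapter extends detection to the new classes without retraining the detector.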

2.
J Comput Soc Sci ; 5(2): 1257-1279, 2022.
Article in English | MEDLINE | ID: mdl-35602668

ABSTRACT

VisualCommunity is a platform designed to support community- or neighborhood-scale research. The platform integrates mobile, AI, and visualization techniques, along with tools that help domain researchers, practitioners, and students collect and work with spatialized video and geo-narratives. These data, which provide granular spatialized imagery and associated context gained through expert commentary, have previously proven valuable for understanding various community-scale challenges. This paper further enhances this work with the AI-based image processing and speech transcription tools available in VisualCommunity, allowing for easy exploration of the acquired semantic and visual information about the area under investigation. We describe these specific advances through use case examples, including COVID-19-related scenarios.

3.
IEEE Trans Vis Comput Graph ; 28(1): 1019-1029, 2022 Jan.
Article in English | MEDLINE | ID: mdl-34596546

ABSTRACT

Vision-based deep learning (DL) methods have made great progress in learning autonomous driving models from large-scale crowd-sourced video datasets. They are trained to predict instantaneous driving behaviors from video data captured by on-vehicle cameras. In this paper, we develop a geo-context aware visualization system for the study of Autonomous Driving Model (ADM) predictions together with large-scale ADM video data. The visual study is seamlessly integrated with the geographical environment by combining DL model performance with geospatial visualization techniques. Model performance measures can be studied together with a set of geospatial attributes over map views. Users can also discover and compare prediction behaviors of multiple DL models in both city-wide and street-level analysis, together with road images and video contents. Therefore, the system provides a new visual exploration platform for DL model designers in autonomous driving. Use cases and domain expert evaluation show the utility and effectiveness of the visualization system.
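The core join between model performance and geography described here can be sketched very simply. This is an illustrative assumption, not the paper's system: suppose each video frame carries a prediction error and a road-segment ID from map matching; aggregating errors per segment yields the values a map view would color.

```python
import numpy as np

# Hypothetical per-frame records: a road-segment ID (from map matching)
# and the driving model's prediction error for that frame.
segment_ids = np.array([0, 0, 1, 1, 1, 2, 2])
errors      = np.array([0.1, 0.3, 0.5, 0.7, 0.9, 0.2, 0.4])

# Mean prediction error per road segment, ready to color a map view.
n_segments = segment_ids.max() + 1
sums = np.bincount(segment_ids, weights=errors, minlength=n_segments)
counts = np.bincount(segment_ids, minlength=n_segments)
mean_error = sums / counts
print(mean_error)   # approximately [0.2, 0.7, 0.3]
```

Street-level analysis would then drill from a high-error segment back to the frames (and video content) that produced it.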

4.
J Comput Soc Sci ; 4(2): 813-837, 2021.
Article in English | MEDLINE | ID: mdl-33718652

ABSTRACT

The complex interrelationship between the built environment and social problems is often described but frequently lacks the data and analytical framework needed to explore its potential in different applications. We address this gap using a machine learning (ML) approach to study whether street-level built-environment visuals can be used to classify locations with high-crime and lower-crime activities. To train the ML model, spatialized expert narratives are used to label different locations. Semantic categories (e.g., road, sky, greenery) are extracted from Google Street View (GSV) images of those locations through a deep learning image segmentation algorithm. From these, local visual representatives are generated and used to train the classification model. The model is applied to two cities in the U.S. to predict whether locations are linked to high crime. Results show our model can predict high- and lower-crime areas with high accuracy (above 98% and 95% in the first and second test cities, respectively).
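The feature-extraction step described here, turning a segmented street-view image into semantic-category proportions, can be sketched as follows. This is a minimal illustration under stated assumptions: the class list is hypothetical, and the segmentation label map is simulated rather than produced by a real model run on a GSV image.

```python
import numpy as np

# Hypothetical semantic classes a street-scene segmentation model might emit.
CLASSES = ["road", "sky", "greenery", "building", "sidewalk"]

def proportion_features(label_map: np.ndarray) -> np.ndarray:
    """Fraction of the image's pixels assigned to each semantic class."""
    counts = np.bincount(label_map.ravel(), minlength=len(CLASSES))
    return counts / label_map.size

# Simulated segmentation output for one 240x320 image; a real pipeline
# would obtain this from a deep segmentation model applied to a GSV image.
rng = np.random.default_rng(1)
label_map = rng.integers(0, len(CLASSES), size=(240, 320))

features = proportion_features(label_map)
print(dict(zip(CLASSES, features.round(3))))
```

Vectors like `features`, computed per location and paired with the narrative-derived high-/lower-crime labels, would form the training set for the classifier.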

5.
IEEE Comput Graph Appl ; 41(2): 49-62, 2021.
Article in English | MEDLINE | ID: mdl-32078538

ABSTRACT

Community-level event (CLE) datasets, such as police reports of crime events, contain abundant semantic information about event situations and descriptions in a geospatial-temporal context. They are critical for frontline users, such as police officers and social workers, to discover and examine insights about community neighborhoods. We propose CLEVis, a neighborhood visual analytics system for CLE datasets, to help frontline users explore events for insights at community regions of interest, that is, at fine-grained geographical resolutions such as small neighborhoods around local restaurants, churches, and schools. CLEVis fully utilizes semantic information by integrating automatic algorithms and interactive visualizations. The design and development of CLEVis were conducted in close collaboration with real-world community workers and social scientists. Case studies and user feedback are presented with real-world datasets and applications.

6.
IEEE Trans Vis Comput Graph ; 27(2): 1312-1321, 2021 02.
Article in English | MEDLINE | ID: mdl-33104509

ABSTRACT

Existing interactive visualization tools for deep learning are mostly applied to the training, debugging, and refinement of neural network models working on natural images. However, visual analytics tools are lacking for the specific application of x-ray image classification with multiple structural attributes. In this paper, we present an interactive system that lets domain scientists visually study multi-attribute learning models applied to x-ray scattering images. It allows them to interactively explore this important type of scientific image in embedded spaces defined on the model prediction output, the actual labels, and the discovered feature space of the neural networks. Users can flexibly select instance images and their clusters, and compare them with respect to a specified visual representation of attributes. The exploration is guided by the manifestation of model performance related to mutual relationships among attributes, which often affect learning accuracy and effectiveness. The system thus supports domain scientists in improving the training dataset and model, finding questionable attribute labels, and identifying outlier images or spurious data clusters. Case studies and scientists' feedback demonstrate its functionalities and usefulness.
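One way to build an "embedded space" over a network's discovered feature space, as described above, is a linear projection such as PCA. This is a hedged sketch, not the paper's method: the 128-d feature vectors below are simulated stand-ins for what a trained network would produce for x-ray scattering images.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical feature vectors from a trained network for 200 x-ray
# scattering images (128-d), built with two loose clusters so the
# projection has visible structure.
feats = np.concatenate([rng.normal(0, 1, (100, 128)),
                        rng.normal(3, 1, (100, 128))])

# PCA via SVD: center the features, then project onto the top-2
# principal directions to get 2-D coordinates for a scatter plot.
centered = feats - feats.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
embedding = centered @ vt[:2].T     # shape (200, 2)
print(embedding.shape)
```

Coloring such a scatter plot by predicted versus actual attribute labels is what lets users spot questionable labels and outlier images as clusters or stragglers in the embedded space.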
