Pesquisa | Portal Regional da BVS

1.

Visualization for Trust in Machine Learning Revisited: The State of the Field in 2023.

Chatzimparmpas, Angelos; Kucher, Kostiantyn; Kerren, Andreas.

IEEE Comput Graph Appl ; 44(3): 99-113, 2024.

Artigo em Inglês | MEDLINE | ID: mdl-38294921

RESUMO

Visualization for explainable and trustworthy machine learning remains one of the most important and heavily researched fields within information visualization and visual analytics with various application domains, such as medicine, finance, and bioinformatics. After our 2020 state-of-the-art report comprising 200 techniques, we have persistently collected peer-reviewed articles describing visualization techniques, categorized them based on the previously established categorization schema consisting of 119 categories, and provided the resulting collection of 542 techniques in an online survey browser. In this survey article, we present the updated findings of new analyses of this dataset as of fall 2023 and discuss trends, insights, and eight open challenges for using visualizations in machine learning. Our results corroborate the rapidly growing trend of visualization techniques for increasing trust in machine learning models in the past three years, with visualization found to help improve popular model explainability methods and check new deep learning architectures, for instance.

2.

FeatureEnVi: Visual Analytics for Feature Engineering Using Stepwise Selection and Semi-Automatic Extraction Approaches.

Chatzimparmpas, Angelos; Martins, Rafael M; Kucher, Kostiantyn; Kerren, Andreas.

IEEE Trans Vis Comput Graph ; 28(4): 1773-1791, 2022 04.

Artigo em Inglês | MEDLINE | ID: mdl-34990365

RESUMO

The machine learning (ML) life cycle involves a series of iterative steps, from the effective gathering and preparation of the data-including complex feature engineering processes-to the presentation and improvement of results, with various algorithms to choose from in every step. Feature engineering in particular can be very beneficial for ML, leading to numerous improvements such as boosting the predictive results, decreasing computational times, reducing excessive noise, and increasing the transparency behind the decisions taken during the training. Despite that, while several visual analytics tools exist to monitor and control the different stages of the ML life cycle (especially those related to data and algorithms), feature engineering support remains inadequate. In this paper, we present FeatureEnVi, a visual analytics system specifically designed to assist with the feature engineering process. Our proposed system helps users to choose the most important feature, to transform the original features into powerful alternatives, and to experiment with different feature generation combinations. Additionally, data space slicing allows users to explore the impact of features on both local and global scales. FeatureEnVi utilizes multiple automatic feature selection techniques; furthermore, it visually guides users with statistical evidence about the influence of each feature (or subsets of features). The final outcome is the extraction of heavily engineered features, evaluated by multiple validation metrics. The usefulness and applicability of FeatureEnVi are demonstrated with two use cases and a case study. We also report feedback from interviews with two ML experts and a visualization researcher who assessed the effectiveness of our system.

Assuntos

Gráficos por Computador , Aprendizado de Máquina , Algoritmos

3.

StackGenVis: Alignment of Data, Algorithms, and Models for Stacking Ensemble Learning Using Performance Metrics.

Chatzimparmpas, Angelos; Martins, Rafael M; Kucher, Kostiantyn; Kerren, Andreas.

IEEE Trans Vis Comput Graph ; 27(2): 1547-1557, 2021 02.

Artigo em Inglês | MEDLINE | ID: mdl-33048687

RESUMO

In machine learning (ML), ensemble methods-such as bagging, boosting, and stacking-are widely-established approaches that regularly achieve top-notch predictive performance. Stacking (also called "stacked generalization") is an ensemble method that combines heterogeneous base models, arranged in at least one layer, and then employs another metamodel to summarize the predictions of those models. Although it may be a highly-effective approach for increasing the predictive performance of ML, generating a stack of models from scratch can be a cumbersome trial-and-error process. This challenge stems from the enormous space of available solutions, with different sets of data instances and features that could be used for training, several algorithms to choose from, and instantiations of these algorithms using diverse parameters (i.e., models) that perform differently according to various metrics. In this work, we present a knowledge generation model, which supports ensemble learning with the use of visualization, and a visual analytics system for stacked generalization. Our system, StackGenVis, assists users in dynamically adapting performance metrics, managing data instances, selecting the most important features for a given data set, choosing a set of top-performant and diverse algorithms, and measuring the predictive performance. In consequence, our proposed tool helps users to decide between distinct models and to reduce the complexity of the resulting stack by removing overpromising and underperforming models. The applicability and effectiveness of StackGenVis are demonstrated with two use cases: a real-world healthcare data set and a collection of data related to sentiment/stance detection in texts. Finally, the tool has been evaluated through interviews with three ML experts.

4.

BioVis Explorer: A visual guide for biological data visualization techniques.

Kerren, Andreas; Kucher, Kostiantyn; Li, Yuan-Fang; Schreiber, Falk.

PLoS One ; 12(11): e0187341, 2017.

Artigo em Inglês | MEDLINE | ID: mdl-29091942

RESUMO

Data visualization is of increasing importance in the Biosciences. During the past 15 years, a great number of novel methods and tools for the visualization of biological data have been developed and published in various journals and conference proceedings. As a consequence, keeping an overview of state-of-the-art visualization research has become increasingly challenging for both biology researchers and visualization researchers. To address this challenge, we have reviewed visualization research especially performed for the Biosciences and created an interactive web-based visualization tool, the BioVis Explorer. BioVis Explorer allows the exploration of published visualization methods in interactive and intuitive ways, including faceted browsing and associations with related methods. The tool is publicly available online and has been designed as community-based system which allows users to add their works easily.

Assuntos

Interface Usuário-Computador , Algoritmos , Armazenamento e Recuperação da Informação , Internet

5.

Visual analysis of online social media to open up the investigation of stance phenomena.

Kucher, Kostiantyn; Schamp-Bjerede, Teri; Kerren, Andreas; Paradis, Carita; Sahlgren, Magnus.

Inf Vis ; 15(2): 93-116, 2016 Apr.

Artigo em Inglês | MEDLINE | ID: mdl-29249903

RESUMO

Online social media are a perfect text source for stance analysis. Stance in human communication is concerned with speaker attitudes, beliefs, feelings and opinions. Expressions of stance are associated with the speakers' view of what they are talking about and what is up for discussion and negotiation in the intersubjective exchange. Taking stance is thus crucial for the social construction of meaning. Increased knowledge of stance can be useful for many application fields such as business intelligence, security analytics, or social media monitoring. In order to process large amounts of text data for stance analyses, linguists need interactive tools to explore the textual sources as well as the processed data based on computational linguistics techniques. Both original texts and derived data are important for refining the analyses iteratively. In this work, we present a visual analytics tool for online social media text data that can be used to open up the investigation of stance phenomena. Our approach complements traditional linguistic analysis techniques and is based on the analysis of utterances associated with two stance categories: sentiment and certainty. Our contributions include (1) the description of a novel web-based solution for analyzing the use and patterns of stance meanings and expressions in human communication over time; and (2) specialized techniques used for visualizing analysis provenance and corpus overview/navigation. We demonstrate our approach by means of text media on a highly controversial scandal with regard to expressions of anger and provide an expert review from linguists who have been using our tool.

RESUMO

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA