Search | BVS Regional Portal

1.

Visualization for Trust in Machine Learning Revisited: The State of the Field in 2023.

Chatzimparmpas, Angelos; Kucher, Kostiantyn; Kerren, Andreas.

IEEE Comput Graph Appl ; 44(3): 99-113, 2024.

Article in English | MEDLINE | ID: mdl-38294921

ABSTRACT

Visualization for explainable and trustworthy machine learning remains one of the most important and heavily researched fields within information visualization and visual analytics with various application domains, such as medicine, finance, and bioinformatics. After our 2020 state-of-the-art report comprising 200 techniques, we have persistently collected peer-reviewed articles describing visualization techniques, categorized them based on the previously established categorization schema consisting of 119 categories, and provided the resulting collection of 542 techniques in an online survey browser. In this survey article, we present the updated findings of new analyses of this dataset as of fall 2023 and discuss trends, insights, and eight open challenges for using visualizations in machine learning. Our results corroborate the rapidly growing trend of visualization techniques for increasing trust in machine learning models in the past three years, with visualization found to help improve popular model explainability methods and check new deep learning architectures, for instance.

2.

2D, 2.5D, or 3D? An Exploratory Study on Multilayer Network Visualisations in Virtual Reality.

Feyer, Stefan P; Pinaud, Bruno; Kobourov, Stephen; Brich, Nicolas; Krone, Michael; Kerren, Andreas; Behrisch, Michael; Schreiber, Falk; Klein, Karsten.

IEEE Trans Vis Comput Graph ; 30(1): 469-479, 2024 Jan.

Article in English | MEDLINE | ID: mdl-37883262

ABSTRACT

Relational information between different types of entities is often modelled by a multilayer network (MLN) - a network with subnetworks represented by layers. The layers of an MLN can be arranged in different ways in a visual representation, however, the impact of the arrangement on the readability of the network is an open question. Therefore, we studied this impact for several commonly occurring tasks related to MLN analysis. Additionally, layer arrangements with a dimensionality beyond 2D, which are common in this scenario, motivate the use of stereoscopic displays. We ran a human subject study utilising a Virtual Reality headset to evaluate 2D, 2.5D, and 3D layer arrangements. The study employs six analysis tasks that cover the spectrum of an MLN task taxonomy, from path finding and pattern identification to comparisons between and across layers. We found no clear overall winner. However, we explore the task-to-arrangement space and derive empirical-based recommendations on the effective use of 2D, 2.5D, and 3D layer arrangements for MLNs.

3.

FeatureEnVi: Visual Analytics for Feature Engineering Using Stepwise Selection and Semi-Automatic Extraction Approaches.

Chatzimparmpas, Angelos; Martins, Rafael M; Kucher, Kostiantyn; Kerren, Andreas.

IEEE Trans Vis Comput Graph ; 28(4): 1773-1791, 2022 Apr.

Article in English | MEDLINE | ID: mdl-34990365

ABSTRACT

The machine learning (ML) life cycle involves a series of iterative steps, from the effective gathering and preparation of the data (including complex feature engineering processes) to the presentation and improvement of results, with various algorithms to choose from in every step. Feature engineering in particular can be very beneficial for ML, leading to numerous improvements such as boosting the predictive results, decreasing computational times, reducing excessive noise, and increasing the transparency behind the decisions taken during the training. Despite that, while several visual analytics tools exist to monitor and control the different stages of the ML life cycle (especially those related to data and algorithms), feature engineering support remains inadequate. In this paper, we present FeatureEnVi, a visual analytics system specifically designed to assist with the feature engineering process. Our proposed system helps users to choose the most important feature, to transform the original features into powerful alternatives, and to experiment with different feature generation combinations. Additionally, data space slicing allows users to explore the impact of features on both local and global scales. FeatureEnVi utilizes multiple automatic feature selection techniques; furthermore, it visually guides users with statistical evidence about the influence of each feature (or subsets of features). The final outcome is the extraction of heavily engineered features, evaluated by multiple validation metrics. The usefulness and applicability of FeatureEnVi are demonstrated with two use cases and a case study. We also report feedback from interviews with two ML experts and a visualization researcher who assessed the effectiveness of our system.

Subjects

Computer Graphics; Machine Learning; Algorithms

4.

Toward a Quantitative Survey of Dimension Reduction Techniques.

Espadoto, Mateus; Martins, Rafael M; Kerren, Andreas; Hirata, Nina S T; Telea, Alexandru C.

IEEE Trans Vis Comput Graph ; 27(3): 2153-2173, 2021 Mar.

Article in English | MEDLINE | ID: mdl-31567092

ABSTRACT

Dimensionality reduction methods, also known as projections, are frequently used in multidimensional data exploration in machine learning, data science, and information visualization. Tens of such techniques have been proposed, aiming to address a wide set of requirements, such as ability to show the high-dimensional data structure, distance or neighborhood preservation, computational scalability, stability to data noise and/or outliers, and practical ease of use. However, it is far from clear for practitioners how to choose the best technique for a given use context. We present a survey of a wide body of projection techniques that helps answer this question. For this, we characterize the input data space, projection techniques, and the quality of projections, by several quantitative metrics. We sample these three spaces according to these metrics, aiming at good coverage with bounded effort. We describe our measurements and outline observed dependencies of the measured variables. Based on these results, we draw several conclusions that help compare projection techniques, explain their results for different types of data, and ultimately help practitioners when choosing a projection for a given context. Our methodology, datasets, projection implementations, metrics, visualizations, and results are publicly open, so interested stakeholders can examine and/or extend this benchmark.

5.

StackGenVis: Alignment of Data, Algorithms, and Models for Stacking Ensemble Learning Using Performance Metrics.

Chatzimparmpas, Angelos; Martins, Rafael M; Kucher, Kostiantyn; Kerren, Andreas.

IEEE Trans Vis Comput Graph ; 27(2): 1547-1557, 2021 Feb.

Article in English | MEDLINE | ID: mdl-33048687

ABSTRACT

In machine learning (ML), ensemble methods, such as bagging, boosting, and stacking, are widely established approaches that regularly achieve top-notch predictive performance. Stacking (also called "stacked generalization") is an ensemble method that combines heterogeneous base models, arranged in at least one layer, and then employs another metamodel to summarize the predictions of those models. Although it may be a highly effective approach for increasing the predictive performance of ML, generating a stack of models from scratch can be a cumbersome trial-and-error process. This challenge stems from the enormous space of available solutions, with different sets of data instances and features that could be used for training, several algorithms to choose from, and instantiations of these algorithms using diverse parameters (i.e., models) that perform differently according to various metrics. In this work, we present a knowledge generation model, which supports ensemble learning with the use of visualization, and a visual analytics system for stacked generalization. Our system, StackGenVis, assists users in dynamically adapting performance metrics, managing data instances, selecting the most important features for a given data set, choosing a set of top-performant and diverse algorithms, and measuring the predictive performance. In consequence, our proposed tool helps users to decide between distinct models and to reduce the complexity of the resulting stack by removing overpromising and underperforming models. The applicability and effectiveness of StackGenVis are demonstrated with two use cases: a real-world healthcare data set and a collection of data related to sentiment/stance detection in texts. Finally, the tool has been evaluated through interviews with three ML experts.

6.

t-viSNE: Interactive Assessment and Interpretation of t-SNE Projections.

Chatzimparmpas, Angelos; Martins, Rafael M; Kerren, Andreas.

IEEE Trans Vis Comput Graph ; 26(8): 2696-2714, 2020 Aug.

Article in English | MEDLINE | ID: mdl-32305922

ABSTRACT

t-Distributed Stochastic Neighbor Embedding (t-SNE) for the visualization of multidimensional data has proven to be a popular approach, with successful applications in a wide range of domains. Despite their usefulness, t-SNE projections can be hard to interpret or even misleading, which hurts the trustworthiness of the results. Understanding the details of t-SNE itself and the reasons behind specific patterns in its output may be a daunting task, especially for non-experts in dimensionality reduction. In this article, we present t-viSNE, an interactive tool for the visual exploration of t-SNE projections that enables analysts to inspect different aspects of their accuracy and meaning, such as the effects of hyper-parameters, distance and neighborhood preservation, densities and costs of specific neighborhoods, and the correlations between dimensions and visual patterns. We propose a coherent, accessible, and well-integrated collection of different views for the visualization of t-SNE projections. The applicability and usability of t-viSNE are demonstrated through hypothetical usage scenarios with real data sets. Finally, we present the results of a user study where the tool's effectiveness was evaluated. By bringing to light information that would normally be lost after running t-SNE, we hope to support analysts in using t-SNE and making its results better understandable.

7.

Finding Reasons for Vaccination Hesitancy: Evaluating Semi-Automatic Coding of Internet Discussion Forums.

Skeppstedt, Maria; Kerren, Andreas; Stede, Manfred.

Stud Health Technol Inform ; 264: 348-352, 2019 Aug 21.

Article in English | MEDLINE | ID: mdl-31437943

ABSTRACT

Computer-assisted text coding can facilitate the analysis of large text collections. To evaluate the functionality of providing an analyst with a ranked list of suggestions for suitable text codes, we used a data set of discussion posts, which had been manually coded for reasons given for taking a stance on the topic of vaccination. We trained a logistic regression classifier to rank these reasons according to the probability that they would be present in the post. The approach was evaluated for its ability to include the expected reasons among the n top-ranked reasons, using an n between 1 and 6. The logistic regression-based ranking was more effective than the baseline, which ranked reasons according to their frequency in the training data. Providing such a list of possible codes, ranked by logistic regression, could therefore be a useful feature in a tool for text coding.

Subjects

Vaccination; Internet; Social Media
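
The top-n evaluation described in the abstract of entry 7 can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each post already has a probability per reason (for example from a logistic regression classifier), it measures the share of expected reasons found among the n top-ranked ones, and the reason names and toy data are hypothetical.

```python
from collections import Counter


def top_n(scores, n):
    """Return the n labels with the highest scores (ties broken by label name)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [label for label, _ in ranked[:n]]


def recall_at_n(all_scores, gold, n):
    """Share of expected labels found among the n top-ranked labels, over all posts."""
    hits = total = 0
    for scores, expected in zip(all_scores, gold):
        suggested = set(top_n(scores, n))
        hits += len(suggested & expected)
        total += len(expected)
    return hits / total if total else 0.0


def frequency_baseline(train_gold, labels):
    """Score each label by its frequency in the training data (same ranking for every post)."""
    counts = Counter(label for expected in train_gold for label in expected)
    return {label: counts[label] for label in labels}


# Hypothetical toy data: per-post probabilities such as a classifier might output.
labels = ["safety", "efficacy", "trust"]
model_scores = [
    {"safety": 0.7, "efficacy": 0.2, "trust": 0.1},
    {"safety": 0.1, "efficacy": 0.3, "trust": 0.6},
]
gold = [{"safety"}, {"trust", "efficacy"}]
train_gold = [{"safety"}, {"safety", "efficacy"}, {"trust"}]

baseline = frequency_baseline(train_gold, labels)
baseline_scores = [baseline] * len(gold)

# Compare the classifier-based ranking with the frequency baseline for several n.
for n in range(1, len(labels) + 1):
    print(n, recall_at_n(model_scores, gold, n), recall_at_n(baseline_scores, gold, n))
```

Separating the ranking step (`top_n`) from the scoring source keeps the comparison fair: the classifier and the frequency baseline are evaluated by exactly the same function.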

8.

Vaccine Hesitancy in Discussion Forums: Computer-Assisted Argument Mining with Topic Models.

Skeppstedt, Maria; Kerren, Andreas; Stede, Manfred.

Stud Health Technol Inform ; 247: 366-370, 2018.

Article in English | MEDLINE | ID: mdl-29677984

ABSTRACT

Arguments used when vaccination is debated on Internet discussion forums might give us valuable insights into reasons behind vaccine hesitancy. In this study, we applied automatic topic modelling on a collection of 943 discussion posts in which vaccine was debated, and six distinct discussion topics were detected by the algorithm. When manually coding the posts ranked as most typical for these six topics, a set of semantically coherent arguments were identified for each extracted topic. This indicates that topic modelling is a useful method for automatically identifying vaccine-related discussion topics and for identifying debate posts where these topics are discussed. This functionality could facilitate manual coding of salient arguments, and thereby form an important component in a system for computer-assisted coding of vaccine-related discussions.

Subjects

Data Mining; Internet; Vaccination Refusal; Vaccines; Vaccination

9.

BioVis Explorer: A visual guide for biological data visualization techniques.

Kerren, Andreas; Kucher, Kostiantyn; Li, Yuan-Fang; Schreiber, Falk.

PLoS One ; 12(11): e0187341, 2017.

Article in English | MEDLINE | ID: mdl-29091942

ABSTRACT

Data visualization is of increasing importance in the Biosciences. During the past 15 years, a great number of novel methods and tools for the visualization of biological data have been developed and published in various journals and conference proceedings. As a consequence, keeping an overview of state-of-the-art visualization research has become increasingly challenging for both biology researchers and visualization researchers. To address this challenge, we have reviewed visualization research especially performed for the Biosciences and created an interactive web-based visualization tool, the BioVis Explorer. BioVis Explorer allows the exploration of published visualization methods in interactive and intuitive ways, including faceted browsing and associations with related methods. The tool is publicly available online and has been designed as a community-based system which allows users to add their works easily.

Subjects

User-Computer Interface; Algorithms; Information Storage and Retrieval; Internet

10.

MobilityGraphs: Visual Analysis of Mass Mobility Dynamics via Spatio-Temporal Graphs and Clustering.

von Landesberger, Tatiana; Brodkorb, Felix; Roskosch, Philipp; Andrienko, Natalia; Andrienko, Gennady; Kerren, Andreas.

IEEE Trans Vis Comput Graph ; 22(1): 11-20, 2016 Jan.

Article in English | MEDLINE | ID: mdl-26529684

ABSTRACT

Learning more about people mobility is an important task for official decision makers and urban planners. Mobility data sets characterize the variation of the presence of people in different places over time as well as movements (or flows) of people between the places. The analysis of mobility data is challenging due to the need to analyze and compare spatial situations (i.e., presence and flows of people at certain time moments) and to gain an understanding of the spatio-temporal changes (variations of situations over time). Traditional flow visualizations usually fail due to massive clutter. Modern approaches offer limited support for investigating the complex variation of the movements over longer time periods.

11.

Visual analysis of online social media to open up the investigation of stance phenomena.

Kucher, Kostiantyn; Schamp-Bjerede, Teri; Kerren, Andreas; Paradis, Carita; Sahlgren, Magnus.

Inf Vis ; 15(2): 93-116, 2016 Apr.

Article in English | MEDLINE | ID: mdl-29249903

ABSTRACT

Online social media are a perfect text source for stance analysis. Stance in human communication is concerned with speaker attitudes, beliefs, feelings and opinions. Expressions of stance are associated with the speakers' view of what they are talking about and what is up for discussion and negotiation in the intersubjective exchange. Taking stance is thus crucial for the social construction of meaning. Increased knowledge of stance can be useful for many application fields such as business intelligence, security analytics, or social media monitoring. In order to process large amounts of text data for stance analyses, linguists need interactive tools to explore the textual sources as well as the processed data based on computational linguistics techniques. Both original texts and derived data are important for refining the analyses iteratively. In this work, we present a visual analytics tool for online social media text data that can be used to open up the investigation of stance phenomena. Our approach complements traditional linguistic analysis techniques and is based on the analysis of utterances associated with two stance categories: sentiment and certainty. Our contributions include (1) the description of a novel web-based solution for analyzing the use and patterns of stance meanings and expressions in human communication over time; and (2) specialized techniques used for visualizing analysis provenance and corpus overview/navigation. We demonstrate our approach by means of text media on a highly controversial scandal with regard to expressions of anger and provide an expert review from linguists who have been using our tool.

12.

Why integrate InfoVis and SciVis?: An example from systems biology.

Kerren, Andreas; Schreiber, Falk.

IEEE Comput Graph Appl ; 34(6): 69-73, 2014.

Article in English | MEDLINE | ID: mdl-25388233

ABSTRACT

The more-or-less artificial barrier between information visualization and scientific visualization hinders knowledge discovery. Having an integrated view of many aspects of the target data, including a seamlessly interwoven visual display of structural abstract data and 3D spatial information, could lead to new discoveries, insights, and scientific questions. Such a view also could reduce the user's cognitive load, that is, reduce the effort the user expends when comparing views.

Subjects

Systems Biology; Amino Acid Sequence; Molecular Sequence Data

13.

Information visualization courses for students with a computer science background.

Kerren, Andreas.

IEEE Comput Graph Appl ; 33(2): 12-15, 2013.

Article in English | MEDLINE | ID: mdl-24807935

ABSTRACT

Linnaeus University offers two master's courses in information visualization for computer science students with programming experience. This article briefly describes the syllabi, exercises, and practices developed for these courses.

Subjects

Computer Graphics; Computers; Curriculum; Students; Educational Measurement; Humans

14.

Visualization of particle interactions in granular media.

Meier, Holger A; Schlemmer, Michael; Wagner, Christian; Kerren, Andreas; Hagen, Hans; Kuhl, Ellen; Steinmann, Paul.

IEEE Trans Vis Comput Graph ; 14(5): 1110-1125, 2008.

Article in English | MEDLINE | ID: mdl-18599921

ABSTRACT

Interaction between particles in so-called granular media, such as soil and sand, plays an important role in the context of geomechanical phenomena and numerous industrial applications. A two-scale homogenization approach based on a micro and a macro scale level is briefly introduced in this paper. Computation of granular material in such a way gives a deeper insight into the context of discontinuous materials and at the same time reduces the computational costs. However, the description and the understanding of the phenomena in granular materials are not yet satisfactory. A sophisticated problem-specific visualization technique would significantly help to illustrate failure phenomena on the microscopic level. As our main contribution, we present a novel 2D approach for the visualization of simulation data, based on the above outlined homogenization technique. Our visualization tool supports visualization on micro scale level as well as on macro scale level. The tool shows both aspects closely arranged in the form of multiple coordinated views to give users the possibility to analyze the particle behavior effectively. A novel type of interactive rose diagrams was developed to represent the dynamic contact networks on the micro scale level in a condensed and efficient way.

Subjects

Algorithms; Colloids/chemistry; Computer Graphics; Imaging, Three-Dimensional/methods; Models, Chemical; Powders/chemistry; User-Computer Interface; Computer Simulation; Particle Size; Soil
