ITree: a user-driven tool for interactive decision-making with classification trees.

Sokolowski, Hubert; Czajkowski, Marcin; Czajkowska, Anna; Jurczuk, Krzysztof; Kretowski, Marek.

Bioinformatics ; 40(5), 2024 May 02.

Article in English | MEDLINE | ID: mdl-38640482

ABSTRACT

MOTIVATION: ITree is an intuitive web tool for the manual, semi-automatic, and automatic induction of decision trees. It enables interactive modifications of tree structures and incorporates Relative Expression Analysis for detecting complex patterns in high-throughput molecular data. This makes ITree a versatile tool for both research and education in biomedical data analysis. RESULTS: The tool allows users to instantly see the effects of modifications on decision trees, with updates to predictions and statistics displayed in real time, facilitating a deeper understanding of data classification processes. AVAILABILITY AND IMPLEMENTATION: Available online at https://itree.wi.pb.edu.pl. Source code and documentation are hosted on GitHub at https://github.com/hsokolowski/iTree and are included in the supplementary material.

Subject(s)

Decision Trees , Software , Computational Biology/methods , Algorithms

The FCGR2A Is Associated with the Presence of Atherosclerotic Plaques in the Carotid Arteries: A Case-Control Study.

Szpakowicz, Anna; Szum-Jakubowska, Aleksandra; Lisowska, Anna; Dubatówka, Marlena; Raczkowski, Andrzej; Czajkowski, Marcin; Szczerbinski, Lukasz; Chlabicz, Malgorzata; Kretowski, Adam; Kaminski, Karol Adam.

J Clin Med ; 12(20), 2023 Oct 12.

Article in English | MEDLINE | ID: mdl-37892617

ABSTRACT

BACKGROUND: Atherosclerotic plaques in carotid arteries (APCA) are a prevalent condition with severe potential complications. Studies continuously search for innovative biomarkers for APCA, including those participating in cellular metabolic processes, cell adhesion, immune response, and complement activation. This study aimed to assess the relationship between APCA presence and a broad range of cardiometabolic biomarkers in the general population. METHODS: The study group consisted of consecutive participants of the population study Bialystok PLUS. The proximity extension assay (PEA) technique from the Olink Laboratory (Uppsala, Sweden) was used to measure the levels of 92 cardiometabolic biomarkers. RESULTS: The study comprised 693 participants (mean age 48.78 ± 15.27 years, 43.4% males, N = 301). APCA was identified in 46.2% of the participants (N = 320). Of the 92 biomarkers investigated, 54 were significantly linked to the diagnosis of APCA. After adjusting for the traditional risk factors for atherosclerosis in multivariate analysis, the only biomarker that remained significantly associated with APCA was FCGR2A. CONCLUSION: In the general population, the prevalence of APCA is very high. A range of biomarkers is linked with APCA. Nonetheless, the majority of these associations are explained by traditional risk factors for atherosclerosis. The only biomarker independently associated with APCA was FCGR2A.

A comparison of different machine-learning techniques for the selection of a panel of metabolites allowing early detection of brain tumors.

Godlewski, Adrian; Czajkowski, Marcin; Mojsak, Patrycja; Pienkowski, Tomasz; Gosk, Wioleta; Lyson, Tomasz; Mariak, Zenon; Reszec, Joanna; Kondraciuk, Marcin; Kaminski, Karol; Kretowski, Marek; Moniuszko, Marcin; Kretowski, Adam; Ciborowski, Michal.

Sci Rep ; 13(1): 11044, 2023 07 08.

Article in English | MEDLINE | ID: mdl-37422554

ABSTRACT

Metabolomics combined with machine learning methods (MLMs) is a powerful tool for searching for novel diagnostic panels. This study used targeted plasma metabolomics and advanced MLMs to develop strategies for diagnosing brain tumors. Measurement of 188 metabolites was performed on plasma samples collected from 95 patients with gliomas (grade I-IV), 70 with meningioma, and 71 healthy individuals as a control group. Four predictive models to diagnose glioma were prepared using 10 MLMs and a conventional approach. F1-scores were calculated from the cross-validation results of the created models, and the obtained values were then compared. Subsequently, the best algorithm was applied to perform five comparisons involving gliomas, meningiomas, and controls. The best results were obtained using the newly developed hybrid evolutionary heterogeneous decision tree (EvoHDTree) algorithm, which was validated using Leave-One-Out Cross-Validation, resulting in F1-scores across all comparisons ranging from 0.476 to 0.948 and areas under the ROC curve ranging from 0.660 to 0.873. Brain tumor diagnostic panels were constructed with unique metabolites, which reduces the likelihood of misdiagnosis. This study proposes a novel interdisciplinary method for brain tumor diagnosis based on metabolomics and EvoHDTree, exhibiting significant predictive coefficients.
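The evaluation scheme named in this abstract (Leave-One-Out Cross-Validation scored with F1) can be illustrated with a minimal sketch. The classifier below is a 1-nearest-neighbour stand-in chosen only to keep the example short; it is not EvoHDTree, and the toy values in `xs` and `ys` are invented for illustration rather than taken from the study.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def nearest_neighbor_predict(train_x, train_y, x):
    """Stand-in classifier: label of the closest training sample."""
    def squared_distance(k):
        return sum((a - b) ** 2 for a, b in zip(train_x[k], x))
    return train_y[min(range(len(train_x)), key=squared_distance)]


def leave_one_out_predictions(xs, ys):
    """Predict each sample with a model built from all the other samples."""
    preds = []
    for k in range(len(xs)):
        train_x = xs[:k] + xs[k + 1:]
        train_y = ys[:k] + ys[k + 1:]
        preds.append(nearest_neighbor_predict(train_x, train_y, xs[k]))
    return preds


# Hypothetical one-feature data set (1 = case, 0 = control).
xs = [[0.1], [0.2], [0.3], [0.9], [1.0], [0.35]]
ys = [0, 0, 0, 1, 1, 1]

loo_preds = leave_one_out_predictions(xs, ys)
loo_f1 = f1_score(ys, loo_preds)
```

Because every sample is held out exactly once, the F1-score is computed over one prediction per sample, which suits small cohorts where a separate test set would be too costly.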

Subject(s)

Brain Neoplasms , Glioma , Meningeal Neoplasms , Meningioma , Humans , Brain Neoplasms/diagnosis , Brain Neoplasms/pathology , Glioma/pathology , Brain/metabolism , Meningioma/diagnosis , Meningioma/pathology , Machine Learning

Testing the Utility of Polygenic Risk Scores for Type 2 Diabetes and Obesity in Predicting Metabolic Changes in a Prediabetic Population: An Observational Study.

Padilla-Martinez, Felipe; Szczerbinski, Lukasz; Citko, Anna; Czajkowski, Marcin; Konopka, Paulina; Paszko, Adam; Wawrusiewicz-Kurylonek, Natalia; Górska, Maria; Kretowski, Adam.

Int J Mol Sci ; 23(24), 2022 Dec 16.

Article in English | MEDLINE | ID: mdl-36555722

ABSTRACT

Prediabetes is an intermediate state of hyperglycemia during which glycemic parameters are above normal levels but below the T2D threshold. T2D and its precursor prediabetes affect 6.28% and 7.3% of the world's population, respectively. The main objective of this paper was to create two polygenic risk scores (PRSs) and compare them against changes over time (Δ) in metabolic parameters related to prediabetes and metabolic complications. The genetics of 446 prediabetic patients from the Polish Registry of Diabetes cohort were investigated. Seventeen metabolic parameters were measured and compared at baseline and after five years using statistical analysis. Subsequently, genetic polymorphisms present in patients were determined to build a T2D PRS (68 SNPs) and an obesity PRS (21 SNPs). Finally, the association between the two PRSs and the Δ of the metabolic traits was assessed. After a multiple linear regression with adjustment for age, sex, and BMI at a nominal significance level of p < 0.05 and adjustment for multiple testing, the T2D PRS was found to be positively associated with Δ fat mass (FM) (p = 0.025). The obesity PRS was positively associated with Δ FM (p = 0.023) and Δ 2 h glucose (p = 0.034). The comparison of genotype frequencies showed that AA genotype carriers of rs10838738 had significantly higher Δ 2 h glucose and Δ 2 h insulin. Our findings suggest that prediabetic individuals with a higher risk of developing T2D experience increased Δ FM, and those with a higher risk of obesity experience increased Δ FM and Δ two-hour postprandial glucose. The associations found in this research could be a powerful tool for identifying prediabetic individuals with an increased risk of developing T2D and obesity.
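A PRS is commonly computed as a weighted sum of risk-allele counts over the selected SNPs. The abstract does not state whether the authors weighted their scores, so the sketch below is only a generic illustration: the SNP names, weights, and genotype are hypothetical placeholders, not values from the study.

```python
def polygenic_risk_score(genotype, weights):
    """Weighted sum of risk-allele counts.

    genotype: SNP id -> number of risk alleles carried (0, 1, or 2)
    weights:  SNP id -> per-allele effect size (set all to 1.0 for an
              unweighted allele count)
    SNPs missing from the genotype contribute zero.
    """
    return sum(w * genotype.get(snp, 0) for snp, w in weights.items())


# Hypothetical three-SNP panel; real panels would list every SNP in the score.
example_weights = {"snp_a": 0.10, "snp_b": 0.25, "snp_c": 0.05}
example_genotype = {"snp_a": 2, "snp_b": 1, "snp_c": 0}

weighted_prs = polygenic_risk_score(example_genotype, example_weights)
unweighted_prs = polygenic_risk_score(
    example_genotype, {snp: 1.0 for snp in example_weights}
)
```

The resulting per-person score can then serve as a predictor in a regression against each Δ metabolic trait, with covariates such as age, sex, and BMI.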

Subject(s)

Diabetes Mellitus, Type 2 , Obesity , Prediabetic State , Humans , Body Mass Index , Diabetes Mellitus, Type 2/complications , Diabetes Mellitus, Type 2/genetics , Glucose , Obesity/complications , Obesity/genetics , Prediabetic State/complications , Prediabetic State/genetics , Risk Factors , Multifactorial Inheritance

Evolutionary approach for relative gene expression algorithms.

Czajkowski, Marcin; Kretowski, Marek.

ScientificWorldJournal ; 2014: 593503, 2014.

Article in English | MEDLINE | ID: mdl-24790574

ABSTRACT

Relative Expression Analysis (RXA) uses ordering relationships in a small collection of genes and has been successfully applied to classification using microarray data. As checking all possible subsets of genes is computationally infeasible, RXA algorithms require feature selection and multiple restrictive assumptions. Our main contribution is a specialized evolutionary algorithm (EA) for top-scoring pairs, called EvoTSP, which allows finding more advanced gene relations. We managed to unify the major variants of relative expression algorithms through the EA and to introduce weights to the top-scoring pairs. Experimental validation of EvoTSP on publicly available microarray datasets showed that the proposed solution significantly outperforms other relative expression algorithms in terms of accuracy and allows exploring a much larger solution space.
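For readers unfamiliar with top-scoring pairs, the following minimal sketch shows the classic exhaustive search that scores every gene pair by how differently the ordering "gene i < gene j" occurs in two classes. It is not EvoTSP (there is no evolutionary search and no weighting), and the toy expression matrix is invented for illustration.

```python
from itertools import combinations


def pair_score(samples, labels, i, j):
    """Return (score, p0, p1), where pk = P(x[i] < x[j] | class k)."""
    hits = {0: 0, 1: 0}
    sizes = {0: 0, 1: 0}
    for x, y in zip(samples, labels):
        hits[y] += x[i] < x[j]
        sizes[y] += 1
    p0 = hits[0] / sizes[0]
    p1 = hits[1] / sizes[1]
    return abs(p0 - p1), p0, p1


def top_scoring_pair(samples, labels):
    """Exhaustively pick the gene pair with the largest score."""
    n_genes = len(samples[0])
    return max(
        combinations(range(n_genes), 2),
        key=lambda pair: pair_score(samples, labels, *pair)[0],
    )


def predict(x, pair, samples, labels):
    """Predict the class in which the observed ordering is more frequent."""
    i, j = pair
    _, p0, p1 = pair_score(samples, labels, i, j)
    class_if_less = 0 if p0 > p1 else 1
    return class_if_less if x[i] < x[j] else 1 - class_if_less


# Hypothetical expression values: rows are samples, columns are genes.
toy_samples = [
    [1.0, 2.0, 5.0],
    [0.5, 3.0, 1.0],
    [2.0, 4.0, 3.0],
    [3.0, 1.0, 2.0],
    [5.0, 2.0, 6.0],
    [4.0, 0.5, 1.0],
]
toy_labels = [0, 0, 0, 1, 1, 1]

best_pair = top_scoring_pair(toy_samples, toy_labels)
```

The exhaustive loop grows quadratically with the number of genes for pairs, and far faster for larger gene groups, which is the infeasibility that motivates a heuristic search such as an evolutionary algorithm.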

Subject(s)

Algorithms , Computational Biology/methods , Evolution, Molecular , Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis/methods , Gene Expression Profiling/classification , Gene Expression Profiling/statistics & numerical data , Genetic Fitness , Genetic Variation , Mutation , Oligonucleotide Array Sequence Analysis/classification , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Recombination, Genetic , Selection, Genetic

Multi-test decision tree and its application to microarray data classification.

Czajkowski, Marcin; Grzes, Marek; Kretowski, Marek.

Artif Intell Med ; 61(1): 35-44, 2014 May.

Article in English | MEDLINE | ID: mdl-24630712

ABSTRACT

OBJECTIVE: A desirable property of tools used to investigate biological data is that their models and predictive decisions are easy to understand. Decision trees are particularly promising in this regard due to their comprehensible nature, which resembles the hierarchical process of human decision making. However, existing algorithms for learning decision trees have a tendency to underfit gene expression data. The main aim of this work is to improve the performance and stability of decision trees with only a small increase in their complexity. METHODS: We propose a multi-test decision tree (MTDT); our main contribution is the application of several univariate tests in each non-terminal node of the decision tree. We also search for alternative, lower-ranked features in order to obtain more stable and reliable predictions. RESULTS: Experimental validation was performed on several real-life gene expression datasets. Comparison with eight classifiers shows that MTDT has a statistically significantly higher accuracy than popular decision tree classifiers and is highly competitive with ensemble learning algorithms. The proposed solution outperformed its baseline algorithm on 14 datasets by an average of 6%. A study performed on one of the datasets showed that the discovered genes used in the MTDT classification model are supported by biological evidence in the literature. CONCLUSION: This paper introduces a new type of decision tree which is more suitable for solving biological problems. MTDTs are relatively easy to analyze and much more powerful in modeling high-dimensional microarray data than their popular counterparts.
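The idea of applying several univariate tests in one non-terminal node can be sketched as follows. The abstract does not say how the tests are combined, so this example assumes a simple majority vote; the feature indices, thresholds, and the function name `multi_test_goes_left` are hypothetical.

```python
def multi_test_goes_left(x, tests):
    """Route a sample at a multi-test node.

    tests: list of (feature_index, threshold) univariate tests.
    Assumption: the sample goes left when more than half of the tests
    satisfy x[feature_index] <= threshold (majority vote).
    """
    votes_left = sum(x[feature] <= threshold for feature, threshold in tests)
    return votes_left * 2 > len(tests)


# Hypothetical node with three univariate tests on three genes.
node_tests = [(0, 5.0), (1, 2.0), (2, 0.5)]

goes_left_a = multi_test_goes_left([4.0, 3.0, 0.1], node_tests)  # 2 of 3 votes
goes_left_b = multi_test_goes_left([6.0, 3.0, 0.1], node_tests)  # 1 of 3 votes
```

Under this assumption, a single noisy feature cannot flip the routing on its own, which is one way such a node could yield more stable predictions than a single-test split.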

Subject(s)

Computational Biology/methods , Decision Trees , Gene Expression Profiling , Microarray Analysis , Algorithms , Datasets as Topic , Humans

Top scoring pair decision tree for gene expression data analysis.

Adv Exp Med Biol ; 696: 27-35, 2011.

Article in English | MEDLINE | ID: mdl-21431543

ABSTRACT

Classification of microarray data may be successfully performed with approaches that are easy for human experts to understand and interpret, such as decision trees or Top Scoring Pairs algorithms. In this chapter, we propose a hybrid solution that combines these methods. The presented decision trees, which split instances based on pairwise comparisons of gene expression values, may have considerable potential for genomic research and scientific modeling of underlying processes. We compared the proposed solution with the TSP-family methods and decision trees on 11 public domain microarray datasets, and the results are promising.
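A tree whose internal nodes split on pairwise comparisons of expression values can be represented very compactly. The sketch below shows only prediction with a hand-built tree; it does not reproduce the chapter's induction procedure, and the class labels, gene indices, and the `PairNode` type are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class PairNode:
    """Leaf when `label` is set; otherwise tests expr[gene_a] < expr[gene_b]."""
    label: Optional[str] = None
    gene_a: int = -1
    gene_b: int = -1
    if_less: Optional["PairNode"] = None
    otherwise: Optional["PairNode"] = None


def classify(node: PairNode, expression: Sequence[float]) -> str:
    """Walk from the root to a leaf, following one pairwise test per node."""
    while node.label is None:
        if expression[node.gene_a] < expression[node.gene_b]:
            node = node.if_less
        else:
            node = node.otherwise
    return node.label


# Hypothetical two-level tree over three genes.
pair_tree = PairNode(
    gene_a=0,
    gene_b=1,
    if_less=PairNode(label="control"),
    otherwise=PairNode(
        gene_a=1,
        gene_b=2,
        if_less=PairNode(label="tumor"),
        otherwise=PairNode(label="control"),
    ),
)
```

Because each test depends only on the relative order of two genes, the prediction is unchanged by any monotone rescaling applied to a whole sample, which is a frequently cited advantage of rank-based rules on microarray data.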

Subject(s)

Algorithms , Decision Trees , Gene Expression Profiling/statistics & numerical data , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Breast Neoplasms/genetics , DNA, Neoplasm/genetics , Data Interpretation, Statistical , Databases, Nucleic Acid , Female , Humans
