1.
Bioinformatics ; 39(9), 2023 Sep 02.
Article in English | MEDLINE | ID: mdl-37672022

ABSTRACT

MOTIVATION: Genome-wide association studies (GWAS) present several computational and statistical challenges for their data analysis, including knowledge discovery, interpretability, and translation to clinical practice. RESULTS: We develop, apply, and comparatively evaluate an automated machine learning (AutoML) approach customized for genomic data that delivers reliable predictive and diagnostic models, the set of genetic variants that are important for predictions (called a biosignature), and an estimate of the out-of-sample predictive power. This AutoML approach discovers variants with higher predictive performance compared to standard GWAS methods, computes an individual risk prediction score, generalizes to new, unseen data, is shown to better differentiate causal variants from other highly correlated variants, and enhances knowledge discovery and interpretability by reporting multiple equivalent biosignatures. AVAILABILITY AND IMPLEMENTATION: Code for this study is available at: https://github.com/mensxmachina/autoML-GWAS. JADBio offers a free version at: https://jadbio.com/sign-up/. SNP data can be downloaded from the EGA repository (https://ega-archive.org/). PRS data are found at: https://www.aicrowd.com/challenges/opensnp-height-prediction. Simulation data to study population structure can be found at: https://easygwas.ethz.ch/data/public/dataset/view/1/.


Subjects
Genome-Wide Association Study , Single Nucleotide Polymorphism , Humans , Phenotype , Computer Simulation , Machine Learning
2.
NPJ Precis Oncol ; 6(1): 38, 2022 Jun 16.
Article in English | MEDLINE | ID: mdl-35710826

ABSTRACT

Fully automated machine learning (AutoML) for predictive modeling is becoming a reality, giving rise to a whole new field. We present the basic ideas and principles of Just Add Data Bio (JADBio), an AutoML platform applicable to the low-sample, high-dimensional omics data that arise in translational medicine and bioinformatics applications. In addition to predictive and diagnostic models ready for clinical use, JADBio focuses on knowledge discovery by performing feature selection and identifying the corresponding biosignatures, i.e., minimal-size subsets of biomarkers that are jointly predictive of the outcome or phenotype of interest. It also returns a palette of useful information for interpretation, clinical use of the models, and decision making. JADBio is qualitatively and quantitatively compared against Hyper-Parameter Optimization Machine Learning libraries. Results show that in typical omics dataset analysis, JADBio manages to identify signatures comprising just a handful of features while maintaining competitive predictive performance and accurate out-of-sample performance estimation.

3.
IEEE/ACM Trans Comput Biol Bioinform ; 19(2): 1214-1224, 2022.
Article in English | MEDLINE | ID: mdl-33035156

ABSTRACT

Feature selection for predictive analytics is the problem of identifying a minimal-size subset of features that is maximally predictive of an outcome of interest. To apply to molecular data, feature selection algorithms need to be scalable to tens of thousands of features. In this paper, we propose γ-OMP, a generalisation of the highly-scalable Orthogonal Matching Pursuit feature selection algorithm. γ-OMP can handle (a) various types of outcomes, such as continuous, binary, nominal, and time-to-event, (b) discrete (categorical) features, (c) different statistical-based stopping criteria, (d) several predictive models (e.g., linear or logistic regression), (e) various types of residuals, and (f) different types of association. We compare γ-OMP against LASSO, a prototypical, widely used algorithm for high-dimensional data. On both simulated data and several real gene expression datasets, γ-OMP is on par with, or outperforms, LASSO in binary classification (case-control data), regression (quantified outcomes), and time-to-event data (censored survival times). γ-OMP is based on simple statistical ideas, is easy to implement and to extend, and, as our extensive evaluation shows, is also effective in bioinformatics analysis settings.


Subjects
Algorithms , Computational Biology , Case-Control Studies , Gene Expression , Logistic Models
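
The abstract of entry 3 describes γ-OMP as a generalisation of Orthogonal Matching Pursuit. As a rough illustration of the underlying greedy idea only, the sketch below implements plain OMP-style forward selection for a continuous outcome with a linear model. It is not the authors' γ-OMP: the function name `omp_select`, the fixed `max_features` budget, and the `tol` threshold are assumptions made here, standing in for the statistical stopping criteria the abstract mentions.

```python
import numpy as np


def omp_select(X, y, max_features=5, tol=1e-8):
    """Greedy OMP-style feature selection (illustrative sketch, not gamma-OMP).

    At each step, pick the feature most correlated with the current
    residual, refit least squares on all selected features, and update
    the residual. Stops at max_features or when no feature is associated
    with the residual beyond tol (a simplified stopping rule).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Center features and outcome so that an intercept is implied.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()

    # Column norms for scaling; constant columns get norm 1 and score 0.
    norms = np.linalg.norm(Xc, axis=0)
    norms[norms == 0] = 1.0

    selected = []
    residual = yc.copy()
    for _ in range(min(max_features, X.shape[1])):
        # Association of every feature with the current residual.
        scores = np.abs(Xc.T @ residual) / norms
        if selected:
            scores[selected] = 0.0  # never pick a feature twice
        best = int(np.argmax(scores))
        if scores[best] <= tol:
            break
        selected.append(best)

        # Refit on all selected features and recompute the residual.
        coef, *_ = np.linalg.lstsq(Xc[:, selected], yc, rcond=None)
        residual = yc - Xc[:, selected] @ coef
    return selected
```

Calling `omp_select(X, y, max_features=k)` returns the column indices in the order they were picked. Swapping the least-squares refit for another model and the residual for a model-appropriate one is the kind of generalisation the abstract attributes to γ-OMP.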
4.
J Steroid Biochem Mol Biol ; 197: 105505, 2020 Mar.
Article in English | MEDLINE | ID: mdl-31669573

ABSTRACT

Vitamin D (VitD) continues to trigger intense scientific controversy, regarding both its biological targets and its supplementation doses and regimens. In an effort to resolve this dispute, we mapped VitD transcriptome-wide events in humans, in order to unveil shared patterns or mechanisms with diverse pathologies/tissue profiles and reveal causal effects between VitD actions and specific human diseases, using a recently developed bioinformatics methodology. Using the similarities in analyzed transcriptome data (c-SKL method), we validated our methodology with osteoporosis as an example and further analyzed two other strong hits, specifically chronic obstructive pulmonary disease (COPD) and asthma. The latter revealed no impact of VitD on known molecular pathways. In accordance with this finding, review and meta-analysis of published data, based on an objective measure (Forced Expiratory Volume at one second, FEV1%), did not further reveal any significant effect of VitD on the objective amelioration of either condition. This study may, therefore, be regarded as the first one to explore, in an objective, unbiased and unsupervised manner, the impact of VitD levels and/or interventions in a number of human pathologies.


Subjects
Asthma/blood , Computational Biology/methods , Chronic Obstructive Pulmonary Disease/blood , Transcriptome , Vitamin D Deficiency/blood , Vitamin D/blood , Vitamins/blood , Asthma/complications , Asthma/genetics , Dietary Supplements , Humans , Chronic Obstructive Pulmonary Disease/complications , Chronic Obstructive Pulmonary Disease/genetics , Vitamin D/genetics , Vitamin D Deficiency/complications , Vitamin D Deficiency/genetics , Vitamins/genetics
5.
NPJ Syst Biol Appl ; 5: 39, 2019.
Article in English | MEDLINE | ID: mdl-31666984

ABSTRACT

Could there be unexpected similarities between different studies, diseases, or treatments, on a molecular level due to common biological mechanisms involved? To answer this question, we develop a method for computing similarities between empirical, statistical distributions of high-dimensional, low-sample datasets, and apply it to hundreds of -omics studies. The similarities lead to dataset-to-dataset networks visualizing the landscape of a large portion of biological data. Potentially interesting similarities connecting studies of different diseases are assembled in a disease-to-disease network. Exploring it, we discover numerous non-trivial connections between Alzheimer's disease and schizophrenia, asthma and psoriasis, or liver cancer and obesity, to name a few. We then present a method that identifies the molecular quantities and pathways that contribute the most to the identified similarities and could point to novel drug targets or provide biological insights. The proposed method acts as a "statistical telescope" providing a global view of the constellation of biological data; readers can peek through it at: http://datascope.csd.uoc.gr:25000/.


Subjects
Computational Biology/methods , Epidemiologic Methods , Algorithms , Data Analysis , Factual Databases , Genetic Databases , Disease/genetics , Epidemiology , Humans , Statistical Models , Systems Analysis
6.
Database (Oxford) ; 2018, 2018 Jan 01.
Article in English | MEDLINE | ID: mdl-29688366

ABSTRACT

The biotechnology revolution generates a plethora of omics data at an exponentially growing pace. Therefore, biological data mining demands automatic, 'high quality' curation efforts to organize biomedical knowledge into online databases. BioDataome is a database of uniformly preprocessed and disease-annotated omics data with the aim to promote and accelerate the reuse of public data. We followed the same preprocessing pipeline for each biological mart (microarray gene expression, RNA-Seq gene expression and DNA methylation) to produce datasets ready for downstream analysis and automatically annotated them with disease-ontology terms. We also designate datasets that share common samples and automatically discover control samples in case-control studies. Currently, BioDataome includes ∼5600 datasets and ∼260 000 samples spanning ∼500 diseases, and can be easily used in large-scale experiments and meta-analysis. All datasets are publicly available for querying and downloading via the BioDataome web application. We demonstrate BioDataome's utility by presenting exploratory data analysis examples. We have also developed the BioDataome R package, available at: https://github.com/mensxmachina/BioDataome/. Database URL: http://dataome.mensxmachina.org/.


Subjects
Data Curation/methods , Genetic Databases , Electronic Data Processing/methods , Gene Expression Profiling , Gene Expression Regulation , Oligonucleotide Array Sequence Analysis , Meta-Analysis as Topic
7.
PLoS One ; 12(8): e0182138, 2017.
Article in English | MEDLINE | ID: mdl-28771511

ABSTRACT

Racial and ethnic differences in drug responses are now well studied and documented. Pharmacogenomics research seeks to unravel the genetic underpinnings of inter-individual variability with the aim of tailor-made theranostics and therapeutics. Taking into account the differential expression of pharmacogenes coding for key metabolic enzymes and transporters that affect drug pharmacokinetics and pharmacodynamics, we advise that data interpretation and analysis need to occur in light of geographical ancestry, if implications for drug development and global health are to be considered. Herein, we exploit ePGA, a web-based electronic Pharmacogenomics Assistant, and publicly available genetic data from the 1000 Genomes Project to explore genotype-to-phenotype associations among the 1000 Genomes Project populations.


Subjects
Human Genome , Metagenomics , Population Groups/genetics , Cytochrome P-450 Enzyme System/genetics , Factual Databases , Gene Frequency , Genetic Association Studies , Genotype , Haplotypes , Humans , Phenotype , User-Computer Interface
8.
PLoS One ; 11(9): e0162801, 2016.
Article in English | MEDLINE | ID: mdl-27631363

ABSTRACT

One of the challenges that arise from the advent of personal genomics services is to efficiently couple individual data with state-of-the-art Pharmacogenomics (PGx) knowledge. Existing services are limited to either providing static views of PGx variants or applying a simplistic match between individual genotypes and existing PGx variants. Moreover, there is a considerable amount of haplotype variation associated with drug metabolism that is currently insufficiently addressed. Here, we present a web-based electronic Pharmacogenomics Assistant (ePGA; http://www.epga.gr/) that provides personalized genotype-to-phenotype translation, linked to state-of-the-art clinical guidelines. ePGA's translation service matches individual genotype-profiles with PGx gene haplotypes and infers the corresponding diplotype and phenotype profiles, accompanied by summary statistics. Additional features include the ability i) to customize translation based on subsets of variants of clinical interest, and ii) to update the knowledge base with novel PGx findings. We demonstrate ePGA's functionality on genetic variation data from the 1000 Genomes Project.


Subjects
Information Systems , Internet , Pharmacogenetics , Theoretical Models
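
The abstract of entry 8 describes matching an individual's genotype profile against gene haplotypes to infer diplotypes and phenotypes. The toy sketch below shows one simple way such a match could work on unphased genotypes. It does not reflect ePGA's actual implementation or data: the haplotype table, the site names, the function labels, and the helper names `infer_diplotypes` and `function_pair` are all hypothetical and chosen purely for illustration.

```python
from itertools import combinations_with_replacement

# Hypothetical haplotype definitions for a made-up gene:
# each haplotype lists the allele it carries at every variant site.
HAPLOTYPES = {
    "*1": {"site_a": "C", "site_b": "G"},
    "*2": {"site_a": "T", "site_b": "G"},
    "*3": {"site_a": "C", "site_b": "A"},
}

# Hypothetical functional label per haplotype (not clinical data).
FUNCTION = {"*1": "normal", "*2": "reduced", "*3": "none"}


def infer_diplotypes(genotype):
    """Return every haplotype pair consistent with an unphased genotype.

    genotype maps each site to the two observed alleles, e.g.
    {"site_a": ("C", "T"), "site_b": ("G", "G")}.
    """
    matches = []
    for h1, h2 in combinations_with_replacement(sorted(HAPLOTYPES), 2):
        # The pair fits if, at every site, its two alleles equal the
        # two observed alleles regardless of order (data are unphased).
        if all(
            sorted((HAPLOTYPES[h1][site], HAPLOTYPES[h2][site]))
            == sorted(alleles)
            for site, alleles in genotype.items()
        ):
            matches.append((h1, h2))
    return matches


def function_pair(diplotype):
    """Summarise a diplotype by the sorted function labels of its haplotypes."""
    return tuple(sorted(FUNCTION[h] for h in diplotype))
```

In a real system the pair of function labels would then be mapped to a phenotype category according to clinical guidelines; that mapping is deliberately left out here because the abstract does not specify it.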
9.
Open Biol ; 4(7), 2014 Jul.
Article in English | MEDLINE | ID: mdl-25030607

ABSTRACT

In the post-genomic era, the rapid evolution of high-throughput genotyping technologies and the increased pace of production of genetic research data are continually prompting the development of appropriate informatics tools, systems and databases as we attempt to cope with the flood of incoming genetic information. Alongside new technologies that serve to enhance data connectivity, emerging information systems should contribute to the creation of a powerful knowledge environment for genotype-to-phenotype information in the context of translational medicine. In the area of pharmacogenomics and personalized medicine, it has become evident that database applications providing important information on the occurrence and consequences of gene variants involved in pharmacokinetics, pharmacodynamics, drug efficacy and drug toxicity will become an integral tool for researchers and medical practitioners alike. At the same time, two fundamental issues are inextricably linked to current developments, namely data sharing and data protection. Here, we discuss high-throughput and next-generation sequencing technology and its impact on pharmacogenomics research. In addition, we present advances and challenges in the field of pharmacogenomics information systems which have in turn triggered the development of an integrated electronic 'pharmacogenomics assistant'. The system is designed to provide personalized drug recommendations based on linked genotype-to-phenotype pharmacogenomics data, as well as to support biomedical researchers in the identification of pharmacogenomics-related gene variants. The provisioned services are tuned in the framework of a single-access pharmacogenomics portal.


Subjects
Genomics/methods , Pharmacogenetics/methods , Genome , High-Throughput Nucleotide Sequencing/methods , Humans , Precision Medicine/methods
10.
Langmuir ; 22(5): 2329-33, 2006 Feb 28.
Article in English | MEDLINE | ID: mdl-16489825

ABSTRACT

The wetting characteristics of surfaces of polymers doped with photochromic spiropyran molecules can be tuned when irradiated with laser beams of properly chosen photon energy. The hydrophilicity is enhanced upon UV laser irradiation since the embedded nonpolar spiropyran molecules convert to their polar merocyanine isomers. The process is reversed upon green laser irradiation. Structuring of the photochromic polymeric surfaces with soft lithography enhances significantly the hydrophobicity of the system, indicating that the water droplets on the patterned features interact with air that is trapped in the microcavities, thus creating superhydrophobic air-water contact areas. Furthermore, the light-induced wettability variations of the structured surfaces are enhanced by a factor of 3 compared to those on the flat surfaces. This significant enhancement is attributed to the photoinduced reversible volume changes to the imprinted gratings, which additionally contribute to the wettability changes due to the light-induced photochromic interconversions.
