1.
Bioinformatics; 40(5), 2024 May 02.
Article in English | MEDLINE | ID: mdl-38597877

ABSTRACT

MOTIVATION: Phylogenetics has moved into the era of genomics, incorporating enormous volumes of data to study questions at both shallow and deep scales. With this increase in information, phylogeneticists need new tools and skills to manipulate and analyze these data. To facilitate these tasks and encourage reproducibility, the community is increasingly moving toward automated workflows. RESULTS: Here we present pipesnake, a phylogenomics pipeline written in Nextflow for the processing, assembly, and phylogenetic estimation of genomic data from short-read sequences. pipesnake is an easy to use and efficient software package designed for this next era in phylogenetics. AVAILABILITY AND IMPLEMENTATION: pipesnake is publicly available on GitHub at https://github.com/AusARG/pipesnake and accompanied by documentation and a wiki/tutorial.


Subjects
Genomics; Phylogeny; Software; Genomics/methods
2.
Brief Bioinform; 22(5), 2021 Sep 02.
Article in English | MEDLINE | ID: mdl-33834181

ABSTRACT

MOTIVATION: The high accuracy of recent haplotype phasing tools is enabling the integration of haplotype (or phase) information more widely in genetic investigations. One such possibility is phase-aware expression quantitative trait loci (eQTL) analysis, where haplotype-based analysis has the potential to detect associations that may otherwise be missed by standard SNP-based approaches. RESULTS: We present eQTLHap, a novel method to investigate associations between gene expression and genetic variants, considering their haplotypic and genotypic effect. Using multiple simulations based on real data, we demonstrate that phase-aware eQTL analysis significantly outperforms typical SNP-based methods when the causal genetic architecture involves multiple SNPs. We show that phase-aware eQTL analysis is robust to phasing errors, showing only a minor impact ($<4\%$) on sensitivity. Applying eQTLHap to real GEUVADIS and GTEx datasets detects numerous novel eQTLs undetected by a single-SNP approach, with 22 eQTLs replicating across studies or tissue types, highlighting the utility of phase-aware eQTL analysis. AVAILABILITY AND IMPLEMENTATION: https://github.com/ziadbkh/eQTLHap. CONTACT: ziad.albkhetan@gmail.com. SUPPLEMENTARY INFORMATION: Supplementary data are available at Briefings in Bioinformatics online.
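
To illustrate why phase information can reveal associations that per-SNP genotypes miss, here is a minimal sketch on toy data. It is not eQTLHap's implementation: the individuals, the expression values, the choice of target haplotype, and the helper names (`snp_dosage`, `haplotype_dosage`, `pearson`) are all hypothetical, and a plain Pearson correlation stands in for a proper association test.

```python
from math import sqrt

# Each individual is a pair of haplotypes over two SNPs (0 = ref, 1 = alt).
# Toy data, invented for illustration only.
individuals = [
    ((1, 1), (0, 0)),  # double heterozygote, alt alleles in cis
    ((1, 0), (0, 1)),  # double heterozygote, alt alleles in trans
    ((1, 1), (1, 1)),
    ((0, 0), (0, 0)),
    ((1, 0), (1, 0)),
    ((0, 1), (0, 0)),
    ((1, 1), (1, 0)),
    ((1, 1), (0, 1)),
]

def snp_dosage(ind, snp):
    """Unphased genotype: number of alt alleles at one SNP."""
    return ind[0][snp] + ind[1][snp]

def haplotype_dosage(ind, target):
    """Phased view: number of copies of the target haplotype carried."""
    return sum(hap == target for hap in ind)

def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

target = (1, 1)  # assumed causal haplotype: both alt alleles on one copy
hap_dose = [haplotype_dosage(ind, target) for ind in individuals]
snp1_dose = [snp_dosage(ind, 0) for ind in individuals]
snp2_dose = [snp_dosage(ind, 1) for ind in individuals]

# Toy expression values built directly from the haplotype dosage.
expression = [5.0 + 2.0 * d for d in hap_dose]

r_hap = pearson(hap_dose, expression)
r_snp1 = pearson(snp1_dose, expression)
r_snp2 = pearson(snp2_dose, expression)
```

The first two individuals have identical genotypes at both SNPs but carry different numbers of the target haplotype, so only the phased encoding separates them.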


Subjects
Computational Biology/methods; Genome-Wide Association Study/methods; Haplotypes; Polymorphism, Single Nucleotide; Quantitative Trait Loci/genetics; Algorithms; Gene Expression Regulation; Genotype; Humans; Internet; Linkage Disequilibrium
3.
Brief Bioinform; 22(4), 2021 Jul 20.
Article in English | MEDLINE | ID: mdl-33236761

ABSTRACT

Haplotype phasing is a critical step for many genetic applications, but incorrect estimates of phase can negatively impact downstream analyses. One proposed strategy to improve phasing accuracy is to combine multiple independent phasing estimates to overcome the limitations of any individual estimate. However, such a strategy is yet to be thoroughly explored. This study provides a comprehensive evaluation of consensus strategies for haplotype phasing. We explore the performance of different consensus paradigms, and the effect of specific constituent tools, across several datasets with different characteristics and their impact on the downstream task of genotype imputation. Based on the outputs of existing phasing tools, we explore two different strategies to construct haplotype consensus estimators: voting across outputs from multiple phasing tools and voting across multiple outputs of a single non-deterministic tool. We find that the consensus approach from multiple tools reduces switch error (SE) by an average of 10% compared to any constituent tool when applied to European populations and has the highest accuracy regardless of population ethnicity, sample size, variant density or variant frequency. Furthermore, the consensus estimator improves the accuracy of the downstream task of genotype imputation carried out by the widely used Minimac3, pbwt and BEAGLE5 tools. Our results provide guidance on how to produce the most accurate phasing estimates and the trade-offs that a consensus approach may have. Our implementation of consensus haplotype phasing, consHap, is available freely at https://github.com/ziadbkh/consHap. Supplementary information: Supplementary data are available at Briefings in Bioinformatics online.
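
The voting idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not consHap's actual implementation: each tool's output is reduced to one haplotype over the heterozygous sites of a single individual (the other haplotype is its complement), votes are taken on the relation between adjacent sites so that the arbitrary haplotype orientation of each tool does not matter, and the function names and toy inputs are hypothetical.

```python
from collections import Counter

def relative_phase(hap):
    """Orientation-free encoding of a phased haplotype over het sites:
    1 where two adjacent sites carry different alleles, else 0."""
    return [a ^ b for a, b in zip(hap, hap[1:])]

def consensus_phase(tool_haps):
    """Majority vote across estimates on each adjacent-site relation,
    then rebuild a haplotype pair from the voted relations."""
    if len(tool_haps) % 2 == 0:
        raise ValueError("use an odd number of estimates to avoid ties")
    relations = [relative_phase(h) for h in tool_haps]
    voted = [Counter(col).most_common(1)[0][0] for col in zip(*relations)]
    hap1 = [0]  # orientation is arbitrary, so start from allele 0
    for r in voted:
        hap1.append(hap1[-1] ^ r)
    hap2 = [1 - a for a in hap1]
    return hap1, hap2

def switch_errors(est, truth):
    """Number of adjacent-site relations that differ from the truth."""
    return sum(e != t for e, t in zip(relative_phase(est), relative_phase(truth)))

# Toy example: three estimates, each with one switch error in a different place.
truth = [0, 1, 1, 0, 1, 0]
tool_a = [0, 1, 1, 0, 0, 1]
tool_b = [1, 0, 1, 0, 1, 0]
tool_c = [0, 1, 1, 0, 1, 1]

consensus, _ = consensus_phase([tool_a, tool_b, tool_c])
print(switch_errors(tool_a, truth), switch_errors(consensus, truth))
```

The same function could be applied to several runs of one non-deterministic tool instead of several tools; the vote only helps when the estimates make their errors at different positions, as in the toy inputs here.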


Subjects
Algorithms; Databases, Nucleic Acid; Polymorphism, Single Nucleotide; Sequence Analysis, DNA; Haplotypes; Humans
4.
BMC Bioinformatics; 20(1): 540, 2019 Oct 30.
Article in English | MEDLINE | ID: mdl-31666002

ABSTRACT

BACKGROUND: Knowledge of phase, the specific allele sequence on each copy of homologous chromosomes, is increasingly recognized as critical for detecting certain classes of disease-associated mutations. One approach for detecting such mutations is through phased haplotype association analysis. While the accuracy of methods for phasing genotype data has been widely explored, there has been little attention given to phasing accuracy at haplotype block scale. Understanding the combined impact of the accuracy of the phasing tool and the method used to determine haplotype blocks on the error rate within the determined blocks is essential to conduct accurate haplotype analyses. RESULTS: We present a systematic study exploring the relationship between seven widely used phasing methods and two common methods for determining haplotype blocks. The evaluation focuses on the number of haplotype blocks that are incorrectly phased. Insights from these results are used to develop a haplotype estimator based on a consensus of three tools. The consensus estimator achieved the most accurate phasing in all applied tests. Individually, EAGLE2, BEAGLE and SHAPEIT2 alternate in being the best performing tool in different scenarios. Determining haplotype blocks based on linkage disequilibrium leads to more correctly phased blocks compared to a sliding window approach. We find that there is little difference between phasing sections of a genome (e.g. a gene) compared to phasing entire chromosomes. Finally, we show that the location of phasing error varies when the tools are applied to the same data several times, a finding that could be important for downstream analyses. CONCLUSIONS: The choice of phasing and block determination algorithms and their interaction impacts the accuracy of phased haplotype blocks. This work provides guidance and evidence for the different design choices needed for analyses using haplotype blocks. The study highlights a number of issues that may have limited the replicability of previous haplotype analyses.
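
As a rough illustration of the block-level evaluation described above, the sketch below counts incorrectly phased blocks for one individual. It rests on assumptions that are not taken from the paper: haplotypes are given over heterozygous sites only, a block counts as correct when the estimate matches the truth or its complement within that block, and the block boundaries are supplied by hand as hypothetical half-open index ranges rather than derived from linkage disequilibrium or a sliding window.

```python
def block_is_correct(est_block, truth_block):
    """A block is correctly phased if it matches the truth directly or
    after swapping the two haplotypes (orientation is arbitrary)."""
    flipped = [1 - a for a in est_block]
    return est_block == truth_block or flipped == truth_block

def count_incorrect_blocks(est, truth, blocks):
    """Count blocks, given as (start, end) half-open ranges over het sites,
    whose estimated phase disagrees with the truth."""
    return sum(not block_is_correct(est[s:e], truth[s:e]) for s, e in blocks)

# Toy haplotypes over eight heterozygous sites (invented data).
truth = [0, 1, 1, 0, 1, 0, 0, 1]
est = [0, 1, 1, 0, 0, 1, 1, 1]

# The same estimate scored under two hypothetical block partitions.
two_blocks = [(0, 4), (4, 8)]
three_blocks = [(0, 3), (3, 6), (6, 8)]

print(count_incorrect_blocks(est, truth, two_blocks))
print(count_incorrect_blocks(est, truth, three_blocks))
```

Scoring one estimate under two partitions shows how the block definition alone can change the number of blocks counted as incorrectly phased.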


Subjects
Haplotypes; Algorithms; Linkage Disequilibrium
5.
Methods; 166: 83-90, 2019 Aug 15.
Article in English | MEDLINE | ID: mdl-30853548

ABSTRACT

We present machine learning models of human genome three-dimensional structure that combine one-dimensional (linear) sequence specificity, epigenomic information, and transcription factor binding profiles with polymer-based biophysical simulations to explain the extensive long-range chromatin looping observed in ChIA-PET experiments for lymphoblastoid cells. Random Forest, Gradient Boosting Machine (GBM), and Deep Learning models were constructed and evaluated for predicting high-resolution interactions within Topologically Associating Domains (TADs). The predicted interactions are consistent with the experimental long-read ChIA-PET interactions mediated by CTCF and RNAPOL2 for the GM12878 cell line. The contribution of sequence information and chromatin state defined by epigenomic features to the prediction task is analyzed and reported, both when they are used separately and when they are combined. Furthermore, we design three-dimensional models of chromatin contact domains (CCDs) using real (ChIA-PET) and predicted looping interactions. Initial results show a similarity between both types of 3D computational models (constructed from experimental or predicted interactions). This observation confirms the association between genome sequence, epigenomic and transcription factor profiles, and three-dimensional interactions.


Subjects
Chromatin/ultrastructure; Computer Simulation; Epigenomics; Machine Learning; Gene Expression Regulation/genetics; Genome, Human; Humans; Polymers/chemistry; Promoter Regions, Genetic/genetics; Protein Binding/genetics
6.
Sci Rep; 8(1): 5217, 2018 Mar 26.
Article in English | MEDLINE | ID: mdl-29581440

ABSTRACT

This study aims to understand, through statistical learning, the basic biophysical mechanisms behind three-dimensional folding of epigenomes. The 3DEpiLoop algorithm uses statistical learning to predict three-dimensional chromatin looping interactions within topologically associating domains (TADs) from one-dimensional epigenomics and transcription factor profiles. The predictions obtained by 3DEpiLoop are highly consistent with the reported experimental interactions. The complex signatures of epigenomic and transcription factors within the physically interacting chromatin regions (anchors) are similar across all genomic scales: genomic domains, chromosomal territories, cell types, and different individuals. We report the most important epigenetic and transcription factor features used for interaction identification, either shared or unique to each of sixteen (16) cell lines. The analysis shows that CTCF interaction anchors are enriched in transcription factors yet deficient in histone modifications, while the opposite is true in the case of RNAP II mediated interactions. The code is available at the repository https://bitbucket.org/4dnucleome/3depiloop.


Subjects
CCCTC-Binding Factor/genetics; Chromatin/genetics; Genome, Human/genetics; RNA Polymerase II/genetics; Animals; Cell Line; Epigenomics; Gene Expression Regulation/genetics; Histone Code/genetics; Humans; Mice; Promoter Regions, Genetic/genetics