Results 1 - 18 of 18
1.
bioRxiv ; 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38585727

ABSTRACT

Analyzing taxonomic diversity and identification in diverse ecological samples has become a crucial routine in various research and industrial fields. While DNA barcoding marker-gene approaches were once prevalent, the decreasing costs of next-generation sequencing have made metagenomic shotgun sequencing more popular and feasible. In contrast to DNA barcoding, metagenomic shotgun sequencing offers possibilities for in-depth characterization of structural and functional diversity. However, analysis of such data is still considered a hurdle due to the absence of taxa-specific databases. Here we present taxonize-gb, a command-line software tool to extract the portions of the GenBank non-redundant nucleotide and protein databases related to one or more input taxonomy identifiers. Our tool allows the creation of taxa-specific reference databases tailored to specific research questions, which reduces search times and therefore represents a practical solution for researchers analyzing large metagenomic data on a regular basis. Taxonize-gb is an open-source command-line Python-based tool freely available for installation at https://pypi.org/project/taxonize-gb/ and on GitHub https://github.com/msabrysarhan/taxonize_genbank. It is released under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
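
The idea of restricting a reference database to one or more taxa can be illustrated with a minimal sketch. This is not the taxonize-gb implementation or its command-line interface: the `parent_of` and `accession_to_taxid` maps, the function names, and the toy taxids and accessions are all hypothetical stand-ins for a taxonomy tree and an accession-to-taxid table.

```python
from collections import defaultdict


def descendants(parent_of, roots):
    """Return the set of taxids at or below any of the root taxids."""
    children = defaultdict(list)
    for child, parent in parent_of.items():
        if child != parent:  # assumption: the tree root is its own parent
            children[parent].append(child)
    keep, stack = set(), list(roots)
    while stack:
        taxid = stack.pop()
        if taxid not in keep:
            keep.add(taxid)
            stack.extend(children[taxid])
    return keep


def filter_records(records, accession_to_taxid, wanted):
    """Yield (accession, sequence) pairs whose taxid is in `wanted`."""
    for accession, sequence in records:
        if accession_to_taxid.get(accession) in wanted:
            yield accession, sequence


# Toy data (hypothetical taxids and accessions, for illustration only).
parent_of = {1: 1, 10: 1, 11: 10, 12: 10, 20: 1}
accession_to_taxid = {"A1": 11, "A2": 20, "A3": 12}
records = [("A1", "ACGT"), ("A2", "GGCC"), ("A3", "TTAA")]

wanted = descendants(parent_of, [10])  # taxid 10 and everything below it
subset = list(filter_records(records, accession_to_taxid, wanted))
```

Collecting the descendant taxids once and then streaming the records keeps memory use proportional to the taxonomy rather than to the sequence database, which is one plausible reason a taxon-restricted database shortens later searches.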

2.
Hum Mol Genet ; 2024 Apr 20.
Article in English | MEDLINE | ID: mdl-38643062

ABSTRACT

Genotype imputation is widely used in genome-wide association studies (GWAS). However, both the genotyping chips and imputation reference panels are dependent on next-generation sequencing (NGS). Due to the nature of NGS, some regions of the genome are inaccessible to sequencing. To date, there has been no complete evaluation of these regions and their impact on the identification of associations in GWAS remains unclear. In this study, we systematically assess the extent to which variants in inaccessible regions are underrepresented on genotyping chips and imputation reference panels, in GWAS results and in variant databases. We also determine the proportion of genes located in inaccessible regions and compare the results across variant masks defined by the 1000 Genomes Project and the TOPMed program. Overall, fewer variants were observed in inaccessible regions in all categories analyzed. Depending on the mask used and normalized for region size, only 4%-17% of the genotyped variants are located in inaccessible regions and 52 to 581 genes were almost completely inaccessible. From the Cooperative Health Research in South Tyrol (CHRIS) study, we present a case study of an association located in an inaccessible region that is driven by genotyped variants and cannot be reproduced by imputation in GRCh37. We conclude that genotyping, NGS, genotype imputation and downstream analyses such as GWAS and fine mapping are systematically biased in inaccessible regions, due to missed variants and spurious associations. To help researchers assess gene and variant accessibility, we provide an online application (https://gab.gm.eurac.edu).
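
A size-normalized comparison of variant counts inside and outside an accessibility mask could look like the sketch below. The normalization shown (variants per base inside the mask divided by variants per base outside it) is an assumption chosen for illustration and is not necessarily the one used in the study; the function names, mask intervals, and positions are hypothetical.

```python
from bisect import bisect_right


def in_mask(position, starts, ends):
    """True if position lies in one of the sorted, non-overlapping
    half-open intervals [start, end)."""
    i = bisect_right(starts, position) - 1
    return i >= 0 and position < ends[i]


def density_ratio(positions, mask, chrom_length):
    """Variants per base inside the mask divided by variants per base outside."""
    starts = [s for s, _ in mask]
    ends = [e for _, e in mask]
    inside = sum(in_mask(p, starts, ends) for p in positions)
    outside = len(positions) - inside
    masked_bases = sum(e - s for s, e in mask)
    open_bases = chrom_length - masked_bases
    return (inside / masked_bases) / (outside / open_bases)


# Toy example: 200 of 1000 bases are inaccessible, and 1 of 10 variants falls there.
mask = [(100, 200), (500, 600)]  # must be sorted and non-overlapping
positions = [50, 150, 250, 300, 450, 700, 800, 900, 950, 990]
ratio = density_ratio(positions, mask, chrom_length=1000)
```

A ratio below 1 means variants are underrepresented in the masked regions relative to their size.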

3.
NAR Genom Bioinform ; 6(1): lqae015, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38327871

ABSTRACT

Genome-wide association studies (GWAS) are transforming genetic research and enable the detection of novel genotype-phenotype relationships. In the last two decades, over 60 000 genetic associations across thousands of traits have been discovered using a GWAS approach. Due to increasing sample sizes, researchers are increasingly faced with computational challenges. A reproducible, modular and extensible pipeline with a focus on parallelization is essential to simplify data analysis and to allow researchers to devote their time to other essential tasks. Here we present nf-gwas, a Nextflow pipeline to run biobank-scale GWAS analysis. The pipeline automatically performs numerous pre- and post-processing steps, integrates regression modeling from the REGENIE package and supports single-variant, gene-based and interaction testing. It includes extensive reporting functionality that allows users to inspect thousands of phenotypes and navigate interactive Manhattan plots directly in the web browser. The pipeline is tested using the unit-style testing framework nf-test, a crucial requirement in clinical and pharmaceutical settings. Furthermore, we validated the pipeline against published GWAS datasets and benchmarked the pipeline on high-performance computing and cloud infrastructures to provide cost estimations to end users. nf-gwas is a highly parallelized, scalable and well-tested Nextflow pipeline to perform GWAS analysis in a reproducible manner.

4.
Sci Rep ; 14(1): 2083, 2024 01 24.
Article in English | MEDLINE | ID: mdl-38267512

ABSTRACT

Mitochondrial DNA copy number (mtDNA-CN) is a biomarker for mitochondrial dysfunction associated with several diseases. Previous genome-wide association studies (GWAS) have been performed to unravel underlying mechanisms of mtDNA-CN regulation. However, the identified gene regions explain only a small fraction of mtDNA-CN variability. Most of this data has been estimated from microarrays based on various pipelines. In the present study we aimed to (1) identify genetic loci for qPCR-measured mtDNA-CN from three studies (16,130 participants) using GWAS, (2) identify potential systematic differences between our qPCR-derived mtDNA-CN measurements and the published microarray intensity-based estimates, and (3) disentangle the nuclear from mitochondrial regulation of the mtDNA-CN phenotype. We identified two genome-wide significant autosomal loci associated with qPCR-measured mtDNA-CN: at the HBS1L (rs4895440, p = 3.39 × 10^-13) and GSDMA (rs56030650, p = 4.85 × 10^-8) genes. Moreover, 113/115 of the previously published SNPs identified by microarray-based analyses were significantly equivalent with our findings. In our study, the mitochondrial genome itself contributed only marginally to mtDNA-CN regulation as we only detected a single rare mitochondrial variant associated with mtDNA-CN. Furthermore, we incorporated mitochondrial haplogroups into our analyses to explore their potential impact on mtDNA-CN. However, our findings indicate that they do not exert any significant influence on our results.


Subjects
DNA Copy Number Variations , Mitochondrial DNA , Humans , Mitochondrial DNA/genetics , DNA Copy Number Variations/genetics , Genome-Wide Association Study , Mitochondria/genetics , Genetic Loci , Gasdermins
5.
Mol Autism ; 14(1): 22, 2023 06 28.
Article in English | MEDLINE | ID: mdl-37381037

ABSTRACT

BACKGROUND: Autism spectrum disorder (ASD) is a set of highly heterogeneous neurodevelopmental diseases whose genetic etiology is not completely understood. Several investigations have relied on transcriptome analysis from peripheral tissues to dissect ASD into homogenous molecular phenotypes. Recently, analysis of changes in gene expression from postmortem brain tissues has identified sets of genes that are involved in pathways previously associated with ASD etiology. In addition to protein-coding transcripts, the human transcriptome is composed of a large set of non-coding RNAs and transposable elements (TEs). Advancements in sequencing technologies have proven that TEs can be transcribed in a regulated fashion, and their dysregulation might have a role in brain diseases. METHODS: We exploited published datasets comprising RNA-seq data from (1) postmortem brain of ASD subjects, (2) in vitro cell cultures where ten different ASD-relevant genes were knocked out and (3) blood of discordant siblings. We measured the expression levels of evolutionarily young full-length transposable L1 elements and characterized the genomic location of deregulated L1s, assessing their potential impact on the transcription of ASD-relevant genes. We analyzed every sample independently, without pooling the disease subjects, to unmask the heterogeneity of the molecular phenotypes. RESULTS: We detected a strong upregulation of intronic full-length L1s in a subset of postmortem brain samples and in in vitro differentiated neurons from iPSC knocked out for ATRX. L1 upregulation correlated with a high number of deregulated genes and retained introns. In the anterior cingulate cortex of one subject, a small number of significantly upregulated L1s overlapped with ASD-relevant genes that were significantly downregulated, suggesting the possible existence of a negative effect of L1 transcription on host transcripts.
LIMITATIONS: Our analyses must be considered exploratory and will need to be validated in bigger cohorts. The main limitations are the small sample size and the lack of replicates for postmortem brain samples. Measuring the transcription of locus-specific TEs is complicated by the repetitive nature of their sequence, which reduces the accuracy in mapping sequencing reads to the correct genomic locus. CONCLUSIONS: L1 upregulation in ASD appears to be limited to a subset of subjects that are also characterized by a general deregulation of the expression of canonical genes and an increase in intron retention. In some samples from the anterior cingulate cortex, L1 upregulation seems to directly impair the expression of some ASD-relevant genes by a still unknown mechanism. L1 upregulation may therefore identify a group of ASD subjects with common molecular features and help stratify individuals for novel strategies of therapeutic intervention.


Subjects
Autism Spectrum Disorder , Autistic Disorder , Humans , Autistic Disorder/genetics , Autism Spectrum Disorder/genetics , Retroelements/genetics , Brain , Up-Regulation
6.
Transl Psychiatry ; 13(1): 109, 2023 04 03.
Article in English | MEDLINE | ID: mdl-37012247

ABSTRACT

While the genetics of autism spectrum disorders (ASD) has been intensively studied, resulting in the identification of over 100 putative risk genes, the epigenetics of ASD has received less attention, and results have been inconsistent across studies. We aimed to investigate the contribution of DNA methylation (DNAm) to the risk of ASD and identify candidate biomarkers arising from the interaction of epigenetic mechanisms with genotype, gene expression, and cellular proportions. We performed DNAm differential analysis using whole blood samples from 75 discordant sibling pairs of the Italian Autism Network collection and estimated their cellular composition. We studied the correlation between DNAm and gene expression accounting for the potential effects of different genotypes on DNAm. We showed that the proportion of NK cells was significantly reduced in ASD siblings suggesting an imbalance in their immune system. We identified differentially methylated regions (DMRs) involved in neurogenesis and synaptic organization. Among candidate loci for ASD, we detected a DMR mapping to CLEC11A (neighboring SHANK1) where DNAm and gene expression were significantly and negatively correlated, independently from genotype effects. As reported in previous studies, we confirmed the involvement of immune functions in the pathophysiology of ASD. Notwithstanding the complexity of the disorder, suitable biomarkers such as CLEC11A and its neighbor SHANK1 can be discovered using integrative analyses even with peripheral tissues.


Subjects
Autism Spectrum Disorder , Autistic Disorder , Humans , Autistic Disorder/genetics , Siblings , DNA Methylation , Genetic Epigenesis , Autism Spectrum Disorder/genetics , Autism Spectrum Disorder/metabolism , Biomarkers/metabolism , Gene Expression
7.
Int J Mol Sci ; 24(3), 2023 Jan 19.
Article in English | MEDLINE | ID: mdl-36768321

ABSTRACT

Autosomal dominant mutations in the gene encoding α-synuclein (SNCA) were the first to be linked with hereditary Parkinson's disease (PD). Duplication and triplication of SNCA have been observed in PD patients, together with mutations at the N-terminal of the protein, among which A30P and A53T influence the formation of fibrils. By overexpressing human α-synuclein in the neuronal system of Drosophila, we functionally validated the ability of IP3K2, an ortholog of the GWAS-identified risk gene Inositol-trisphosphate 3-kinase B (ITPKB), to modulate α-synuclein toxicity in vivo. ITPKB mRNA and protein levels were also increased in SK-N-SH cells overexpressing wild-type α-synuclein, A53T or A30P mutants. Kinase overexpression was detected in the cytoplasmic and nuclear compartments in all α-synuclein cell types. By quantifying mRNAs in the cortex of PD patients, we observed higher levels of ITPKB mRNA when SNCA was expressed more (p < 0.05), compared to controls. A positive correlation was also observed between SNCA and ITPKB expression in the cortex of patients, which was not seen in the controls. We replicated this observation in a public dataset. Our data, generated in SK-N-SH cells and in cortex from PD patients, show that the expression of α-synuclein and ITPKB is correlated in pathological situations.


Subjects
Parkinson Disease , alpha-Synuclein , Humans , alpha-Synuclein/genetics , alpha-Synuclein/metabolism , Mutation , Neurons/metabolism , Parkinson Disease/genetics , Parkinson Disease/metabolism
8.
Nature ; 614(7946): 125-135, 2023 02.
Article in English | MEDLINE | ID: mdl-36653448

ABSTRACT

The human microbiome is an integral component of the human body and a co-determinant of several health conditions [1,2]. However, the extent to which interpersonal relations shape the individual genetic makeup of the microbiome and its transmission within and across populations remains largely unknown [3,4]. Here, capitalizing on more than 9,700 human metagenomes and computational strain-level profiling, we detected extensive bacterial strain sharing across individuals (more than 10 million instances) with distinct mother-to-infant, intra-household and intra-population transmission patterns. Mother-to-infant gut microbiome transmission was considerable and stable during infancy (around 50% of the same strains among shared species (strain-sharing rate)) and remained detectable at older ages. By contrast, the transmission of the oral microbiome occurred largely horizontally and was enhanced by the duration of cohabitation. There was substantial strain sharing among cohabiting individuals, with 12% and 32% median strain-sharing rates for the gut and oral microbiomes, and time since cohabitation affected strain sharing more than age or genetics did. Bacterial strain sharing additionally recapitulated host population structures better than species-level profiles did. Finally, distinct taxa appeared as efficient spreaders across transmission modes and were associated with different predicted bacterial phenotypes linked with out-of-host survival capabilities. The extent of microorganism transmission that we describe underscores its relevance in human microbiome studies [5], especially those on non-infectious, microbiome-associated diseases.


Subjects
Bacteria , Infectious Disease Transmission , Gastrointestinal Microbiome , Home Environment , Microbiota , Mouth , Female , Humans , Infant , Bacteria/classification , Bacteria/genetics , Bacteria/isolation & purification , Gastrointestinal Microbiome/genetics , Metagenome , Microbiota/genetics , Mothers , Mouth/microbiology , Vertical Infectious Disease Transmission , Family Characteristics , Aging , Time Factors , Microbial Viability
9.
Front Genet ; 12: 745229, 2021.
Article in English | MEDLINE | ID: mdl-34880900

ABSTRACT

CHD8 represents one of the highest confidence genetic risk factors implicated in Autism Spectrum Disorders, with most mutations leading to CHD8 haploinsufficiency and the onset of specific phenotypes, such as macrocephaly, facial dysmorphisms, intellectual disability, and gastrointestinal complaints. While extensive studies have been conducted on the possible consequences of CHD8 suppression and the dysregulation of protein-coding RNAs during neuronal development, the effects of transcriptional changes of long non-coding RNAs (lncRNAs) remain unclear. In this study, we focused on a particular class of natural antisense lncRNAs, SINEUPs, that enhance translation of a target mRNA through the activity of two RNA domains, an embedded transposable element sequence and an antisense region. By looking at dysregulated transcripts following CHD8 knockdown (KD), we first identified RAB11B-AS1 as a potential SINEUP RNA for its domain configuration. Then we demonstrated that this lncRNA is able to increase endogenous RAB11B protein amounts without affecting its transcriptional levels. RAB11B has a pivotal role in vesicular trafficking, and mutations in this gene correlate with intellectual disability and microcephaly. Thus, our study discloses an additional layer of molecular regulation which is altered by CHD8 suppression. This represents the first experimental confirmation that naturally occurring SINEUPs could be involved in ASD pathogenesis and underscores the importance of dysregulation of functional lncRNAs in neurodevelopment.

10.
FASEB Bioadv ; 2(7): 434-448, 2020 Jul.
Article in English | MEDLINE | ID: mdl-32676583

ABSTRACT

Expression of the bHLH transcription protein Atoh7 is a crucial factor conferring competence to retinal progenitor cells for the development of retinal ganglion cells. Several studies have emerged establishing ATOH7 as a retinal disease gene. Remarkably, such studies uncovered ATOH7 variants associated with global eye defects including optic nerve hypoplasia, microphthalmia, retinal vascular disorders, and glaucoma. The complex genetic networks and cellular decisions arising downstream of atoh7 expression, and how their dysregulation causes the development of such disease traits, remain unknown. To begin to understand such Atoh7-dependent events in vivo, we performed transcriptome analysis of wild-type and atoh7 mutant (lakritz) zebrafish embryos at the onset of retinal ganglion cell differentiation. We investigated in silico the interplay of atoh7 with other disease-related genes and pathways. By network reconstruction analysis of differentially expressed genes, we identified gene clusters enriched in retinal development, cell cycle, chromatin remodeling, stress response, and Wnt pathways. By weighted gene coexpression network analysis, we identified coexpression modules affected by the mutation and enriched in retina development genes tightly connected to atoh7. We established the groundwork whereby Atoh7-linked cellular and molecular processes can be investigated in the dynamic multi-tissue environment of the developing normal and diseased vertebrate eye.

11.
Transl Psychiatry ; 10(1): 106, 2020 04 14.
Article in English | MEDLINE | ID: mdl-32291385

ABSTRACT

Notwithstanding several research efforts in the past years, robust and replicable molecular signatures for autism spectrum disorders from peripheral blood remain elusive. The available literature on blood transcriptome in ASD suggests that through accurate experimental design it is possible to extract important information on the disease pathophysiology at the peripheral level. Here we exploit the availability of a resource for molecular biomarkers in ASD, the Italian Autism Network (ITAN) collection, for the investigation of transcriptomic signatures in ASD based on a discordant sibling pair design. Whole blood samples from 75 discordant sibling pairs selected from the ITAN network were submitted to RNASeq analysis and the data were analyzed by complementary approaches. Overall, differences in gene expression between affected and unaffected siblings were small. In order to assess the contribution of differences in the relative proportion of blood cells between discordant siblings, we applied two different cell deconvolution algorithms, showing that the observed molecular signatures mainly reflect changes in peripheral blood immune cell composition, in particular NK cells. The results obtained by the cell deconvolution approach are supported by the analysis performed by WGCNA. Our report describes the largest differential gene expression profiling in peripheral blood of ASD subjects and controls conducted by RNASeq. The observed signatures are consistent with the hypothesis of immune alterations in autism and an increased risk of developing autism in subjects exposed to prenatal infections or stress. Our study also points to a potential role of NMUR1, HMGB3, and PTPRN2 in ASD.


Subjects
Autism Spectrum Disorder , Autistic Disorder , Autism Spectrum Disorder/genetics , Blood Cells , Humans , Siblings , Transcriptome
12.
Transl Psychiatry ; 9(1): 182, 2019 08 02.
Article in English | MEDLINE | ID: mdl-31375659

ABSTRACT

The identification of biomarkers of response might speed drug development and lay the groundwork for assisting clinical practice in psychiatry. In this work, we evaluated a panel of peripheral biomarkers (including IL-6, IL-10, TNF-α, TNFRII, BDNF, CRP, MMP9 and PAI1) in depressed patients receiving paroxetine, venlafaxine, or placebo. Samples were obtained from two randomised placebo-controlled studies evaluating the efficacy and tolerability of a novel drug candidate, using either paroxetine or venlafaxine as active comparators. In both studies, the biomarker candidates were analysed in plasma collected at randomisation and after 10 weeks of treatment with either placebo or active comparator (for a total of 106 and 108 subjects in the paroxetine and venlafaxine study, respectively). Data were obtained using a multiplexed sandwich-ELISA system and were subjected to statistical analysis to assess their correlation with baseline severity and with response outcome. Increases in biomarker levels were correlated with reduction in depression severity for TNF-α, IL-6, IL-10 and CRP. Response to paroxetine treatment correlated with baseline IL-10, IL-6 and TNF-α levels, with the strongest signal being observed in males. In the venlafaxine study, a correlation was observed only between CRP level at randomisation and response, suggesting differences between the two active treatments and the two studies. Our investigations suggest that a combination of pro- and anti-inflammatory cytokines may predict response outcome in patients treated with paroxetine. The potential for IL-10, IL-6 and TNF-α as response biomarkers for a wider range of antidepressants warrants further investigations in clinical trials with other monoamine reuptake inhibitors.


Subjects
C-Reactive Protein/analysis , Major Depressive Disorder/drug therapy , Interleukin-10/blood , Interleukin-6/blood , Paroxetine/therapeutic use , Tumor Necrosis Factor-alpha/blood , Venlafaxine Hydrochloride/therapeutic use , Adult , Biomarkers/blood , Major Depressive Disorder/blood , Female , Humans , Male , Middle Aged , Treatment Outcome , Young Adult
13.
Methods Mol Biol ; 1883: 323-346, 2019.
Article in English | MEDLINE | ID: mdl-30547407

ABSTRACT

Reconstructing a gene regulatory network from one or more sets of omics measurements has been a major task of computational biology in the last 20 years. Despite an overwhelming number of algorithms proposed to solve the network inference problem either in the general scenario or in an ad-hoc tailored situation, assessing the stability of reconstruction is still an uncharted territory and exploratory studies mainly tackled theoretical aspects. We introduce here empirical stability, which is induced by variability of reconstruction as a function of data subsampling. By evaluating differences between networks that are inferred using different subsets of the same data we obtain quantitative indicators of the robustness of the algorithm, of the noise level affecting the data, and, overall, of the reliability of the reconstructed graph. We show that empirical stability can be used whenever no ground truth is available to compute a direct measure of the similarity between the inferred structure and the true network. The main ingredient here is a suite of indicators, called NetSI, providing statistics of distances between graphs generated by a given algorithm fed with different data subsets, where the chosen metric is the Hamming-Ipsen-Mikhailov (HIM) distance evaluating dissimilarity of graph topologies with shared nodes. Operatively, the NetSI family is demonstrated here on synthetic and high-throughput datasets, inferring graphs at different resolution levels (topology, direction, weight), showing how the stability indicators can be effectively used for the quantitative comparison of the stability of different reconstruction algorithms.


Subjects
Algorithms , Computational Biology/methods , Gene Regulatory Networks , Genetic Models , Computational Biology/instrumentation , Datasets as Topic , Gene Expression Profiling/instrumentation , Gene Expression Profiling/methods , Human Genome/genetics , Genomics/instrumentation , Genomics/methods , Humans , Proteomics/instrumentation , Proteomics/methods
14.
BMC Psychiatry ; 18(1): 369, 2018 11 21.
Article in English | MEDLINE | ID: mdl-30463616

ABSTRACT

BACKGROUND: A substantial genetic component accounts for Autism Spectrum Disorders (ASD) aetiology, with some rare and common genetic risk factors recently identified. Large collections of DNAs from thoroughly characterized ASD families are an essential step to confirm genetic risk factors, identify new variants and investigate genotype-phenotype correlations. The Italian Autism Network aimed at constituting a clinical database and a biorepository of samples derived from ASD subjects and first-degree relatives extensively and consistently characterized by child psychiatry centers in Italy. METHODS: The study was approved by the ethical committee of the University of Verona, the coordinating site, and by the local ethical committees of each recruiting site. Certified staff were specifically trained at each site for the overall study conduct, for clinical protocol administration and handling of biological material. A centralized database was developed to collect clinical assessment and medical records from each recruiting site. Children were eligible for recruitment based on the following inclusion criteria: age 4-18 years, at least one parent or legal guardian giving voluntary written consent, meeting DSM-IV criteria for Autistic Disorder or Asperger's Disorder or Pervasive Developmental Disorder NOS. Affected individuals were assessed by full psychiatric, neurological and physical examination, evaluation with ADI-R and ADOS scales, cognitive assessment with Wechsler Intelligence Scale for Children or Preschool and Primary, Leiter International Performance Scale or Griffiths Mental Developmental Scale. Additional evaluations included language assessment, the Krug Asperger's Disorder Index, and instrumental examination such as EEG and structural MRI. DNA, RNA and plasma were collected from eligible individuals and relatives. A central laboratory was established to host the biorepository and to perform DNA and RNA extraction and lymphocyte immortalisation.
DISCUSSION: The study has led to an extensive collection of biological samples associated with standardised clinical assessments from a network of expert clinicians and psychologists. Eighteen sites have received ADI/ADOS training, thirteen of which have been actively recruiting. The clinical database currently includes information on 812 individuals from 249 families, and the biorepository has samples for 98% of the subjects. This effort has generated a highly valuable resource for conducting clinical and genetic research of ASD, amenable to further expansion.


Subjects
Asperger Syndrome , Autism Spectrum Disorder , Biological Specimen Banks/organization & administration , Pervasive Child Development Disorders , Databases as Topic/organization & administration , Adolescent , Asperger Syndrome/blood , Asperger Syndrome/genetics , Autism Spectrum Disorder/blood , Autism Spectrum Disorder/genetics , Biomarkers/blood , Child , Pervasive Child Development Disorders/blood , Pervasive Child Development Disorders/genetics , Preschool Child , Female , Health Resources , Humans , Italy , Male , Medical Records
15.
PLoS One ; 11(3): e0152648, 2016.
Article in English | MEDLINE | ID: mdl-27031641

ABSTRACT

When modeling coexpression networks from high-throughput time course data, the Pearson Correlation Coefficient (PCC) is one of the most effective and popular similarity functions. However, its reliability is limited since it cannot capture non-linear interactions and time shifts. Here we propose to overcome these two issues by employing a novel similarity function, Dynamic Time Warping Maximal Information Coefficient (DTW-MIC), combining a measure that accounts for functional interactions between signals (MIC) and a measure that identifies time lag (DTW). By using the Hamming-Ipsen-Mikhailov (HIM) metric to quantify network differences, the effectiveness of the DTW-MIC approach is demonstrated on four synthetic datasets and one transcriptomic dataset, also in comparison to TimeDelay ARACNE and Transfer Entropy.
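
Why a warping-based measure tolerates time shifts where a pointwise comparison does not can be seen in a minimal sketch of classic dynamic time warping. This covers only the DTW ingredient: MIC is not implemented, and the way the paper combines the two into DTW-MIC is not reproduced. The function name and the toy signals are hypothetical.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping with |x - y| local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible alignments.
            cost[i][j] = step + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


signal = [0, 0, 1, 2, 1, 0, 0, 0]
shifted = [0, 0, 0, 0, 1, 2, 1, 0]  # same pulse, two time steps later

pointwise = sum(abs(x - y) for x, y in zip(signal, shifted))  # 6
warped = dtw_distance(signal, shifted)                         # 0.0
```

The pointwise difference penalizes the delay, while the warped alignment stretches the flat segments and matches the pulse exactly.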


Subjects
Computational Biology/methods , Algorithms , Escherichia coli/genetics , Escherichia coli/metabolism , Gene Regulatory Networks , Humans , Metabolic Networks and Pathways , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , T-Lymphocytes/metabolism
16.
Nat Biotechnol ; 32(9): 926-32, 2014 Sep.
Article in English | MEDLINE | ID: mdl-25150839

ABSTRACT

The concordance of RNA-sequencing (RNA-seq) with microarrays for genome-wide analysis of differential gene expression has not been rigorously assessed using a range of chemical treatment conditions. Here we use a comprehensive study design to generate Illumina RNA-seq and Affymetrix microarray data from the same liver samples of rats exposed in triplicate to varying degrees of perturbation by 27 chemicals representing multiple modes of action (MOAs). The cross-platform concordance in terms of differentially expressed genes (DEGs) or enriched pathways is linearly correlated with treatment effect size (R² ≈ 0.8). Furthermore, the concordance is also affected by transcript abundance and biological complexity of the MOA. RNA-seq outperforms microarray (93% versus 75%) in DEG verification as assessed by quantitative PCR, with the gain mainly due to its improved accuracy for low-abundance transcripts. Nonetheless, classifiers to predict MOAs perform similarly when developed using data from either platform. Therefore, the endpoint studied and its biological complexity, transcript abundance and the genomic application are important factors in transcriptomic research and for clinical and regulatory decision making.
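
One simple way to put a number on the overlap between two DEG lists is a Jaccard index, sketched below. Using the Jaccard index here is an assumption made for illustration; the abstract does not state how concordance was computed, and the gene identifiers are hypothetical.

```python
def concordance(degs_a, degs_b):
    """Jaccard index of two DEG sets: shared genes / all genes called by either."""
    a, b = set(degs_a), set(degs_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical DEG calls for one treatment on each platform.
rnaseq_degs = {"g1", "g2", "g3", "g4"}
microarray_degs = {"g2", "g3", "g4", "g5"}
score = concordance(rnaseq_degs, microarray_degs)  # 3 shared of 5 total
```

Computing such a score per chemical and regressing it on a treatment effect size would be one way to examine the kind of linear relationship the abstract reports.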


Subjects
Oligonucleotide Array Sequence Analysis , Messenger RNA/genetics , RNA Sequence Analysis , Animals , Rats
17.
PLoS One ; 9(2): e89815, 2014.
Article in English | MEDLINE | ID: mdl-24587057

ABSTRACT

The number of available algorithms to infer a biological network from a dataset of high-throughput measurements is overwhelming and keeps growing. However, evaluating their performance is infeasible unless a 'gold standard' is available to measure how close the reconstructed network is to the ground truth. One measure of this is the stability of these predictions to data resampling approaches. We introduce NetSI, a family of Network Stability Indicators, to assess quantitatively the stability of a reconstructed network in terms of inference variability due to data subsampling. In order to evaluate network stability, the main NetSI methods use a global/local network metric in combination with a resampling (bootstrap or cross-validation) procedure. In addition, we provide two normalized variability scores over data resampling to measure edge weight stability and node degree stability, and then introduce a stability ranking for edges and nodes. A complete implementation of the NetSI indicators, including the Hamming-Ipsen-Mikhailov (HIM) network distance adopted in this paper, is available with the R package nettools. We demonstrate the use of the NetSI family by measuring network stability on four datasets against alternative network reconstruction methods. First, the effect of sample size on stability of inferred networks is studied in a gold standard framework on yeast-like data from the Gene Net Weaver simulator. We also consider the impact of varying modularity on a set of structurally different networks (50 nodes, from 2 to 10 modules), and then of complex feature covariance structure, showing the different behaviours of standard reconstruction methods based on Pearson correlation, Maximal Information Coefficient (MIC) and False Discovery Rate (FDR) strategy.
Finally, we demonstrate a strong combined effect of different reconstruction methods and phenotype subgroups on a hepatocellular carcinoma miRNA microarray dataset (240 subjects), and we validate the analysis on a second dataset (166 subjects) with good reproducibility.
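
The resampling idea behind a stability indicator can be sketched as follows: infer a network on many random subsets of the samples and summarize how far apart the resulting networks are. This is not the nettools implementation. It swaps in a plain normalized Hamming distance between edge sets for the HIM distance, uses thresholded Pearson correlation as the inference method, and all function names, the threshold, and the toy data are hypothetical.

```python
import random
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either series is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def infer_edges(data, threshold=0.8):
    """data maps feature name -> values (one per sample); returns an edge set."""
    return {
        (u, v)
        for u, v in combinations(sorted(data), 2)
        if abs(pearson(data[u], data[v])) >= threshold
    }


def hamming(edges_a, edges_b, n_nodes):
    """Normalized Hamming distance: differing edges / possible edges."""
    possible = n_nodes * (n_nodes - 1) // 2
    return len(edges_a ^ edges_b) / possible


def stability(data, n_samples, fraction=0.8, repeats=20, seed=0):
    """Mean pairwise distance between networks inferred on random subsamples."""
    rng = random.Random(seed)
    k = max(2, int(fraction * n_samples))
    nets = []
    for _ in range(repeats):
        idx = rng.sample(range(n_samples), k)
        sub = {f: [vals[i] for i in idx] for f, vals in data.items()}
        nets.append(infer_edges(sub))
    dists = [hamming(a, b, len(data)) for a, b in combinations(nets, 2)]
    return mean(dists)


# Toy data: y is a linear function of x, z alternates independently of both.
data = {
    "x": list(range(10)),
    "y": [2 * v + 1 for v in range(10)],
    "z": [1, -1, 1, -1, 1, -1, 1, -1, 1, -1],
}
score = stability(data, n_samples=10)
```

A score near 0 means the inferred network barely changes across subsamples; larger values indicate that the reconstruction depends heavily on which samples were drawn.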


Subjects
Biological Models , Computer Neural Networks , Algorithms , Hepatocellular Carcinoma/genetics , Gene Regulatory Networks , Humans , Liver Neoplasms/genetics , MicroRNAs/genetics , Yeasts/physiology
18.
Bioinformatics ; 29(3): 407-8, 2013 Feb 01.
Article in English | MEDLINE | ID: mdl-23242262

ABSTRACT

We introduce a novel implementation in ANSI C of the MINE family of algorithms for computing maximal information-based measures of dependence between two variables in large datasets, with the aim of a low memory footprint and ease of integration within bioinformatics pipelines. We provide the libraries minerva (with the R interface) and minepy for Python, MATLAB, Octave and C++. The C solution reduces the large memory requirement of the original Java implementation, has good upscaling properties and offers a native parallelization for the R interface. Low memory requirements are demonstrated on the MINE benchmarks as well as on large (n = 1340) microarray and Illumina GAII RNA-seq transcriptomics datasets. AVAILABILITY AND IMPLEMENTATION: Source code and binaries are freely available for download under GPL3 licence at http://minepy.sourceforge.net for minepy and through the CRAN repository http://cran.r-project.org for the R package minerva. All software is multiplatform (MS Windows, Linux and OSX).


Subjects
Software , Algorithms , Computational Biology , Data Mining , Gene Expression Profiling , Metagenome