Search | BVS Regional Portal (test)

The C-SHIFT Algorithm for Normalizing Covariances.

Chunikhina, Evgenia; Logan, Paul; Kovchegov, Yevgeniy; Yambartsev, Anatoly; Mondal, Debashis; Morgun, Andrey.

IEEE/ACM Trans Comput Biol Bioinform ; 20(1): 720-730, 2023.

Article in English | MEDLINE | ID: mdl-35167480

ABSTRACT

Omics technologies are powerful tools for analyzing patterns in gene expression data for thousands of genes. Because of systematic variation between experiments, raw gene expression data are often obscured by undesirable technical noise. Various normalization techniques have been designed to remove these non-biological errors prior to any statistical analysis. One reason for normalizing data is the need to recover the covariance matrix used in gene network analysis. In this paper, we introduce a novel normalization technique called the covariance shift (C-SHIFT) method. This normalization algorithm uses optimization techniques together with the blessing-of-dimensionality philosophy and an energy minimization hypothesis to recover the covariance matrix under additive noise (known in biology as the bias). Thus, it is perfectly suited for the analysis of logarithmic gene expression data. Numerical experiments on synthetic data demonstrate the method's advantage over classical normalization techniques, namely Rank, Quantile, cyclic LOESS (locally estimated scatterplot smoothing), and MAD (median absolute deviation) normalization. We also evaluate the performance of the C-SHIFT algorithm on real biological data.
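The abstract does not spell out the noise model, so the following is only a minimal sketch of why additive noise matters for covariance recovery. It assumes the bias is a single offset per sample added to every gene's log expression; that reading, and the names `cov` and `add_sample_bias`, are illustrative assumptions and not the C-SHIFT algorithm itself. Under this assumption every covariance entry moves by the same kind of term, which is dominated by the variance of the bias.

```python
import random

def cov(u, v):
    """Sample covariance of two equal-length sequences (n - 1 denominator)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (n - 1)

def add_sample_bias(genes, bias):
    """Add the per-sample offset bias[s] to every gene's value in sample s."""
    return [[x + b for x, b in zip(gene, bias)] for gene in genes]

# Synthetic log-expression values: 3 hypothetical genes, 200 samples.
rng = random.Random(0)
n_samples = 200
genes = [[rng.gauss(0, 1) for _ in range(n_samples)] for _ in range(3)]
bias = [rng.gauss(0, 0.5) for _ in range(n_samples)]
observed = add_sample_bias(genes, bias)

# By bilinearity of covariance:
# Cov(x_i + b, x_j + b) = Cov(x_i, x_j) + Cov(x_i, b) + Cov(b, x_j) + Var(b)
shift = cov(observed[0], observed[1]) - cov(genes[0], genes[1])
expected = cov(genes[0], bias) + cov(bias, genes[1]) + cov(bias, bias)
print(shift, expected)
```

When the bias is uncorrelated with the biological signal, the two cross terms are close to zero and the shift is approximately Var(b) for every pair of genes.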

Subjects

Algorithms; Gene Expression Profiling; Gene Expression Profiling/methods

Noninvasive prenatal paternity determination using microhaplotypes: a pilot study.

Wang, Jaqueline Yu Ting; Whittle, Martin R; Puga, Renato David; Yambartsev, Anatoly; Fujita, André; Nakaya, Helder I.

BMC Med Genomics ; 13(1): 157, 2020 10 23.

Article in English | MEDLINE | ID: mdl-33097049

ABSTRACT

BACKGROUND: The use of noninvasive techniques to determine paternity prenatally is increasing because it reduces the risks associated with invasive procedures. Current methods, based on SNPs, analyze at least 148 markers on average. METHODS: To reduce the number of regions, we used microhaplotypes, which are chromosomal segments smaller than 200 bp containing two or more SNPs. Our method employs massively parallel sequencing and analysis of microhaplotypes as genetic markers. We tested 20 microhaplotypes and ascertained that 19 obey Hardy-Weinberg equilibrium and are independent; data from the 1000 Genomes Project were used for population frequencies and simulations. RESULTS: We performed simulations of true and false paternity, using the 1000 Genomes Project data, to confirm whether the microhaplotypes could be used as genetic markers. We observed that at least 13 microhaplotypes should be used to decrease the chance of false positives. We then applied the method to 31 trios, and, excluding inconclusive results, it correctly assigned paternity in cases where the alleged father was the real father. We also cross-evaluated the mother-plasma duos with the alleged fathers for false inclusions within our data, and we observed that the use of at least 15 microhaplotypes in real data also decreases false inclusions. CONCLUSIONS: In this work, we demonstrated that microhaplotypes can be used to determine prenatal paternity by using only 15 regions and with admixtures of DNA.
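The abstract does not say which test was used to check Hardy-Weinberg equilibrium. As a hedged illustration, the sketch below applies the textbook chi-square goodness-of-fit test to a simplified two-allele marker; microhaplotypes usually have more than two alleles, so this is a reduced case, and the function name `hwe_chi_square` and the genotype counts are hypothetical.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg proportions for a two-allele marker.

    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1 - p                         # frequency of allele B
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    observed = [n_aa, n_ab, n_bb]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical genotype counts, not data from the study.
print(hwe_chi_square(30, 40, 30))
```

A small p-value suggests a departure from Hardy-Weinberg proportions; a marker would be retained when the test gives no evidence of departure.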

Subjects

DNA/analysis; Genetic Markers; Haplotypes; Noninvasive Prenatal Testing/methods; Paternity; Polymorphism, Single Nucleotide; DNA/genetics; Female; Genetic Testing; High-Throughput Nucleotide Sequencing; Humans; Male; Pilot Projects; Pregnancy

Unexpected links reflect the noise in networks.

Yambartsev, Anatoly; Perlin, Michael A; Kovchegov, Yevgeniy; Shulzhenko, Natalia; Mine, Karina L; Dong, Xiaoxi; Morgun, Andrey.

Biol Direct ; 11(1): 52, 2016 10 13.

Article in English | MEDLINE | ID: mdl-27737689

ABSTRACT

BACKGROUND: Gene covariation networks are commonly used to study biological processes. The inference of gene covariation networks from observational data can be challenging, especially considering the large number of players involved and the small number of biological replicates available for analysis. RESULTS: We propose a new statistical method for estimating the number of erroneous edges in reconstructed networks that strongly enhances commonly used inference approaches. This method is based on a special relationship between sign of correlation (positive/negative) and directionality (up/down) of gene regulation, and allows for the identification and removal of approximately half of all erroneous edges. Using the mathematical model of Bayesian networks and positive correlation inequalities we establish a mathematical foundation for our method. Analyzing existing biological datasets, we find a strong correlation between the results of our method and false discovery rate (FDR). Furthermore, simulation analysis demonstrates that our method provides a more accurate estimate of network error than FDR. CONCLUSIONS: Thus, our study provides a new robust approach for improving reconstruction of covariation networks. REVIEWERS: This article was reviewed by Eugene Koonin, Sergei Maslov, Daniel Yasumasa Takahashi.
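The abstract describes the sign-direction relationship only in general terms. The sketch below assumes one plausible reading: two genes regulated in the same direction are expected to correlate positively, and genes regulated in opposite directions are expected to correlate negatively, so an edge that breaks this rule is "unexpected". The rule as coded, the function names, and the gene labels are illustrative assumptions, not the authors' implementation.

```python
def is_unexpected(direction_a, direction_b, correlation):
    """Flag an edge whose correlation sign contradicts the regulation directions.

    direction_*: +1 for up-regulated, -1 for down-regulated.
    A correlation of exactly 0 is treated as non-positive.
    """
    same_direction = direction_a == direction_b
    return (correlation > 0) != same_direction

def split_edges(edges, direction):
    """Separate edges into (kept, flagged) lists using is_unexpected."""
    kept, flagged = [], []
    for gene_a, gene_b, r in edges:
        if is_unexpected(direction[gene_a], direction[gene_b], r):
            flagged.append((gene_a, gene_b, r))
        else:
            kept.append((gene_a, gene_b, r))
    return kept, flagged

# Hypothetical genes and correlations, for illustration only.
direction = {"g1": +1, "g2": +1, "g3": -1}
edges = [("g1", "g2", 0.8), ("g1", "g3", -0.7), ("g2", "g3", 0.6)]
kept, flagged = split_edges(edges, direction)

# If flagged edges are roughly half of all erroneous edges, as the abstract
# states, the number of erroneous edges is about twice the flagged count.
estimated_erroneous = 2 * len(flagged)
print(flagged, estimated_erroneous)
```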

Subjects

Computational Biology/methods; Gene Expression Regulation; Gene Regulatory Networks; Bayes Theorem

Differentially correlated genes in co-expression networks control phenotype transitions.

Thomas, Lina D; Vyshenska, Dariia; Shulzhenko, Natalia; Yambartsev, Anatoly; Morgun, Andrey.

F1000Res ; 5: 2740, 2016.

Article in English | MEDLINE | ID: mdl-28163897

ABSTRACT

BACKGROUND: Co-expression networks are a tool widely used for analysis of "Big Data" in biology that can range from transcriptomes to proteomes, metabolomes and, more recently, even microbiomes. Several methods have been proposed to answer biological questions by interrogating these networks. Differential co-expression analysis is a recent approach that measures how gene interactions change when a biological system transitions from one state to another. Although the importance of differentially co-expressed genes for identifying dysregulated pathways has been noted, their role in gene regulation is not well studied. Herein we investigated differentially co-expressed genes in a relatively simple mono-causal process (B lymphocyte deficiency) and in a complex multi-causal system (cervical cancer). METHODS: Co-expression networks of B cell deficiency (Control and BcKO) were reconstructed using the Pearson correlation coefficient for two Mus musculus datasets: B10.A strain (12 normal, 12 BcKO) and BALB/c strain (10 normal, 10 BcKO). Co-expression networks of cervical cancer (normal and cancer) were reconstructed using the local partial correlation method for five datasets (total of 64 normal, 148 cancer). Differentially correlated pairs were identified along with the location of their genes in the BcKO and cancer networks. Minimum Shortest Path and Bi-partite Betweenness Centrality were statistically evaluated for differentially co-expressed genes in the corresponding networks. RESULTS: We show that in B cell deficiency the differentially co-expressed genes are highly enriched with immunoglobulin genes (causal genes). In cancer we found that differentially co-expressed genes act as "bottlenecks" rather than causal drivers, with most flows from the key driver genes to the peripheral genes passing through differentially co-expressed genes. Using in vitro knockdown experiments for two out of 14 differentially co-expressed genes found in cervical cancer (FGFR2 and CACYBP), we showed that they play regulatory roles in cancer cell growth. CONCLUSION: Identifying differentially co-expressed genes in co-expression networks is an important tool in detecting regulatory genes involved in alterations of phenotype.
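The abstract names the Pearson correlation coefficient but does not state how a change in correlation between two states was tested. As a minimal sketch, the code below computes Pearson correlations and compares two of them with the Fisher z-transformation, which is one common choice and an assumption here, not the authors' stated procedure. The correlation values are hypothetical; the group sizes of 12 echo the B10.A dataset described above.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fisher_z_difference(r1, n1, r2, n2):
    """z statistic for the difference between two independent correlations.

    Requires |r| < 1 and n > 3 in both groups.
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)
    standard_error = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (z1 - z2) / standard_error

# Hypothetical correlations of one gene pair in control and knockout samples.
z = fisher_z_difference(0.8, 12, -0.2, 12)
two_sided_p = math.erfc(abs(z) / math.sqrt(2))
print(z, two_sided_p)
```

A gene pair with a large absolute z would be a candidate differentially correlated pair; with thousands of pairs, a multiple-testing correction would also be needed.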

Reverse enGENEering of Regulatory Networks from Big Data: A Roadmap for Biologists.

Dong, Xiaoxi; Yambartsev, Anatoly; Ramsey, Stephen A; Thomas, Lina D; Shulzhenko, Natalia; Morgun, Andrey.

Bioinform Biol Insights ; 9: 61-74, 2015.

Article in English | MEDLINE | ID: mdl-25983554

ABSTRACT

Omics technologies enable unbiased investigation of biological systems through massively parallel sequence acquisition or molecular measurements, bringing the life sciences into the era of Big Data. A central challenge posed by such omics datasets is how to transform these data into biological knowledge, for example, how to use them to answer questions such as: Which functional pathways are involved in cell differentiation? Which genes should we target to stop cancer? Network analysis is a powerful and general approach to this problem. It consists of two fundamental stages: network reconstruction and network interrogation. Here we provide an overview of network analysis, including a step-by-step guide on how to perform and use this approach to investigate a biological question. In this guide, we also list the software packages that we and others employ for each step of a network analysis workflow.

Gene network reconstruction reveals cell cycle and antiviral genes as major drivers of cervical cancer.

Mine, Karina L; Shulzhenko, Natalia; Yambartsev, Anatoly; Rochman, Mark; Sanson, Gerdine F O; Lando, Malin; Varma, Sudhir; Skinner, Jeff; Volfovsky, Natalia; Deng, Tao; Brenna, Sylvia M F; Carvalho, Carmen R N; Ribalta, Julisa C L; Bustin, Michael; Matzinger, Polly; Silva, Ismael D C G; Lyng, Heidi; Gerbase-DeLima, Maria; Morgun, Andrey.

Nat Commun ; 4: 1806, 2013.

Article in English | MEDLINE | ID: mdl-23651994

ABSTRACT

Although human papillomavirus was identified as an aetiological factor in cervical cancer, the key human gene drivers of this disease remain unknown. Here we apply an unbiased approach integrating gene expression and chromosomal aberration data. In an independent group of patients, we reconstruct and validate a gene regulatory meta-network, and identify cell cycle and antiviral genes that constitute two major subnetworks upregulated in tumour samples. These genes are located within the same regions as chromosomal amplifications, most frequently on 3q. We propose a model in which selected chromosomal gains drive activation of antiviral genes contributing to episomal virus elimination, which synergizes with cell cycle dysregulation. These findings may help to explain the paradox of episomal human papillomavirus decline in women with invasive cancer who were previously unable to clear the virus.

Subjects

Antiviral Agents/metabolism; Cell Cycle/genetics; Gene Regulatory Networks/genetics; Genes, Neoplasm/genetics; Papillomaviridae/genetics; Uterine Cervical Neoplasms/genetics; Uterine Cervical Neoplasms/virology; Chromosome Aberrations; Chromosomes, Human/genetics; Databases, Genetic; Female; Gene Expression Profiling; Gene Expression Regulation, Neoplastic; Genome, Human/genetics; Genomic Instability; Humans; Lysosomal Membrane Proteins/metabolism; Meta-Analysis as Topic; Neoplasm Proteins/metabolism; Papillomavirus Infections/genetics; Papillomavirus Infections/virology; Reproducibility of Results; Uterine Cervical Neoplasms/pathology; Virus Integration/genetics

Construct and Compare Gene Coexpression Networks with DAPfinder and DAPview.

Skinner, Jeff; Kotliarov, Yuri; Varma, Sudhir; Mine, Karina L; Yambartsev, Anatoly; Simon, Richard; Huyen, Yentram; Morgun, Andrey.

BMC Bioinformatics ; 12: 286, 2011 Jul 14.

Article in English | MEDLINE | ID: mdl-21756334

ABSTRACT

BACKGROUND: DAPfinder and DAPview are novel BRB-ArrayTools plug-ins to construct gene coexpression networks and identify significant differences in pairwise gene-gene coexpression between two phenotypes. RESULTS: Each significant difference in gene-gene association represents a Differentially Associated Pair (DAP). Our tools include several choices of filtering methods, gene-gene association metrics, statistical testing methods and multiple comparison adjustments. Network results are easily displayed in Cytoscape. Analyses of glioma experiments and microarray simulations demonstrate the utility of these tools. CONCLUSIONS: DAPfinder is a new user-friendly tool for reconstruction and comparison of biological networks.

Subjects

Gene Expression Profiling/methods; Gene Expression Regulation, Neoplastic; Gene Regulatory Networks; Glioma/genetics; Software; Humans; Oligonucleotide Array Sequence Analysis

New approach reveals CD28 and IFNG gene interaction in the susceptibility to cervical cancer.

Guzman, Valeska B; Yambartsev, Anatoly; Goncalves-Primo, Amador; Silva, Ismael D C G; Carvalho, Carmen R N; Ribalta, Julisa C L; Goulart, Luiz Ricardo; Shulzhenko, Natalia; Gerbase-Delima, Maria; Morgun, Andrey.

Hum Mol Genet ; 17(12): 1838-44, 2008 Jun 15.

Article in English | MEDLINE | ID: mdl-18337305

ABSTRACT

Cervical cancer is a complex disease with multiple environmental and genetic determinants. In this study, we sought an association between polymorphisms in immune response genes and cervical cancer using both single-locus and multi-locus analysis approaches. A total of 14 single nucleotide polymorphisms (SNPs) distributed in the CD28, CTLA4, ICOS, PDCD1, FAS, TNFA, IL6, IFNG, TGFB1 and IL10 genes were determined in patients and healthy individuals from three independent case/control sets. The first two sets comprised White individuals (one group with 82 cases and 85 controls, the other with 83 cases and 85 controls) and the third consisted of non-White individuals (64 cases and 75 controls). The multi-locus analysis revealed higher frequencies in cancer patients of three three-genotype combinations [CD28+17(TT)/IFNG+874(AA)/TNFA-308(GG), CD28+17(TT)/IFNG+874(AA)/PDCD1+7785(CT), and CD28+17(TT)/IFNG+874(AA)/ICOS+1564(TT)] (P < 0.01, Monte Carlo simulation). We hypothesized that the two-genotype combination shared by all three [CD28(TT) and IFNG(AA)] could make a major contribution to the observed association. To address this question, we analyzed the frequency of the CD28(TT), IFNG(AA) genotype combination in the three groups combined, and observed its increase in patients (P = 0.0011 by Fisher's exact test). The contribution of a third polymorphism did not reach statistical significance (P = 0.1). Further analysis suggested that gene-gene interaction between CD28 and IFNG might contribute to susceptibility to cervical cancer. Our results showed an epistatic effect between the CD28 and IFNG genes in susceptibility to cervical cancer, a finding that might be relevant for a better understanding of the disease pathogenesis. In addition, the novel analytical approach proposed herein might be useful for increasing the statistical power of future genome-wide multi-locus studies.
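To illustrate how a genotype-combination frequency can be compared between patients and controls, here is a minimal sketch of a one-sided Fisher's exact test built from the hypergeometric distribution. The abstract does not report the underlying counts or whether the test was one- or two-sided, so the 2x2 table below is purely hypothetical and the function name `fisher_exact_greater` is an illustrative choice.

```python
import math

def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows: cases, controls. Columns: carriers, non-carriers.
    Returns P(X >= a) with all margins fixed, i.e. the probability of seeing
    at least this many carriers among cases by chance.
    """
    row1 = a + b          # number of cases
    col1 = a + c          # number of carriers
    n = a + b + c + d     # total individuals
    denominator = math.comb(n, row1)
    p_value = 0.0
    for k in range(a, min(row1, col1) + 1):
        p_value += math.comb(col1, k) * math.comb(n - col1, row1 - k) / denominator
    return p_value

# Hypothetical counts, not taken from the study:
# 3 of 4 cases and 1 of 4 controls carry the genotype combination.
print(fisher_exact_greater(3, 1, 1, 3))
```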

Subjects

CD28 Antigens/genetics; Carcinoma, Squamous Cell/genetics; Epistasis, Genetic; Genetic Predisposition to Disease; Interferon-gamma/genetics; Uterine Cervical Neoplasms/genetics; Brazil; Case-Control Studies; Female; Humans

Selection of control genes for quantitative RT-PCR based on microarray data.

Shulzhenko, Natalia; Yambartsev, Anatoly; Goncalves-Primo, Amador; Gerbase-DeLima, Maria; Morgun, Andrey.

Biochem Biophys Res Commun ; 337(1): 306-12, 2005 Nov 11.

Article in English | MEDLINE | ID: mdl-16182241

ABSTRACT

Use of internal reference gene(s) is necessary for adequate quantification of target gene expression by RT-PCR. Herein, we developed a strategy for control gene selection based on microarray data and illustrated it by analyzing endomyocardial biopsies with acute cardiac rejection and infection. Using order statistics and the binomial distribution, we evaluated the probability of finding low-varying genes by chance. For analysis, the microarray data were divided into two sample subsets. Among the 10% of genes with the lowest standard deviations, we found 14 genes common to both subsets. After normalization using two selected genes, high correlation was observed between expression of target genes evaluated by microarray and RT-PCR, and in an independent dataset by RT-PCR (r = 0.9, p < 0.001). In conclusion, we showed a simple and reliable strategy for selection and validation of control genes for RT-PCR from microarray data that can be easily applied to different experimental designs and tissues.
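The abstract mentions the binomial distribution without giving the calculation. One simple way to frame the chance overlap, offered here only as an assumption, is this: if a gene's rank by standard deviation in one subset were independent of its rank in the other, each gene would fall in the lowest 10% of both subsets with probability 0.1 x 0.1 = 0.01, and the number of such genes would be approximately binomial. The total gene count below is hypothetical because the abstract does not report it.

```python
import math

def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    below = sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k))
    return 1.0 - below

n_genes = 1000              # hypothetical number of genes on the array
p_both = 0.1 * 0.1          # lowest 10% in both subsets under independence
expected_by_chance = n_genes * p_both
p_at_least_14 = binom_sf(14, n_genes, p_both)
print(expected_by_chance, p_at_least_14)
```

The fewer genes the array contains, the less likely it is that 14 shared low-variance genes arise by chance alone, so the conclusion depends strongly on the real gene count.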

Subjects

Gene Expression Profiling; Oligonucleotide Array Sequence Analysis; Reverse Transcriptase Polymerase Chain Reaction/standards; Algorithms; Chagas Disease/genetics; Chagas Disease/metabolism; Data Interpretation, Statistical; Graft Rejection/genetics; Graft Rejection/metabolism; Heart Transplantation; Humans; Myocardium/metabolism; Reference Standards
