Results 1 - 9 of 9
1.
Bioinformatics ; 38(10): 2956-2958, 2022 05 13.
Article in English | MEDLINE | ID: mdl-35561193

ABSTRACT

SUMMARY: This article presents multi-omic integration with sparse value decomposition (MOSS), a free and open-source R package for integration and feature selection in multiple large omics datasets. This package is computationally efficient and offers biological insight through capabilities such as cluster analysis and identification of informative omic features. AVAILABILITY AND IMPLEMENTATION: https://CRAN.R-project.org/package=MOSS. SUPPLEMENTARY INFORMATION: Supplementary information can be found at https://github.com/agugonrey/GonzalezReymundez2021.


Subject(s)
Software , Cluster Analysis
2.
PLoS One ; 15(12): e0243251, 2020.
Article in English | MEDLINE | ID: mdl-33315963

ABSTRACT

Modern genomic data sets often involve multiple data-layers (e.g., DNA-sequence, gene expression), each of which itself can be high-dimensional. The biological processes underlying these data-layers can lead to intricate multivariate association patterns. We propose and evaluate two methods to determine the proportion of variance of an output data set that can be explained by an input data set when both data panels are high dimensional. Our approach uses random-effects models to estimate the proportion of variance of vectors in the linear span of the output set that can be explained by regression on the input set. We consider a method based on an orthogonal basis (Eigen-ANOVA) and one that uses random vectors (Monte Carlo ANOVA, MC-ANOVA) in the linear span of the output set. Using simulations, we show that the MC-ANOVA method gave nearly unbiased estimates. Estimates produced by Eigen-ANOVA were also nearly unbiased, except when the shared variance was very high (e.g., >0.9). We demonstrate the potential insight that can be obtained from the use of MC-ANOVA and Eigen-ANOVA by applying these two methods to the study of multi-locus linkage disequilibrium in chicken (Gallus gallus) genomes and to the assessment of inter-dependencies between gene expression, methylation, and copy-number-variants in data from breast cancer tumors from humans (Homo sapiens). Our analyses reveal that in chicken breeding populations ~50,000 evenly-spaced SNPs are enough to fully capture the span of whole-genome-sequencing genomes. In the study of multi-omic breast cancer data, we found that the span of copy-number-variants can be fully explained using either methylation or gene expression data and that roughly 74% of the variance in gene expression can be predicted from methylation data.
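The Monte Carlo idea in this abstract (draw random vectors in the linear span of the output set and ask how much of each one's variance the input set explains) can be illustrated with a minimal sketch. Note the assumptions: the abstract states that the authors use random-effects models, whereas this sketch swaps in ordinary least squares, which is only reasonable when the input has few columns relative to the number of samples. The function names (`solve`, `r_squared`, `mc_r_squared`) and the toy data are hypothetical and are not taken from the paper.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def r_squared(x_cols, y):
    """R^2 of an OLS regression of y on the (centered) input columns."""
    n = len(y)
    y_mean = sum(y) / n
    yc = [v - y_mean for v in y]
    xc = [[v - sum(col) / n for v in col] for col in x_cols]
    xtx = [[sum(a * b for a, b in zip(ci, cj)) for cj in xc] for ci in xc]
    xty = [sum(a * b for a, b in zip(ci, yc)) for ci in xc]
    beta = solve(xtx, xty)
    fitted = [sum(b * col[i] for b, col in zip(beta, xc)) for i in range(n)]
    return sum(v * v for v in fitted) / sum(v * v for v in yc)


def mc_r_squared(x_cols, y_cols, draws=200, seed=0):
    """Average R^2 over random linear combinations of the output columns.

    Assumption: OLS stands in for the random-effects model used in the paper.
    """
    rng = random.Random(seed)
    n = len(y_cols[0])
    total = 0.0
    for _ in range(draws):
        w = [rng.gauss(0.0, 1.0) for _ in y_cols]  # random vector in span(Y)
        y = [sum(wj * col[i] for wj, col in zip(w, y_cols)) for i in range(n)]
        total += r_squared(x_cols, y)
    return total / draws


# Toy data (hypothetical): both output columns are exact linear
# combinations of the input columns, so the input spans the output.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y_cols = [[a + b for a, b in zip(x1, x2)], [a - b for a, b in zip(x1, x2)]]
shared = mc_r_squared([x1, x2], y_cols)
```

In this toy case the average is 1 up to rounding error, the analogue of an input set that "fully captures the span" of the output set; an input orthogonal to the output would give 0.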


Subject(s)
Genomics/methods , Analysis of Variance , Animals , Breast Neoplasms/genetics , Chickens/genetics , DNA Copy Number Variations , DNA Methylation , Female , Gene Expression Regulation, Neoplastic , Humans , Linkage Disequilibrium , Monte Carlo Method , Polymorphism, Single Nucleotide , Whole Genome Sequencing
3.
Cancer Invest ; 38(8-9): 502-506, 2020 Sep.
Article in English | MEDLINE | ID: mdl-32935594

ABSTRACT

Pancreatic cancer (PC) is associated with a high mortality rate. We explored the interindividual variation in cancer outcomes attributable to DNA methylation, gene expression, and clinical factors among PC patients. We aimed to determine whether we could differentiate subjects with greater nodal involvement, higher cancer staging, and subsequent survival. We modeled every response variable as a function of a linear predictor involving the effects of clinical variables, methylation, and gene expression in a Bayesian framework. Our results highlight the overall importance of wide-spread alterations in methylation and gene expression patterns associated with survival, nodal metastasis, and staging.


Subject(s)
Carcinoma, Pancreatic Ductal/genetics , DNA Methylation , Pancreatic Neoplasms/genetics , Bayes Theorem , Carcinoma, Pancreatic Ductal/mortality , Carcinoma, Pancreatic Ductal/pathology , Humans , Lymph Nodes/pathology , Lymphatic Metastasis , Models, Statistical , Neoplasm Staging , Pancreatic Neoplasms/mortality , Pancreatic Neoplasms/pathology , Survival Analysis , Transcriptome
4.
Sci Rep ; 10(1): 8341, 2020 05 20.
Article in English | MEDLINE | ID: mdl-32433524

ABSTRACT

Despite recent advances in treatment, cancer continues to be one of the most lethal human maladies. One of the challenges of cancer treatment is the diversity among similar tumors that exhibit different clinical outcomes. Most of this variability comes from wide-spread molecular alterations that can be summarized by omic integration. Here, we have identified eight novel tumor groups (C1-8) via omic integration, characterized by unique cancer signatures and clinical characteristics. C3 had the best clinical outcomes, while C2 and C5 had the poorest. C1, C7, and C8 were upregulated for cellular and mitochondrial translation and had relatively low proliferation. C6 and C4 were downregulated for cellular and mitochondrial translation and had high proliferation rates. C4 was represented by copy losses on chromosome 6, and had the highest number of metastatic samples. C8 was characterized by copy losses on chromosome 11, having also the lowest lymphocytic infiltration rate. C6 had the lowest natural killer infiltration rate and was represented by copy gains of genes in chromosome 11. C7 was represented by copy gains on chromosome 6, and had the highest upregulation in mitochondrial translation. We believe that, since molecularly alike tumors could respond similarly to treatment, our results could inform therapeutic action.


Subject(s)
Biomarkers, Tumor/genetics , Cluster Analysis , Gene Expression Regulation, Neoplastic , Genomics/methods , Neoplasms/diagnosis , Cell Proliferation/genetics , DNA Copy Number Variations , DNA Methylation , Datasets as Topic , Down-Regulation , Female , Gene Expression Profiling , Humans , Male , Middle Aged , Neoplasms/genetics , Neoplasms/mortality , Neoplasms/therapy , Up-Regulation , Exome Sequencing
5.
PLoS One ; 15(2): e0228957, 2020.
Article in English | MEDLINE | ID: mdl-32078659

ABSTRACT

Breast cancer is the leading cause of cancer-related disease in women. Cumulative evidence supports a causal role of alcohol intake in breast cancer incidence. In this study, we explore changes in the expression of genes involved in the biological pathways through which alcohol has been hypothesized to impact breast cancer risk, to gain new insights into possible mechanisms affecting the survival of breast cancer patients. Here, we performed differential expression analysis at the individual gene and gene set levels across survival and breast cancer subtype data. Information about postdiagnosis breast cancer survival was obtained from 1977 Caucasian female participants in the Molecular Taxonomy of Breast Cancer International Consortium. The expression of 16 genes that have been linked in the literature to the hypothesized alcohol-breast cancer pathways was examined. We found that the expression of 9 out of the 16 genes under study was associated with cancer survival within the first 4 years of diagnosis. Results from gene set analysis also confirmed a significant differential expression of these genes as a whole. Although alcohol consumption was neither analyzed nor available for this dataset, we believe that further study of these genes could provide important information for clinical recommendations about the potential impact of alcohol drinking on breast cancer survival.


Subject(s)
Alcohol Drinking/genetics , Breast Neoplasms/genetics , Gene Expression Regulation, Neoplastic/genetics , Adult , Aged , Alcohol Drinking/epidemiology , Alcohol Drinking/mortality , Breast/pathology , Breast Neoplasms/diagnosis , Breast Neoplasms/mortality , Ethanol , Female , Humans , Incidence , Middle Aged , Risk Assessment/methods , Risk Factors
6.
G3 (Bethesda) ; 8(11): 3627-3636, 2018 11 06.
Article in English | MEDLINE | ID: mdl-30228192

ABSTRACT

Glioblastoma multiforme (GBM) has been recognized as the most lethal type of malignant brain tumor. Despite efforts of the medical and research community, patients' survival remains extremely low. Multi-omic profiles (including DNA sequence, methylation and gene expression) provide rich information about the tumor. These profiles are likely to reveal processes that may be predictive of patient survival. However, the integration of multi-omic profiles, which are high dimensional and heterogeneous in nature, poses great challenges. The goal of this work was to develop models for prediction of survival of GBM patients that can integrate clinical information and multi-omic profiles, using multi-layered Bayesian regressions. We apply the methodology to data from GBM patients from The Cancer Genome Atlas (TCGA, n = 501) to evaluate whether integrating multi-omic profiles (SNP-genotypes, methylation, copy number variants and gene expression) with clinical information (demographics as well as treatments) leads to an improved ability to predict patient survival. The proposed Bayesian models were used to estimate the proportion of variance explained by clinical covariates and omics and to evaluate prediction accuracy in cross-validation (using the area under the Receiver Operating Characteristic curve, AUC). Among clinical and demographic covariates, age (AUC = 0.664) and the use of temozolomide (AUC = 0.606) were the most predictive of survival. Among omics, methylation (AUC = 0.623) and gene expression (AUC = 0.593) were more predictive than either SNP (AUC = 0.539) or CNV (AUC = 0.547). While there was a clear association between age and methylation, the integration of age, the use of temozolomide, and either gene expression or methylation led to a substantial increase in AUC in cross-validation (AUC = 0.718). Finally, among the genes whose methylation was higher in aging brains, we observed a higher enrichment of these genes being also differentially methylated in cancer.
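As a reading aid for the AUC values quoted in this abstract, the following minimal sketch computes the area under the ROC curve as the fraction of (positive, negative) pairs in which the positive case receives the higher predicted score, with ties counted as one half. The function name `auc` and the toy labels and scores are hypothetical; the sketch does not reproduce the authors' Bayesian models or their cross-validation design.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical data): 1 = event, 0 = no event,
# scores = predictions made for held-out samples.
toy_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 3 of 4 pairs ordered correctly
```

Under this reading, an AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a scale for the reported values between 0.539 and 0.718.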


Subject(s)
Brain Neoplasms/genetics , Glioblastoma/genetics , Aged , Antineoplastic Agents, Alkylating/therapeutic use , Brain Neoplasms/drug therapy , DNA Copy Number Variations , DNA Methylation , Female , Genomics , Glioblastoma/drug therapy , Humans , Male , Middle Aged , Prognosis , Survival Analysis , Temozolomide/therapeutic use
7.
Oncotarget ; 9(96): 36836-36848, 2018 Dec 07.
Article in English | MEDLINE | ID: mdl-30627325

ABSTRACT

BACKGROUND: Lymph node metastasis (NM) in breast cancer is a clinical predictor of patient outcomes, but how its genetic underpinnings contribute to aggressive phenotypes is unclear. Our objective was to create the first landscape analysis of CNV-associated NM in ductal breast cancer. To assess the role of copy number variations (CNVs) in NM, we compared CNVs and/or associated mRNA expression in primary tumors of patients with NM to those without metastasis. RESULTS: We found CNV loss in chromosomes 1, 3, 9, 18, and 19 and gains in chromosomes 5, 8, 12, 14, 16-17, and 20 that were associated with NM and replicated in both databases. In primary tumors, per-gene CNVs associated with NM were ten times more frequent than mRNA expression; however, there were few CNV-driven changes in mRNA expression that differed by nodal status. Overlapping regions of CNV changes and mRNA expression were evident for the CTAGE5 gene. In 8q12, 11q13-14, 20q1, and 17q14-24 regions, there were gene-specific gains in CNV-driven mRNA expression associated with NM. METHODS: Data on CNV and mRNA expression from the TCGA and the METABRIC consortium of breast ductal carcinoma were utilized to identify CNV-based features associated with NM. Within each dataset, associations were compared across omic platforms to identify CNV-driven variations in gene expression. Only replications across both datasets were considered as determinants of NM. CONCLUSIONS: Gains in CTAGE5, NDUFC2, EIF4EBP1, and PSCA genes and their expression may aid in early diagnosis of metastatic breast carcinoma and have potential as therapeutic targets.

8.
Eur J Hum Genet ; 25(5): 538-544, 2017 05.
Article in English | MEDLINE | ID: mdl-28272536

ABSTRACT

Breast cancer (BC) is the second most common type of cancer and a major cause of death for women. Commonly, BC patients are assigned to risk groups based on the combination of prognostic and prediction factors (eg, patient age, tumor size, tumor grade, hormone receptor status, etc). Although this approach is able to identify risk groups with different prognosis, patients are highly heterogeneous in their response to treatments. To improve outcome prediction for BC patients, we extended clinical models (including prognostic and prediction factors with whole-omic data) to integrate omics profiles for gene expression and copy number variants (CNVs). We describe a modeling framework that is able to incorporate clinical risk factors, high-dimensional omics profiles, and interactions between omics and non-omic factors (eg, treatment). We used the proposed modeling framework and data from METABRIC (Molecular Taxonomy of Breast Cancer Consortium) to assess the impact on the accuracy of BC patient survival predictions when omics and omic-by-treatment interactions are considered. Our analysis shows that omics and omic-by-treatment interactions explain a sizable fraction of the variance in survival time that is not explained by commonly used clinical covariates. The sizable interaction effects observed, together with the increase in prediction accuracy, suggest that whole-omic profiles could be used to improve prognosis prediction among BC patients.


Subject(s)
Breast Neoplasms/epidemiology , Computational Biology , Models, Statistical , Pharmacogenetics , Breast Neoplasms/diagnosis , Breast Neoplasms/drug therapy , Breast Neoplasms/genetics , Female , Humans , Prognosis
9.
BMC Genomics ; 17(1): 773, 2016 10 04.
Article in English | MEDLINE | ID: mdl-27716058

ABSTRACT

BACKGROUND: Whole-genome genotyping techniques like Genotyping-by-sequencing (GBS) are being used for genetic studies such as Genome-Wide Association (GWAS) and Genomewide Selection (GS), where different strategies for imputation have been developed. Nevertheless, imputation error may lead to poor performance (i.e. smaller power or higher false positive rate) when complete data is not required as it is for GWAS, and each marker is taken one at a time. The aim of this study was to compare the performance of GWAS analysis for Quantitative Trait Loci (QTL) of major and minor effect using different imputation methods when no reference panel is available in a wheat GBS panel. RESULTS: In this study, we compared the power and false positive rate of dissecting quantitative traits for imputed and not-imputed marker score matrices in: (1) a complete molecular marker barley panel array, and (2) a GBS wheat panel with missing data. We found that there is an ascertainment bias in imputation method comparisons. Simulating over a complete matrix and creating missing data at random showed that imputation methods have a poorer performance. Furthermore, we found that when QTL were simulated with imputed data, the imputation methods performed better than the not-imputed ones. On the other hand, when QTL were simulated with not-imputed data, the not-imputed method and one of the imputation methods performed better for dissecting quantitative traits. Moreover, larger differences between imputation methods were detected for QTL of major effect than QTL of minor effect. We also compared the different marker score matrices for GWAS analysis in a real wheat phenotype dataset, and we found minimal differences, indicating that imputation did not improve the GWAS performance when a reference panel was not available. CONCLUSIONS: Poorer performance was found in GWAS analysis when an imputed marker score matrix was used and no reference panel was available in a wheat GBS panel.
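The simulation design mentioned in this abstract (start from a complete marker matrix, create missing data at random, impute, and compare against the known truth) can be sketched as follows. The abstract does not name the imputation methods that were compared, so the per-marker mode imputation used here is an assumption chosen only for brevity, and all function names and the toy matrix are hypothetical.

```python
import random


def mask_at_random(matrix, rate, seed=0):
    """Return a copy of the matrix with each entry set to None with probability `rate`."""
    rng = random.Random(seed)
    return [[None if rng.random() < rate else v for v in row] for row in matrix]


def impute_column_mode(matrix):
    """Fill missing entries with the most frequent observed value of that marker (column).

    Assumption: mode imputation is a stand-in, not a method named in the abstract.
    """
    out = [row[:] for row in matrix]
    for j in range(len(matrix[0])):
        observed = [row[j] for row in matrix if row[j] is not None]
        fill = max(set(observed), key=observed.count) if observed else None
        for row in out:
            if row[j] is None:
                row[j] = fill
    return out


def imputation_error(truth, masked, imputed):
    """Fraction of masked entries whose imputed value differs from the truth."""
    wrong = total = 0
    for t_row, m_row, i_row in zip(truth, masked, imputed):
        for t, m, i in zip(t_row, m_row, i_row):
            if m is None:
                total += 1
                wrong += int(i != t)
    return wrong / total if total else 0.0


# Toy marker matrix (hypothetical): rows are lines, columns are markers.
truth = [[0, 1], [0, 1], [0, 1], [2, 1]]
masked = [[0, 1], [0, None], [0, 1], [None, 1]]  # masked by hand for a fixed example
imputed = impute_column_mode(masked)
error = imputation_error(truth, masked, imputed)  # one of two masked entries is wrong
```

In a fuller simulation, `mask_at_random` would generate the missing entries, and the error (or the downstream GWAS power and false positive rate) would be averaged over many random masks.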


Subject(s)
Genome, Plant , Genomics , Triticum/genetics , Genome-Wide Association Study , Genomics/methods , High-Throughput Nucleotide Sequencing , Inheritance Patterns , Phenotype , Quantitative Trait Loci , Reproducibility of Results