Results 1 - 20 of 48
1.
BMC Genomics ; 24(1): 349, 2023 Jun 26.
Article in English | MEDLINE | ID: mdl-37365517

ABSTRACT

T cell receptor repertoires can be profiled using next generation sequencing (NGS) to measure and monitor adaptive dynamical changes in response to disease and other perturbations. Genomic DNA-based bulk sequencing is cost-effective but necessitates multiplex target amplification using multiple primer pairs with highly variable amplification efficiencies. Here, we utilize an equimolar primer mixture and propose a single statistical normalization step that efficiently corrects for amplification bias post sequencing. Using samples analyzed by both our open protocol and a commercial solution, we show high concordance between bulk clonality metrics. This approach is an inexpensive and open-source alternative to commercial solutions.


Subjects
High-Throughput Nucleotide Sequencing , T-Lymphocytes , Base Sequence , Chromosome Mapping , High-Throughput Nucleotide Sequencing/methods , Receptors, Antigen, T-Cell, alpha-beta/genetics
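The abstract above does not spell out its normalization step or its clonality metrics, so the following is only a minimal sketch under stated assumptions. It assumes a per-primer relative amplification efficiency has already been estimated (the `efficiency` mapping is hypothetical), and it uses one commonly used clonality definition, 1 minus the normalized Shannon entropy, which may differ from the metric the authors report.

```python
import math


def normalize_counts(counts, efficiency):
    """Correct read counts for primer-specific amplification bias.

    counts:     {(clonotype, primer): reads}  (hypothetical input layout)
    efficiency: {primer: relative amplification efficiency}  (assumed known)
    """
    return {
        clonotype: reads / efficiency[primer]
        for (clonotype, primer), reads in counts.items()
    }


def clonality(abundances):
    """Return 1 - normalized Shannon entropy.

    0 means a perfectly even repertoire; 1 means a single dominant clone.
    """
    values = [v for v in abundances if v > 0]
    if len(values) < 2:
        return 1.0
    total = sum(values)
    entropy = -sum((v / total) * math.log(v / total) for v in values)
    return 1.0 - entropy / math.log(len(values))


raw = {("clone_a", "V1"): 100, ("clone_b", "V2"): 50}
corrected = normalize_counts(raw, {"V1": 2.0, "V2": 1.0})
score = clonality(corrected.values())
```

In this toy input the primer `V1` is assumed to amplify twice as efficiently as `V2`, so the two clones become equally abundant once the counts are corrected.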
2.
Res Sq ; 2023 Feb 16.
Article in English | MEDLINE | ID: mdl-36824803

ABSTRACT

T cell receptor repertoires can be profiled using next generation sequencing (NGS) to measure and monitor adaptive dynamical changes in response to disease and other perturbations. Genomic DNA-based bulk sequencing is cost-effective but necessitates multiplex target amplification using multiple primer pairs with highly variable amplification efficiencies. Here, we utilize an equimolar primer mixture and propose a single statistical normalization step that efficiently corrects for amplification bias post sequencing. Using samples analyzed by both our open protocol and a commercial solution, we show high concordance between bulk clonality metrics. This approach is an inexpensive and open-source alternative to commercial solutions.

3.
Bioinformatics ; 37(13): 1912-1914, 2021 07 27.
Article in English | MEDLINE | ID: mdl-33051644

ABSTRACT

MOTIVATION: Despite widespread prevalence of somatic structural variations (SVs) across most tumor types, understanding of their molecular implications often remains poor. SVs are extremely heterogeneous in size and complexity, hindering the interpretation of their pathogenic role. Tools integrating large SV datasets across platforms are required to fully characterize the cancer's somatic landscape. RESULTS: The svpluscnv R package is a Swiss Army knife for the integration and interpretation of orthogonal datasets, including copy number variant segmentation profiles and sequencing-based structural variant calls. The package implements analysis and visualization tools to evaluate chromosomal instability and ploidy, identify genes harboring recurrent SVs, and detect complex rearrangements such as chromothripsis and chromoplexy. Further, it allows systematic identification of hot-spot shattered genomic regions, showing reproducibility across alternative detection methods and datasets. AVAILABILITY AND IMPLEMENTATION: https://github.com/ccbiolab/svpluscnv. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Genome , Genomics , DNA Copy Number Variations , Genomic Structural Variation , Humans , Reproducibility of Results , Sequence Analysis , Software
4.
Nat Genet ; 52(4): 448-457, 2020 04.
Article in English | MEDLINE | ID: mdl-32246132

ABSTRACT

Precision oncology relies on accurate discovery and interpretation of genomic variants, enabling individualized diagnosis, prognosis and therapy selection. We found that six prominent somatic cancer variant knowledgebases were highly disparate in content, structure and supporting primary literature, impeding consensus when evaluating variants and their relevance in a clinical setting. We developed a framework for harmonizing variant interpretations to produce a meta-knowledgebase of 12,856 aggregate interpretations. We demonstrated large gains in overlap between resources across variants, diseases and drugs as a result of this harmonization. We subsequently demonstrated improved matching between a patient cohort and harmonized interpretations of potential clinical significance, observing an increase from an average of 33% per individual knowledgebase to 57% in aggregate. Our analyses illuminate the need for open, interoperable sharing of variant interpretation data. We also provide a freely available web interface (search.cancervariants.org) for exploring the harmonized interpretations from these six knowledgebases.


Subjects
Genetic Variation/genetics , Neoplasms/genetics , Databases, Genetic , Diploidy , Genomics/methods , Humans , Knowledge Bases , Precision Medicine/methods
6.
Genome Biol ; 19(1): 188, 2018 11 06.
Article in English | MEDLINE | ID: mdl-30400818

ABSTRACT

BACKGROUND: The phenotypes of cancer cells are driven in part by somatic structural variants. Structural variants can initiate tumors, enhance their aggressiveness, and provide unique therapeutic opportunities. Whole-genome sequencing of tumors can allow exhaustive identification of the specific structural variants present in an individual cancer, facilitating both clinical diagnostics and the discovery of novel mutagenic mechanisms. A plethora of somatic structural variant detection algorithms have been created to enable these discoveries; however, there are no systematic benchmarks of them. Rigorous performance evaluation of somatic structural variant detection methods has been challenged by the lack of gold standards, extensive resource requirements, and difficulties arising from the need to share personal genomic information. RESULTS: To facilitate structural variant detection algorithm evaluations, we create a robust simulation framework for somatic structural variants by extending the BAMSurgeon algorithm. We then organize and enable a crowdsourced benchmarking within the ICGC-TCGA DREAM Somatic Mutation Calling Challenge (SMC-DNA). We report here the results of structural variant benchmarking on three different tumors, comprising 204 submissions from 15 teams. In addition to ranking methods, we identify characteristic error profiles of individual algorithms and general trends across them. Surprisingly, we find that ensembles of analysis pipelines do not always outperform the best individual method, indicating a need for new ways to aggregate somatic structural variant detection approaches. CONCLUSIONS: The synthetic tumors and somatic structural variant detection leaderboards remain available as a community benchmarking resource, and BAMSurgeon is available at https://github.com/adamewing/bamsurgeon .


Subjects
Benchmarking , Computer Simulation , Crowdsourcing , Genetic Variation , Genome, Human , Genomics/methods , Neoplasms/genetics , Algorithms , Databases, Genetic , High-Throughput Nucleotide Sequencing , Humans , Software
7.
Article in English | MEDLINE | ID: mdl-30283195

ABSTRACT

Multiplexed imaging such as multicolor immunofluorescence staining, multiplexed immunohistochemistry (mIHC) or cyclic immunofluorescence (cycIF) enables deep assessment of cellular complexity in situ and, in conjunction with standard histology stains like hematoxylin and eosin (H&E), can help to unravel the complex molecular relationships and spatial interdependencies that undergird disease states. However, these multiplexed imaging methods are costly and can degrade both tissue quality and antigenicity with each successive cycle of staining. In addition, computationally intensive image processing such as image registration across multiple channels is required. We have developed a novel method, speedy histopathological-to-immunofluorescent translation (SHIFT) of whole slide images (WSIs) using conditional generative adversarial networks (cGANs). This approach is rooted in the assumption that specific patterns captured in IF images by stains like DAPI, pan-cytokeratin (panCK), or α-smooth muscle actin (α-SMA) are encoded in H&E images, such that a SHIFT model can learn useful feature representations or architectural patterns in the H&E stain that help generate relevant IF stain patterns. We demonstrate that the proposed method is capable of generating realistic tumor marker IF WSIs conditioned on corresponding H&E-stained WSIs with up to 94.5% accuracy in a matter of seconds. Thus, this method has the potential to not only improve our understanding of the mapping of histological and morphological profiles into protein expression profiles, but also greatly increase the efficiency of diagnostic and prognostic decision-making.

8.
Cancer Cell ; 34(4): 561-578.e6, 2018 10 08.
Article in English | MEDLINE | ID: mdl-30300579

ABSTRACT

Complement is a critical component of humoral immunity implicated in cancer development; however, its biological contributions to tumorigenesis remain poorly understood. Using the K14-HPV16 transgenic mouse model of squamous carcinogenesis, we report that urokinase (uPA)+ macrophages regulate C3-independent release of C5a during premalignant progression, which in turn regulates protumorigenic properties of C5aR1+ mast cells and macrophages, including suppression of CD8+ T cell cytotoxicity. Therapeutic inhibition of C5aR1 via the peptide antagonist PMX-53 improved efficacy of paclitaxel chemotherapy associated with increased presence and cytotoxic properties of CXCR3+ effector memory CD8+ T cells in carcinomas, dependent on both macrophage transcriptional programming and IFNγ. Together, these data identify C5aR1-dependent signaling as an important immunomodulatory program in neoplastic tissue tractable for combinatorial cancer immunotherapy.


Subjects
Carcinogenesis/drug effects , Complement C5a/drug effects , Drug Therapy , Receptor, Anaphylatoxin C5a/drug effects , Animals , CD8-Positive T-Lymphocytes/drug effects , Carcinoma, Squamous Cell/drug therapy , Disease Models, Animal , Drug Therapy/methods , Humans , Macrophages/drug effects , Macrophages/physiology , Mice , Signal Transduction/drug effects
9.
BMC Bioinformatics ; 19(1): 339, 2018 Sep 25.
Article in English | MEDLINE | ID: mdl-30253747

ABSTRACT

BACKGROUND: Platform-specific error profiles necessitate confirmatory studies where predictions made on data generated using one technology are additionally verified by processing the same samples on an orthogonal technology. However, verifying all predictions can be costly and redundant, and testing a subset of findings is often used to estimate the true error profile. RESULTS: To determine how to create subsets of predictions for validation that maximize accuracy of global error profile inference, we developed Valection, a software program that implements multiple strategies for the selection of verification candidates. We evaluated these selection strategies on one simulated and two experimental datasets. CONCLUSIONS: Valection is implemented in multiple programming languages, available at: http://labs.oicr.on.ca/boutros-lab/software/valection.


Subjects
Sequence Analysis, DNA/methods , Software Validation
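To make the idea of "selection strategies" for verification candidates concrete, here is a minimal sketch of two generic strategies: uniform random sampling over all predicted calls, and an equal budget per caller. It is an illustration under assumptions, not Valection's implementation or API; the function names and the string encoding of calls are hypothetical.

```python
import random


def select_random(calls_by_caller, budget, seed=0):
    """Pick `budget` calls uniformly at random from the union of all callers."""
    rng = random.Random(seed)
    pool = sorted({call for calls in calls_by_caller.values() for call in calls})
    return set(rng.sample(pool, min(budget, len(pool))))


def select_equal_per_caller(calls_by_caller, budget, seed=0):
    """Split the budget evenly so each caller contributes the same number of calls."""
    rng = random.Random(seed)
    per_caller = budget // len(calls_by_caller)
    chosen = set()
    for caller in sorted(calls_by_caller):
        # Skip calls already chosen for another caller to avoid duplicates.
        candidates = sorted(set(calls_by_caller[caller]) - chosen)
        chosen.update(rng.sample(candidates, min(per_caller, len(candidates))))
    return chosen


calls = {
    "caller_a": ["chr1:100:A>T", "chr1:200:C>G", "chr2:50:G>A", "chr3:10:T>C", "chr4:7:A>G"],
    "caller_b": ["chr4:7:A>G", "chr5:99:C>T"],
}
random_subset = select_random(calls, budget=3)
balanced_subset = select_equal_per_caller(calls, budget=4)
```

Random sampling tends to favor callers that report many variants, whereas the per-caller split guarantees that small call sets are also represented in the verification subset; which choice gives the better error-profile estimate is exactly the question the abstract addresses.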
10.
Proc Natl Acad Sci U S A ; 115(21): 5462-5467, 2018 05 22.
Article in English | MEDLINE | ID: mdl-29735700

ABSTRACT

The Fbw7 (F-box/WD repeat-containing protein 7) ubiquitin ligase targets multiple oncoproteins for degradation and is commonly mutated in cancers. Like other pleiotropic tumor suppressors, Fbw7's complex biology has impeded our understanding of how Fbw7 mutations promote tumorigenesis and hindered the development of targeted therapies. To address these needs, we employed a transfer learning approach to derive gene-expression signatures from The Cancer Genome Atlas datasets that predict Fbw7 mutational status across tumor types and identified the pathways enriched within these signatures. Genes involved in mitochondrial function were highly enriched in pan-cancer signatures that predict Fbw7 mutations. Studies in isogenic colorectal cancer cell lines that differed in Fbw7 mutational status confirmed that Fbw7 mutations increase mitochondrial gene expression. Surprisingly, Fbw7 mutations shifted cellular metabolism toward oxidative phosphorylation and caused context-specific metabolic vulnerabilities. Our approach revealed unexpected metabolic reprogramming and possible therapeutic targets in Fbw7-mutant cancers and provides a framework to study other complex, oncogenic mutations.


Subjects
Colorectal Neoplasms/metabolism , Colorectal Neoplasms/pathology , F-Box-WD Repeat-Containing Protein 7/genetics , F-Box-WD Repeat-Containing Protein 7/metabolism , Metabolome , Mitochondria/metabolism , Mutation , Cell Respiration , Colorectal Neoplasms/genetics , Gene Expression Profiling , Humans , Mitochondria/pathology , Oxidative Phosphorylation , Oxidative Stress , Phosphorylation , Ubiquitin , Ubiquitination
11.
Clin Cancer Res ; 24(12): 2828-2843, 2018 06 15.
Article in English | MEDLINE | ID: mdl-29599409

ABSTRACT

Purpose: Head and neck squamous cell carcinoma (HNSCC) is the sixth most common cancer worldwide, with high mortality and a lack of targeted therapies. To identify and prioritize druggable targets, we performed genome analysis together with genome-scale siRNA and oncology drug profiling using low-passage tumor cells derived from a patient with treatment-resistant HPV-negative HNSCC. Experimental Design: A tumor cell culture was established and subjected to whole-exome sequencing, RNA sequencing, comparative genome hybridization, and high-throughput phenotyping with a siRNA library covering the druggable genome and an oncology drug library. Secondary screens of candidate target genes were performed on the primary tumor cells and two nontumorigenic keratinocyte cell cultures for validation and to assess cancer specificity. siRNA screens of the kinome on two isogenic pairs of p53-mutated HNSCC cell lines were used to determine generalizability. Clinical utility was addressed by performing drug screens on two additional HNSCC cell cultures derived from patients enrolled in a clinical trial. Results: Many of the identified copy number aberrations and somatic mutations in the primary tumor were typical of HPV(-) HNSCC, but none pointed to obvious therapeutic choices. In contrast, siRNA profiling identified 391 candidate target genes, 35 of which were preferentially lethal to cancer cells, most of which were not genomically altered. Chemotherapies and targeted agents with strong tumor-specific activities corroborated the siRNA profiling results and included drugs that targeted the mitotic spindle, the proteasome, and G2-M kinases WEE1 and CHK1. We also show the feasibility of ex vivo drug profiling for patients enrolled in a clinical trial. Conclusions: High-throughput phenotyping with siRNA and drug libraries using patient-derived tumor cells prioritizes mutated driver genes and identifies novel drug targets not revealed by genomic profiling. Functional profiling is a promising adjunct to DNA sequencing for precision oncology. Clin Cancer Res; 24(12); 2828-43. ©2018 AACR.


Subjects
Biomarkers, Tumor , Head and Neck Neoplasms/drug therapy , Molecular Targeted Therapy , Precision Medicine , Antineoplastic Agents/pharmacology , Antineoplastic Agents/therapeutic use , Biomarkers, Tumor/antagonists & inhibitors , Biomarkers, Tumor/genetics , Comparative Genomic Hybridization , Computational Biology/methods , Gene Expression Profiling , Genomics/methods , Head and Neck Neoplasms/diagnosis , Head and Neck Neoplasms/genetics , Humans , Male , Middle Aged , Molecular Targeted Therapy/methods , Mutation , Positron-Emission Tomography , Precision Medicine/methods , RNA, Small Interfering/genetics , Tomography, X-Ray Computed , Transcriptome , Exome Sequencing
12.
BMC Bioinformatics ; 19(1): 28, 2018 01 31.
Article in English | MEDLINE | ID: mdl-29385983

ABSTRACT

BACKGROUND: The clinical sequencing of cancer genomes to personalize therapy is becoming routine across the world. However, concerns over patient re-identification from these data lead to questions about how tightly access should be controlled. It is not thought to be possible to re-identify patients from somatic variant data. However, somatic variant detection pipelines can mistakenly identify germline variants as somatic ones, a process called "germline leakage". The rate of germline leakage across different somatic variant detection pipelines is not well understood, and it is uncertain whether or not somatic variant calls should be considered re-identifiable. To fill this gap, we quantified germline leakage across 259 sets of whole-genome somatic single nucleotide variant (SNV) predictions made by 21 teams as part of the ICGC-TCGA DREAM Somatic Mutation Calling Challenge. RESULTS: The median somatic SNV prediction set contained 4325 somatic SNVs and leaked one germline polymorphism. The level of germline leakage was inversely correlated with somatic SNV prediction accuracy and positively correlated with the amount of infiltrating normal cells. The specific germline variants leaked differed by tumour and algorithm. To aid in quantitation and correction of leakage, we created a tool, called GermlineFilter, for use in public-facing somatic SNV databases. CONCLUSIONS: The potential for patient re-identification from leaked germline variants in somatic SNV predictions has led to divergent open data access policies, based on different assessments of the risks. Indeed, a single, well-publicized re-identification event could reshape public perceptions of the values of genomic data sharing. We find that modern somatic SNV prediction pipelines have low germline-leakage rates, which can be further reduced, especially for cloud-sharing, using pre-filtering software.


Subjects
Genome, Human , Germ Cells/metabolism , Polymorphism, Single Nucleotide , Algorithms , Humans , Internet , Neoplasms/genetics , Neoplasms/pathology , User-Computer Interface , Whole Genome Sequencing
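Germline leakage, as defined in the abstract above, can be quantified by intersecting a somatic prediction set with the patient's known germline variants. The following is a minimal sketch under that assumption; it is not the GermlineFilter tool, and the function name and the tuple encoding `(chromosome, position, ref, alt)` are hypothetical.

```python
def germline_leakage(somatic_calls, germline_variants):
    """Return the leaked variants and the fraction of somatic calls they represent.

    Both inputs are iterables of (chromosome, position, ref, alt) tuples.
    """
    somatic = set(somatic_calls)
    leaked = somatic & set(germline_variants)
    rate = len(leaked) / len(somatic) if somatic else 0.0
    return leaked, rate


def remove_leaked(somatic_calls, germline_variants):
    """Pre-filter a somatic call set before sharing it publicly."""
    germline = set(germline_variants)
    return [call for call in somatic_calls if call not in germline]


somatic = [("chr1", 100, "A", "T"), ("chr2", 250, "G", "C"), ("chr7", 42, "C", "T")]
germline = [("chr2", 250, "G", "C"), ("chr9", 5, "T", "A")]
leaked, rate = germline_leakage(somatic, germline)
shareable = remove_leaked(somatic, germline)
```

Matching on exact coordinates and alleles keeps the comparison simple; a production filter would also need to handle variant normalization, which this sketch deliberately leaves out.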
13.
Cell Syst ; 5(5): 485-497.e3, 2017 11 22.
Article in English | MEDLINE | ID: mdl-28988802

ABSTRACT

We report the results of a DREAM challenge designed to predict relative genetic essentialities based on a novel dataset testing 98,000 shRNAs against 149 molecularly characterized cancer cell lines. We analyzed the results of over 3,000 submissions over a period of 4 months. We found that algorithms combining essentiality data across multiple genes demonstrated increased accuracy; gene expression was the most informative molecular data type; the identity of the gene being predicted was far more important than the modeling strategy; well-predicted genes and selected molecular features showed enrichment in functional categories; and frequently selected expression features correlated with survival in primary tumors. This study establishes benchmarks for gene essentiality prediction, presents a community resource for future comparison with this benchmark, and provides insights into factors influencing the ability to predict gene essentiality from functional genetic screens. This study also demonstrates the value of releasing pre-publication data publicly to engage the community in an open research collaboration.


Subjects
Gene Expression/genetics , Genes, Essential/genetics , Algorithms , Cell Line, Tumor , Genomics/methods , Humans , RNA, Small Interfering/genetics
14.
Cell Rep ; 19(1): 203-217, 2017 04 04.
Article in English | MEDLINE | ID: mdl-28380359

ABSTRACT

Here, we describe a multiplexed immunohistochemical platform with computational image processing workflows, including image cytometry, enabling simultaneous evaluation of 12 biomarkers in one formalin-fixed paraffin-embedded tissue section. To validate this platform, we used tissue microarrays containing 38 archival head and neck squamous cell carcinomas and revealed differential immune profiles based on lymphoid and myeloid cell densities, correlating with human papilloma virus status and prognosis. Based on these results, we investigated 24 pancreatic ductal adenocarcinomas from patients who received neoadjuvant GVAX vaccination and revealed that response to therapy correlated with degree of mono-myelocytic cell density and percentages of CD8+ T cells expressing T cell exhaustion markers. These data highlight the utility of in situ immune monitoring for patient stratification and provide digital image processing pipelines to the community for examining immune complexity in precious tissue sections, where phenotype and tissue architecture are preserved to improve biomarker discovery and assessment.


Subjects
Biomarkers, Tumor/analysis , Carcinoma, Squamous Cell/immunology , Head and Neck Neoplasms/immunology , Image Cytometry/methods , Image Processing, Computer-Assisted , Monitoring, Immunologic/methods , Aged , Aged, 80 and over , Biomarkers, Tumor/metabolism , Cohort Studies , Female , Humans , Immunohistochemistry , Male , Middle Aged , Prognosis , Statistics, Nonparametric , Tissue Array Analysis
16.
Bioinformatics ; 33(9): 1362-1369, 2017 05 01.
Article in English | MEDLINE | ID: mdl-28082455

ABSTRACT

Motivation: In recent years, vast advances in biomedical technologies and comprehensive sequencing have revealed the genomic landscape of common forms of human cancer in unprecedented detail. The broad heterogeneity of the disease calls for rapid development of personalized therapies. Translating the readily available genomic data into useful knowledge that can be applied in the clinic remains a challenge. Computational methods are needed to aid these efforts by robustly analyzing genome-scale data from distinct experimental platforms for prioritization of targets and treatments. Results: We propose a novel, biologically motivated, Bayesian multitask approach, which explicitly models gene-centric dependencies across multiple and distinct genomic platforms. We introduce a gene-wise prior and present a fully Bayesian formulation of a group factor analysis model. In supervised prediction applications, our multitask approach leverages similarities in response profiles of groups of drugs that are more likely to be related to true biological signal, which leads to more robust performance and improved generalization ability. We evaluate the performance of our method on molecularly characterized collections of cell lines profiled against two compound panels, namely the Cancer Cell Line Encyclopedia and the Cancer Therapeutics Response Portal. We demonstrate that accounting for the gene-centric dependencies enables leveraging information from multi-omic input data and improves prediction and feature selection performance. We further demonstrate the applicability of our method in an unsupervised dimensionality reduction application by inferring genes essential to tumorigenesis in the pancreatic ductal adenocarcinoma and lung adenocarcinoma patient cohorts from The Cancer Genome Atlas. Availability and Implementation: The code for this work is available at https://github.com/olganikolova/gbgfa. Contact: nikolova@ohsu.edu or margolin@ohsu.edu. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
Biomarkers, Pharmacological , Genes, Neoplasm , Genomics/methods , Models, Genetic , Neoplasms/metabolism , Precision Medicine/methods , Adenocarcinoma/drug therapy , Adenocarcinoma/genetics , Adenocarcinoma/metabolism , Antineoplastic Agents/therapeutic use , Bayes Theorem , Cell Line , Cell Transformation, Neoplastic , Humans , Lung Neoplasms/drug therapy , Lung Neoplasms/genetics , Lung Neoplasms/metabolism , Neoplasms/drug therapy , Neoplasms/genetics , Pancreatic Neoplasms/drug therapy , Pancreatic Neoplasms/genetics , Pancreatic Neoplasms/metabolism , Unsupervised Machine Learning
17.
Nat Cell Biol ; 17(12): 1523-35, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26571212

ABSTRACT

For nearly a century developmental biologists have recognized that cells from embryos can differ in their potential to differentiate into distinct cell types. Recently, it has been recognized that embryonic stem cells derived from both mice and humans exhibit two stable yet epigenetically distinct states of pluripotency: naive and primed. We now show that nicotinamide N-methyltransferase (NNMT) and the metabolic state regulate pluripotency in human embryonic stem cells (hESCs).  Specifically, in naive hESCs, NNMT and its enzymatic product 1-methylnicotinamide are highly upregulated, and NNMT is required for low S-adenosyl methionine (SAM) levels and the H3K27me3 repressive state. NNMT consumes SAM in naive cells, making it unavailable for histone methylation that represses Wnt and activates the HIF pathway in primed hESCs. These data support the hypothesis that the metabolome regulates the epigenetic landscape of the earliest steps in human development.


Subjects
Cell Differentiation , Epigenesis, Genetic/genetics , Human Embryonic Stem Cells/metabolism , Metabolome , Animals , Blotting, Western , Cells, Cultured , Embryonic Stem Cells/metabolism , Gas Chromatography-Mass Spectrometry , Gene Expression Profiling/methods , Gene Knockdown Techniques , Histones/metabolism , Humans , Lysine/metabolism , Mass Spectrometry , Metabolomics/methods , Methylation , Mice , Niacinamide/analogs & derivatives , Niacinamide/metabolism , Nicotinamide N-Methyltransferase/genetics , Nicotinamide N-Methyltransferase/metabolism , Proteomics/methods , Reverse Transcriptase Polymerase Chain Reaction , S-Adenosylmethionine/metabolism , Signal Transduction
18.
J Am Med Inform Assoc ; 22(6): 1143-7, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26174866

ABSTRACT

The world's genomics data will never be stored in a single repository - rather, it will be distributed among many sites in many countries. No one site will have enough data to explain genotype to phenotype relationships in rare diseases; therefore, sites must share data. To accomplish this, the genetics community must forge common standards and protocols to make sharing and computing data among many sites a seamless activity. Through the Global Alliance for Genomics and Health, we are pioneering the development of shared application programming interfaces (APIs) to connect the world's genome repositories. In parallel, we are developing an open source software stack (ADAM) that uses these APIs. This combination will create a cohesive genome informatics ecosystem. Using containers, we are facilitating the deployment of this software in a diverse array of environments. Through benchmarking efforts and big data driver projects, we are ensuring ADAM's performance and utility.


Subjects
Datasets as Topic , Genomics , Translational Research, Biomedical , Computational Biology , Humans , Knowledge Bases , National Institutes of Health (U.S.) , United States
19.
Nat Methods ; 12(7): 623-30, 2015 Jul.
Article in English | MEDLINE | ID: mdl-25984700

ABSTRACT

The detection of somatic mutations from cancer genome sequences is key to understanding the genetic basis of disease progression, patient survival and response to therapy. Benchmarking is needed for tool assessment and improvement but is complicated by a lack of gold standards, by extensive resource requirements and by difficulties in sharing personal genomic information. To resolve these issues, we launched the ICGC-TCGA DREAM Somatic Mutation Calling Challenge, a crowdsourced benchmark of somatic mutation detection algorithms. Here we report the BAMSurgeon tool for simulating cancer genomes and the results of 248 analyses of three in silico tumors created with it. Different algorithms exhibit characteristic error profiles, and, intriguingly, false positives show a trinucleotide profile very similar to one found in human tumors. Although the three simulated tumors differ in sequence contamination (deviation from normal cell sequence) and in subclonality, an ensemble of pipelines outperforms the best individual pipeline in all cases. BAMSurgeon is available at https://github.com/adamewing/bamsurgeon/.


Subjects
Benchmarking , Crowdsourcing , Genome , Neoplasms/genetics , Polymorphism, Single Nucleotide , Algorithms , Humans
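The abstract above reports that an ensemble of pipelines outperformed the best individual pipeline, without stating here how the ensemble was built. One simple way to combine call sets is a majority vote, sketched below as an assumption rather than the challenge's actual method; the function names and the integer variant identifiers are hypothetical.

```python
from collections import Counter


def ensemble_calls(call_sets, min_votes=None):
    """Keep variants reported by at least `min_votes` pipelines (default: strict majority)."""
    if min_votes is None:
        min_votes = len(call_sets) // 2 + 1
    # Count each variant once per pipeline, even if a pipeline lists it twice.
    votes = Counter(variant for calls in call_sets for variant in set(calls))
    return {variant for variant, n in votes.items() if n >= min_votes}


def precision_recall(predicted, truth):
    """Score a call set against the known (simulated) truth set."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


pipelines = [{1, 2, 3}, {2, 3, 4}, {3, 5}]
consensus = ensemble_calls(pipelines)
precision, recall = precision_recall(consensus, truth={2, 3, 4})
```

Raising `min_votes` trades recall for precision, which is one reason a voting ensemble is not guaranteed to beat the best single method on every tumor.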
20.
Pac Symp Biocomput ; : 32-43, 2015.
Article in English | MEDLINE | ID: mdl-25592566

ABSTRACT

Complex mechanisms involving genomic aberrations in numerous proteins and pathways are believed to be a key cause of many diseases such as cancer. With recent advances in genomics, elucidating the molecular basis of cancer at a patient level is now feasible, and has led to personalized treatment strategies whereby a patient is treated according to his or her genomic profile. However, there is growing recognition that existing treatment modalities are overly simplistic, and do not fully account for the deep genomic complexity associated with sensitivity or resistance to cancer therapies. To overcome these limitations, large-scale pharmacogenomic screens of cancer cell lines--in conjunction with modern statistical learning approaches--have been used to explore the genetic underpinnings of drug response. While these analyses have demonstrated the ability to infer genetic predictors of compound sensitivity, to date most modeling approaches have been data-driven, i.e. they do not explicitly incorporate domain-specific knowledge (priors) in the process of learning a model. While a purely data-driven approach offers an unbiased perspective of the data--and may yield unexpected or novel insights--this strategy introduces challenges for both model interpretability and accuracy. In this study, we propose a novel prior-incorporated sparse regression model in which the choice of informative predictor sets is carried out by knowledge-driven priors (gene sets) in a stepwise fashion. Under regularization in a linear regression model, our algorithm is able to incorporate prior biological knowledge across the predictive variables thereby improving the interpretability of the final model with no loss--and often an improvement--in predictive performance. 
We evaluate the performance of our algorithm compared to well-known regularization methods such as LASSO, Ridge and Elastic net regression in the Cancer Cell Line Encyclopedia (CCLE) and Genomics of Drug Sensitivity in Cancer (Sanger) pharmacogenomics datasets, demonstrating that incorporation of the biological priors selected by our model confers improved predictability and interpretability, despite far fewer predictors, over existing state-of-the-art methods.


Subjects
Linear Models , Pharmacogenetics/statistics & numerical data , Algorithms , Cell Line, Tumor , Computational Biology , Databases, Genetic , Drug Screening Assays, Antitumor/statistics & numerical data , Humans , Models, Genetic , Neoplasms/drug therapy , Neoplasms/genetics