Search | VHL Regional Portal (test)

1.

nhanesA: achieving transparency and reproducibility in NHANES research.

Ale, Laha; Gentleman, Robert; Sonmez, Teresa Filshtein; Sarkar, Deepayan; Endres, Christopher.

Database (Oxford) ; 2024, 2024 Apr 15.

Article in English | MEDLINE | ID: mdl-38625809

ABSTRACT

The National Health and Nutrition Examination Survey provides comprehensive data on demographics, sociology, health and nutrition. Conducted in 2-year cycles since 1999, most of its data are publicly accessible, making it pivotal for research areas like studying social determinants of health or tracking trends in health metrics such as obesity or diabetes. Assembling the data and analyzing it presents a number of technical and analytic challenges. This paper introduces the nhanesA R package, which is designed to assist researchers in data retrieval and analysis and to enable the sharing and extension of prior research efforts. We believe that fostering community-driven activity in data reproducibility and sharing of analytic methods will greatly benefit the scientific community and propel scientific advancements. Database URL: https://github.com/cjendres1/nhanes.

Subjects

Information Storage and Retrieval , Nutrition Surveys , Reproducibility of Results , Databases, Factual

2.

BioPlexR and BioPlexPy: integrated data products for the analysis of human protein interactions.

Geistlinger, Ludwig; Vargas, Roger; Lee, Tyrone; Pan, Joshua; Huttlin, Edward L; Gentleman, Robert.

Bioinformatics ; 39(3), 2023 03 01.

Article in English | MEDLINE | ID: mdl-36794911

ABSTRACT

SUMMARY: The BioPlex project has created two proteome scale, cell-line-specific protein-protein interaction (PPI) networks: the first in 293T cells, including 120k interactions among 15k proteins; and the second in HCT116 cells, including 70k interactions between 10k proteins. Here, we describe programmatic access to the BioPlex PPI networks and integration with related resources from within R and Python. Besides PPI networks for 293T and HCT116 cells, this includes access to CORUM protein complex data, PFAM protein domain data, PDB protein structures, and transcriptome and proteome data for the two cell lines. The implemented functionality serves as a basis for integrative downstream analysis of BioPlex PPI data with domain-specific R and Python packages, including efficient execution of maximum scoring sub-network analysis, protein domain-domain association analysis, mapping of PPIs onto 3D protein structures and analysis of BioPlex PPIs at the interface of transcriptomic and proteomic data. AVAILABILITY AND IMPLEMENTATION: The BioPlex R package is available from Bioconductor (bioconductor.org/packages/BioPlex), and the BioPlex Python package is available from PyPI (pypi.org/project/bioplexpy). Applications and downstream analyses are available from GitHub (github.com/ccb-hms/BioPlexAnalysis).

Subjects

Proteome , Software , Humans , Proteomics , Protein Interaction Maps , Transcriptome

3.

Disease risk scores for skin cancers.

Fontanillas, Pierre; Alipanahi, Babak; Furlotte, Nicholas A; Johnson, Michaela; Wilson, Catherine H; Pitts, Steven J; Gentleman, Robert; Auton, Adam.

Nat Commun ; 12(1): 160, 2021 01 08.

Article in English | MEDLINE | ID: mdl-33420020

ABSTRACT

We trained and validated risk prediction models for the three major types of skin cancer, basal cell carcinoma (BCC), squamous cell carcinoma (SCC), and melanoma, on a cross-sectional and longitudinal dataset of 210,000 consented research participants who responded to an online survey covering personal and family history of skin cancer, skin susceptibility, and UV exposure. We developed a primary disease risk score (DRS) that combined all 32 identified genetic and non-genetic risk factors. Top percentile DRS was associated with an up to 13-fold increase (odds ratio per standard deviation increase >2.5) in the risk of developing skin cancer relative to the middle DRS percentile. To derive lifetime risk trajectories for the three skin cancers, we developed a second, age-independent disease score, called DRSA. Using incident cases, we demonstrated that DRSA could be used in early detection programs for identifying high-risk asymptomatic individuals and predicting when they are likely to develop skin cancer. High DRSA scores were associated not only with earlier disease diagnosis (by up to 14 years), but also with more severe and recurrent forms of skin cancer.

Subjects

Carcinoma, Basal Cell/epidemiology , Carcinoma, Squamous Cell/epidemiology , Melanoma/epidemiology , Models, Statistical , Neoplasm Recurrence, Local/epidemiology , Skin Neoplasms/epidemiology , Adult , Aged , Aged, 80 and over , Carcinoma, Basal Cell/etiology , Carcinoma, Basal Cell/pathology , Carcinoma, Squamous Cell/etiology , Cross-Sectional Studies , Datasets as Topic , Direct-To-Consumer Screening and Testing/statistics & numerical data , Female , Follow-Up Studies , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Incidence , Longitudinal Studies , Male , Medical History Taking , Melanoma/etiology , Melanoma/pathology , Middle Aged , Neoplasm Recurrence, Local/etiology , Neoplasm Recurrence, Local/pathology , Odds Ratio , Prospective Studies , Risk Assessment/methods , Risk Factors , Skin/pathology , Skin/radiation effects , Skin Neoplasms/etiology , Skin Neoplasms/pathology , Surveys and Questionnaires/statistics & numerical data , Ultraviolet Rays/adverse effects , White People/genetics

4.

Demographic, spatial and temporal dietary intake patterns among 526 774 23andMe research participants.

Shelton, Janie F; Cameron, Briana; Aslibekyan, Stella; Gentleman, Robert.

Public Health Nutr ; 24(10): 2952-2963, 2021 07.

Article in English | MEDLINE | ID: mdl-32597744

ABSTRACT

OBJECTIVE: To characterise dietary habits, their temporal and spatial patterns and associations with BMI in the 23andMe study population. DESIGN: We present a large-scale cross-sectional analysis of self-reported dietary intake data derived from the web-based National Health and Nutrition Examination Survey 2009-2010 dietary screener. Survey-weighted estimates for each food item were characterised by age, sex, race/ethnicity, education and BMI. Temporal patterns were plotted over a 2-year time period, and average consumption for select food items was mapped by state. Finally, dietary intake variables were tested for association with BMI. SETTING: US-based adults 20-85 years of age participating in the 23andMe research programme. PARTICIPANTS: Participants were 23andMe customers who consented to participate in research (n 526 774) and completed web-based surveys on demographic and dietary habits. RESULTS: Survey-weighted estimates show very few participants met federal recommendations for fruit: 2·6 %, vegetables: 5·9 % and dairy intake: 2·8 %. Between 2017 and 2019, fruit, vegetables and milk intake frequency declined, while total dairy remained stable and added sugars increased. Seasonal patterns in reporting were most pronounced for ice cream, chocolate, fruits and vegetables. Dietary habits varied across the USA, with higher intake of sugar and energy dense foods characterising areas with higher average BMI. In multivariate-adjusted models, BMI was directly associated with the intake of processed meat, red meat, dairy and inversely associated with consumption of fruit, vegetables and whole grains. CONCLUSIONS: 23andMe research participants have created an opportunity for rapid, large-scale, real-time nutritional data collection, informing demographic, seasonal and spatial patterns with broad geographical coverage across the USA.

Subjects

Diet , Vegetables , Adult , Cross-Sectional Studies , Demography , Eating , Energy Intake , Feeding Behavior , Fruit , Humans , Nutrition Surveys

5.

Addressing the accuracy of direct-to-consumer genetic testing.

Wu, Shirley; Pollard, Jeffrey; Chowdry, Arnab; Scheller, Richard; Gentleman, Robert.

Genet Med ; 21(3): 758-759, 2019 03.

Article in English | MEDLINE | ID: mdl-29955106

Subjects

Direct-To-Consumer Screening and Testing , Genetic Testing , Humans , Patient Care

6.

VariantTools: an extensible framework for developing and testing variant callers.

Lawrence, Michael; Gentleman, Robert.

Bioinformatics ; 33(20): 3311-3313, 2017 Oct 15.

Article in English | MEDLINE | ID: mdl-29028267

ABSTRACT

MOTIVATION: Variant calling is the complex task of separating real polymorphisms from errors. The appropriate strategy will depend on characteristics of the sample, the sequencing methodology and on the questions of interest. RESULTS: We present VariantTools, an extensible framework for developing and testing variant callers. There are facilities for reproducibly tallying, filtering, flagging and annotating variants. The tools are extensible, modular and flexible, so that they are tunable to particular use cases, and they interoperate with existing analysis software so that they can be embedded in established work flows. AVAILABILITY AND IMPLEMENTATION: VariantTools is available from http://www.bioconductor.org/. CONTACT: michafla@gene.com. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Genotyping Techniques/methods , Polymorphism, Genetic , Sequence Analysis, DNA/methods , Software , Genomics/methods

7.

Creating a data resource: what will it take to build a medical information commons?

Deverka, Patricia A; Majumder, Mary A; Villanueva, Angela G; Anderson, Margaret; Bakker, Annette C; Bardill, Jessica; Boerwinkle, Eric; Bubela, Tania; Evans, Barbara J; Garrison, Nanibaa' A; Gibbs, Richard A; Gentleman, Robert; Glazer, David; Goldstein, Melissa M; Greely, Hank; Harris, Crane; Knoppers, Bartha M; Koenig, Barbara A; Kohane, Isaac S; La Rosa, Salvatore; Mattison, John; O'Donnell, Christopher J; Rai, Arti K; Rehm, Heidi L; Rodriguez, Laura L; Shelton, Robert; Simoncelli, Tania; Terry, Sharon F; Watson, Michael S; Wilbanks, John; Cook-Deegan, Robert; McGuire, Amy L.

Genome Med ; 9(1): 84, 2017 09 22.

Article in English | MEDLINE | ID: mdl-28938910

ABSTRACT

National and international public-private partnerships, consortia, and government initiatives are underway to collect and share genomic, personal, and healthcare data on a massive scale. Ideally, these efforts will contribute to the creation of a medical information commons (MIC), a comprehensive data resource that is widely available for both research and clinical uses. Stakeholder participation is essential in clarifying goals, deepening understanding of areas of complexity, and addressing long-standing policy concerns such as privacy and security and data ownership. This article describes eight core principles proposed by a diverse group of expert stakeholders to guide the formation of a successful, sustainable MIC. These principles promote formation of an ethically sound, inclusive, participant-centric MIC and provide a framework for advancing the policy response to data-sharing opportunities and challenges.

Subjects

Information Dissemination , Medical Informatics , Humans , Information Services , Medical Informatics/ethics

8.

Recurrent Loss of NFE2L2 Exon 2 Is a Mechanism for Nrf2 Pathway Activation in Human Cancers.

Goldstein, Leonard D; Lee, James; Gnad, Florian; Klijn, Christiaan; Schaub, Annalisa; Reeder, Jens; Daemen, Anneleen; Bakalarski, Corey E; Holcomb, Thomas; Shames, David S; Hartmaier, Ryan J; Chmielecki, Juliann; Seshagiri, Somasekar; Gentleman, Robert; Stokoe, David.

Cell Rep ; 16(10): 2605-2617, 2016 09 06.

Article in English | MEDLINE | ID: mdl-27568559

ABSTRACT

The Nrf2 pathway is frequently activated in human cancers through mutations in Nrf2 or its negative regulator KEAP1. Using a cell-line-derived gene signature for Nrf2 pathway activation, we found that some tumors show high Nrf2 activity in the absence of known mutations in the pathway. An analysis of splice variants in oncogenes revealed that such tumors express abnormal transcript variants from the NFE2L2 gene (encoding Nrf2) that lack exon 2, or exons 2 and 3, and encode Nrf2 protein isoforms missing the KEAP1 interaction domain. The Nrf2 alterations result in the loss of interaction with KEAP1, Nrf2 stabilization, induction of a Nrf2 transcriptional response, and Nrf2 pathway dependence. In all analyzed cases, transcript variants were the result of heterozygous genomic microdeletions. Thus, we identify an alternative mechanism for Nrf2 pathway activation in human tumors and elucidate its functional consequences.

Subjects

Exons/genetics , Mutation/genetics , NF-E2-Related Factor 2/genetics , Neoplasms/genetics , Signal Transduction , Cell Line, Tumor , Cell Survival/genetics , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Genome, Human , Humans , Kelch-Like ECH-Associated Protein 1/genetics , Protein Binding , RNA, Messenger/genetics , RNA, Messenger/metabolism , Reproducibility of Results , Sequence Deletion/genetics

9.

Prediction and Quantification of Splice Events from RNA-Seq Data.

Goldstein, Leonard D; Cao, Yi; Pau, Gregoire; Lawrence, Michael; Wu, Thomas D; Seshagiri, Somasekar; Gentleman, Robert.

PLoS One ; 11(5): e0156132, 2016.

Article in English | MEDLINE | ID: mdl-27218464

ABSTRACT

Analysis of splice variants from short read RNA-seq data remains a challenging problem. Here we present a novel method for the genome-guided prediction and quantification of splice events from RNA-seq data, which enables the analysis of unannotated and complex splice events. Splice junctions and exons are predicted from reads mapped to a reference genome and are assembled into a genome-wide splice graph. Splice events are identified recursively from the graph and are quantified locally based on reads extending across the start or end of each splice variant. We assess prediction accuracy based on simulated and real RNA-seq data, and illustrate how different read aligners (GSNAP, HISAT2, STAR, TopHat2) affect prediction results. We validate our approach for quantification based on simulated data, and compare local estimates of relative splice variant usage with those from other methods (MISO, Cufflinks) based on simulated and real RNA-seq data. In a proof-of-concept study of splice variants in 16 normal human tissues (Illumina Body Map 2.0) we identify 249 internal exons that belong to known genes but are not related to annotated exons. Using independent RNA samples from 14 matched normal human tissues, we validate 9/9 of these exons by RT-PCR and 216/249 by paired-end RNA-seq (2 x 250 bp). These results indicate that de novo prediction of splice variants remains beneficial even in well-studied systems. An implementation of our method is freely available as an R/Bioconductor package [Formula: see text].

Subjects

Computational Biology/methods , RNA Splicing , RNA/genetics , Sequence Analysis, RNA/methods , Algorithms , Alternative Splicing , Exons , Humans , Software

10.

Complex regulation of ADAR-mediated RNA-editing across tissues.

Huntley, Melanie A; Lou, Melanie; Goldstein, Leonard D; Lawrence, Michael; Dijkgraaf, Gerrit J P; Kaminker, Joshua S; Gentleman, Robert.

BMC Genomics ; 17: 61, 2016 Jan 15.

Article in English | MEDLINE | ID: mdl-26768488

ABSTRACT

BACKGROUND: RNA-editing is a tightly regulated and essential cellular process for a properly functioning brain. Dysfunction of A-to-I RNA editing can have catastrophic effects, particularly in the central nervous system. Thus, understanding how the process of RNA-editing is regulated has important implications for human health. However, at present, very little is known about the regulation of editing across tissues and individuals. RESULTS: Here we present an analysis of RNA-editing patterns from 9 different tissues harvested from a single mouse. For comparison, we also analyzed data for 5 of these tissues harvested from 15 additional animals. We find that tissue specificity of editing largely reflects differential expression of substrate transcripts across tissues. We identified a surprising enrichment of editing in intronic regions of brain transcripts that could account for previously reported higher levels of editing in brain. There exists a small but remarkable amount of editing which is tissue-specific, despite comparable expression levels of the edit site across multiple tissues. Expression levels of editing enzymes and their isoforms can explain some, but not all, of this variation. CONCLUSIONS: Together, these data suggest a complex regulation of the RNA-editing process beyond transcript expression levels.

Subjects

Adenosine Deaminase/genetics , Organ Specificity/genetics , RNA Editing/genetics , RNA-Binding Proteins/genetics , Adenosine Deaminase/biosynthesis , Animals , Brain/growth & development , Brain/metabolism , Gene Expression Regulation , Humans , Introns/genetics , Mice , Protein Isoforms/genetics , RNA-Binding Proteins/biosynthesis , Transcription, Genetic

11.

Orchestrating high-throughput genomic analysis with Bioconductor.

Huber, Wolfgang; Carey, Vincent J; Gentleman, Robert; Anders, Simon; Carlson, Marc; Carvalho, Benilton S; Bravo, Hector Corrada; Davis, Sean; Gatto, Laurent; Girke, Thomas; Gottardo, Raphael; Hahne, Florian; Hansen, Kasper D; Irizarry, Rafael A; Lawrence, Michael; Love, Michael I; MacDonald, James; Obenchain, Valerie; Oles, Andrzej K; Pagès, Hervé; Reyes, Alejandro; Shannon, Paul; Smyth, Gordon K; Tenenbaum, Dan; Waldron, Levi; Morgan, Martin.

Nat Methods ; 12(2): 115-21, 2015 Feb.

Article in English | MEDLINE | ID: mdl-25633503

ABSTRACT

Bioconductor is an open-source, open-development software project for the analysis and comprehension of high-throughput data in genomics and molecular biology. The project aims to enable interdisciplinary research, collaboration and rapid development of scientific software. Based on the statistical programming language R, Bioconductor comprises 934 interoperable packages contributed by a large, diverse community of scientists. Packages cover a range of bioinformatic and statistical applications. They undergo formal initial review and continuous automated testing. We present an overview for prospective users and contributors.

Subjects

Computational Biology , Gene Expression Profiling , Genomics/methods , High-Throughput Screening Assays/methods , Software , Programming Languages , User-Computer Interface

12.

A comprehensive transcriptional portrait of human cancer cell lines.

Klijn, Christiaan; Durinck, Steffen; Stawiski, Eric W; Haverty, Peter M; Jiang, Zhaoshi; Liu, Hanbin; Degenhardt, Jeremiah; Mayba, Oleg; Gnad, Florian; Liu, Jinfeng; Pau, Gregoire; Reeder, Jens; Cao, Yi; Mukhyala, Kiran; Selvaraj, Suresh K; Yu, Mamie; Zynda, Gregory J; Brauer, Matthew J; Wu, Thomas D; Gentleman, Robert C; Manning, Gerard; Yauch, Robert L; Bourgon, Richard; Stokoe, David; Modrusan, Zora; Neve, Richard M; de Sauvage, Frederic J; Settleman, Jeffrey; Seshagiri, Somasekar; Zhang, Zemin.

Nat Biotechnol ; 33(3): 306-12, 2015 Mar.

Article in English | MEDLINE | ID: mdl-25485619

ABSTRACT

Tumor-derived cell lines have served as vital models to advance our understanding of oncogene function and therapeutic responses. Although substantial effort has been made to define the genomic constitution of cancer cell line panels, the transcriptome remains understudied. Here we describe RNA sequencing and single-nucleotide polymorphism (SNP) array analysis of 675 human cancer cell lines. We report comprehensive analyses of transcriptome features including gene expression, mutations, gene fusions and expression of non-human sequences. Of the 2,200 gene fusions catalogued, 1,435 consist of genes not previously found in fusions, providing many leads for further investigation. We combine multiple genome and transcriptome features in a pathway-based approach to enhance prediction of response to targeted therapeutics. Our results provide a valuable resource for studies that use cancer cell lines.

Subjects

Neoplasms/genetics , Transcription, Genetic , Base Sequence , Cell Line, Tumor , Cluster Analysis , Gene Expression Regulation, Neoplastic , Gene Regulatory Networks , Humans , Mutation/genetics , Oncogene Fusion/genetics , Organ Specificity/genetics , Polymorphism, Single Nucleotide/genetics

13.

Discriminative motif analysis of high-throughput dataset.

Yao, Zizhen; Macquarrie, Kyle L; Fong, Abraham P; Tapscott, Stephen J; Ruzzo, Walter L; Gentleman, Robert C.

Bioinformatics ; 30(6): 775-83, 2014 Mar 15.

Article in English | MEDLINE | ID: mdl-24162561

ABSTRACT

MOTIVATION: High-throughput ChIP-seq studies typically identify thousands of peaks for a single transcription factor (TF). It is common for traditional motif discovery tools to predict motifs that are statistically significant against a naïve background distribution but are of questionable biological relevance. RESULTS: We describe a simple yet effective algorithm for discovering differential motifs between two sequence datasets that is effective in eliminating systematic biases and scalable to large datasets. Tested on 207 ENCODE ChIP-seq datasets, our method identifies correct motifs in 78% of the datasets with known motifs, demonstrating improvement in both accuracy and efficiency compared with DREME, another state-of-the-art discriminative motif discovery tool. More interestingly, on the remaining more challenging datasets, we identify common technical or biological factors that compromise the motif search results and use advanced features of our tool to control for these factors. We also present case studies demonstrating the ability of our method to detect single base pair differences in DNA specificity of two similar TFs. Lastly, we demonstrate discovery of key TF motifs involved in tissue specification by examination of high-throughput DNase accessibility data. AVAILABILITY: The motifRG package is publicly available via the Bioconductor repository. CONTACT: yzizhen@fhcrc.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Chromatin Immunoprecipitation/methods , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Algorithms , Base Sequence , DNA/genetics , Humans , Transcription Factors/genetics

14.

gCMAP: user-friendly connectivity mapping with R.

Sandmann, Thomas; Kummerfeld, Sarah K; Gentleman, Robert; Bourgon, Richard.

Bioinformatics ; 30(1): 127-8, 2014 Jan 01.

Article in English | MEDLINE | ID: mdl-24132929

ABSTRACT

Connections between disease phenotypes and drug effects can be made by identifying commonalities in the associated patterns of differential gene expression. Searchable databases that record the impacts of chemical or genetic perturbations on the transcriptome (here referred to as 'connectivity maps') permit discovery of such commonalities. We describe two R packages, gCMAP and gCMAPWeb, which provide a complete framework to construct and query connectivity maps assembled from user-defined collections of differential gene expression data. Microarray or RNAseq data are processed in a standardized way, and results can be interrogated using various well-established gene set enrichment methods. The packages also feature an easy-to-deploy web application that facilitates reproducible research through automatic generation of graphical and tabular reports. AVAILABILITY AND IMPLEMENTATION: The gCMAP and gCMAPWeb R packages are freely available for UNIX, Windows and Mac OS X operating systems at Bioconductor (http://www.bioconductor.org).

Subjects

Oligonucleotide Array Sequence Analysis/methods , User-Computer Interface , Animals , Cell Line , Gene Expression Profiling/methods , Humans , Internet

15.

Integrative analysis of two cell lines derived from a non-small-lung cancer patient--a panomics approach.

Mayba, Oleg; Gnad, Florian; Peyton, Michael; Zhang, Fan; Walter, Kimberly; Du, Pan; Huntley, Melanie A; Jiang, Zhaoshi; Liu, Jinfeng; Haverty, Peter M; Gentleman, Robert C; Li, Ruiqiang; Minna, John D; Li, Yingrui; Shames, David S; Zhang, Zemin.

Pac Symp Biocomput ; : 75-86, 2014.

Article in English | MEDLINE | ID: mdl-24297535

ABSTRACT

Cancer cells derived from different stages of tumor progression may exhibit distinct biological properties, as exemplified by the paired lung cancer cell lines H1993 and H2073. While H1993 was derived from a chemo-naive metastasized tumor, H2073 originated from the chemo-resistant primary tumor from the same patient and exhibits a strikingly different drug response profile. To understand the underlying genetic and epigenetic bases for their biological properties, we investigated these cells using a wide range of large-scale methods including whole genome sequencing, RNA sequencing, SNP array, DNA methylation array, and de novo genome assembly. We conducted an integrative analysis of both cell lines to distinguish between potential driver and passenger alterations. Although many genes are mutated in these cell lines, the combination of DNA- and RNA-based variant information strongly implicates a small number of genes including TP53 and STK11 as likely drivers. Likewise, we found a diverse set of genes differentially expressed between these cell lines, but only a fraction can be attributed to changes in DNA copy number or methylation. This set included the ABC transporter ABCC4, implicated in drug resistance, and the metastasis-associated MET oncogene. While the rich data content allowed us to reduce the space of hypotheses that could explain most of the observed biological properties, we also caution that there is a lack of statistical power and inherent limitations in such single patient case studies.

Subjects

Carcinoma, Non-Small-Cell Lung/genetics , Lung Neoplasms/genetics , Carcinoma, Non-Small-Cell Lung/drug therapy , Carcinoma, Non-Small-Cell Lung/metabolism , Cell Line, Tumor , Computational Biology , DNA Methylation , Drug Resistance, Neoplasm/genetics , Epigenesis, Genetic , Gene Dosage , Gene Expression Profiling/statistics & numerical data , Genomics/statistics & numerical data , Humans , Lung Neoplasms/drug therapy , Lung Neoplasms/metabolism , Models, Genetic , Mutation

16.

The anatomy of successful computational biology software.

Altschul, Stephen; Demchak, Barry; Durbin, Richard; Gentleman, Robert; Krzywinski, Martin; Li, Heng; Nekrutenko, Anton; Robinson, James; Rasband, Wayne; Taylor, James; Trapnell, Cole.

Nat Biotechnol ; 31(10): 894-7, 2013 Oct.

Article in English | MEDLINE | ID: mdl-24104757

Subjects

Computational Biology , Software

17.

Software for computing and annotating genomic ranges.

Lawrence, Michael; Huber, Wolfgang; Pagès, Hervé; Aboyoun, Patrick; Carlson, Marc; Gentleman, Robert; Morgan, Martin T; Carey, Vincent J.

PLoS Comput Biol ; 9(8): e1003118, 2013.

Article in English | MEDLINE | ID: mdl-23950696

ABSTRACT

We describe Bioconductor infrastructure for representing and computing on annotated genomic ranges and integrating genomic data with the statistical computing features of R and its extensions. At the core of the infrastructure are three packages: IRanges, GenomicRanges, and GenomicFeatures. These packages provide scalable data structures for representing annotated ranges on the genome, with special support for transcript structures, read alignments and coverage vectors. Computational facilities include efficient algorithms for overlap and nearest neighbor detection, coverage calculation and other range operations. This infrastructure directly supports more than 80 other Bioconductor packages, including those for sequence analysis, differential expression analysis and visualization.

Subjects

Databases, Genetic , Genomics/methods , Software , Algorithms , Animals , Genomics/standards , Humans , Mice , Sequence Alignment , Sequence Analysis, DNA

18.

Comparison of endogenous and overexpressed MyoD shows enhanced binding of physiologically bound sites.

Yao, Zizhen; Fong, Abraham P; Cao, Yi; Ruzzo, Walter L; Gentleman, Robert C; Tapscott, Stephen J.

Skelet Muscle ; 3(1): 8, 2013 Apr 08.

Article in English | MEDLINE | ID: mdl-23566431

ABSTRACT

BACKGROUND: Transcription factor overexpression is common in biological experiments and transcription factor amplification is associated with many cancers, yet few studies have directly compared the DNA-binding profiles of endogenous versus overexpressed transcription factors. METHODS: We analyzed MyoD ChIP-seq data from C2C12 mouse myotubes, primary mouse myotubes, and mouse fibroblasts differentiated into muscle cells by overexpression of MyoD and compared the genome-wide binding profiles and binding site characteristics of endogenous and overexpressed MyoD. RESULTS: Overexpressed MyoD bound to the same sites occupied by endogenous MyoD and possessed the same E-box sequence preference and co-factor site enrichments, and did not bind to new sites with distinct characteristics. CONCLUSIONS: Our data demonstrate a robust fidelity of transcription factor binding sites over a range of expression levels and that increased amounts of transcription factor increase the binding at physiologically bound sites.

19.

Genome and transcriptome sequencing of lung cancers reveal diverse mutational and splicing events.

Liu, Jinfeng; Lee, William; Jiang, Zhaoshi; Chen, Zhongqiang; Jhunjhunwala, Suchit; Haverty, Peter M; Gnad, Florian; Guan, Yinghui; Gilbert, Houston N; Stinson, Jeremy; Klijn, Christiaan; Guillory, Joseph; Bhatt, Deepali; Vartanian, Steffan; Walter, Kimberly; Chan, Jocelyn; Holcomb, Thomas; Dijkgraaf, Peter; Johnson, Stephanie; Koeman, Julie; Minna, John D; Gazdar, Adi F; Stern, Howard M; Hoeflich, Klaus P; Wu, Thomas D; Settleman, Jeff; de Sauvage, Frederic J; Gentleman, Robert C; Neve, Richard M; Stokoe, David; Modrusan, Zora; Seshagiri, Somasekar; Shames, David S; Zhang, Zemin.

Genome Res ; 22(12): 2315-27, 2012 Dec.

Article in English | MEDLINE | ID: mdl-23033341

ABSTRACT

Lung cancer is a highly heterogeneous disease in terms of both underlying genetic lesions and response to therapeutic treatments. We performed deep whole-genome sequencing and transcriptome sequencing on 19 lung cancer cell lines and three lung tumor/normal pairs. Overall, our data show that cell line models exhibit similar mutation spectra to human tumor samples. Smoker and never-smoker cancer samples exhibit distinguishable patterns of mutations. A number of epigenetic regulators, including KDM6A, ASH1L, SMARCA4, and ATAD2, are frequently altered by mutations or copy number changes. A systematic survey of splice-site mutations identified 106 splice site mutations associated with cancer specific aberrant splicing, including mutations in several known cancer-related genes. RAC1b, an isoform of the RAC1 GTPase that includes one additional exon, was found to be preferentially up-regulated in lung cancer. We further show that its expression is significantly associated with sensitivity to a MAP2K (MEK) inhibitor PD-0325901. Taken together, these data present a comprehensive genomic landscape of a large number of lung cancer samples and further demonstrate that cancer-specific alternative splicing is a widespread phenomenon that has potential utility as therapeutic biomarkers. The detailed characterizations of the lung cancer cell lines also provide genomic context to the vast amount of experimental data gathered for these lines over the decades, and represent highly valuable resources for cancer biology.

Subjects

Alternative Splicing; Gene Expression Regulation, Neoplastic; Genome, Human/genetics; Lung Neoplasms/genetics; Mutation; Transcriptome; ATPases Associated with Diverse Cellular Activities; Adenosine Triphosphatases/genetics; Adenosine Triphosphatases/metabolism; Cell Line, Tumor; DNA Copy Number Variations; DNA Helicases/genetics; DNA Helicases/metabolism; DNA-Binding Proteins/genetics; DNA-Binding Proteins/metabolism; Epigenomics; Exons; Genetic Markers; Heterozygote; Histone Demethylases/genetics; Histone Demethylases/metabolism; Histone-Lysine N-Methyltransferase; Humans; Karyotyping/methods; Lung Neoplasms/pathology; Nuclear Proteins/genetics; Nuclear Proteins/metabolism; Polymorphism, Single Nucleotide; Reproducibility of Results; Sequence Analysis, RNA; Transcription Factors/genetics; Transcription Factors/metabolism; Up-Regulation; rac1 GTP-Binding Protein/genetics; rac1 GTP-Binding Protein/metabolism

20.

Comprehensive genomic analysis identifies SOX2 as a frequently amplified gene in small-cell lung cancer.

Rudin, Charles M; Durinck, Steffen; Stawiski, Eric W; Poirier, John T; Modrusan, Zora; Shames, David S; Bergbower, Emily A; Guan, Yinghui; Shin, James; Guillory, Joseph; Rivers, Celina Sanchez; Foo, Catherine K; Bhatt, Deepali; Stinson, Jeremy; Gnad, Florian; Haverty, Peter M; Gentleman, Robert; Chaudhuri, Subhra; Janakiraman, Vasantharajan; Jaiswal, Bijay S; Parikh, Chaitali; Yuan, Wenlin; Zhang, Zemin; Koeppen, Hartmut; Wu, Thomas D; Stern, Howard M; Yauch, Robert L; Huffman, Kenneth E; Paskulin, Diego D; Illei, Peter B; Varella-Garcia, Marileila; Gazdar, Adi F; de Sauvage, Frederic J; Bourgon, Richard; Minna, John D; Brock, Malcolm V; Seshagiri, Somasekar.

Nat Genet ; 44(10): 1111-6, 2012 Oct.

Article in English | MEDLINE | ID: mdl-22941189

ABSTRACT

Small-cell lung cancer (SCLC) is an exceptionally aggressive disease with poor prognosis. Here, we obtained exome, transcriptome and copy-number alteration data from approximately 53 samples consisting of 36 primary human SCLC and normal tissue pairs and 17 matched SCLC and lymphoblastoid cell lines. We also obtained data for 4 primary tumors and 23 SCLC cell lines. We identified 22 significantly mutated genes in SCLC, including genes encoding kinases, G protein-coupled receptors and chromatin-modifying proteins. We found that several members of the SOX family of genes were mutated in SCLC. We also found SOX2 amplification in ~27% of the samples. Suppression of SOX2 using shRNAs blocked proliferation of SOX2-amplified SCLC lines. RNA sequencing identified multiple fusion transcripts and a recurrent RLF-MYCL1 fusion. Silencing of MYCL1 in SCLC cell lines that had the RLF-MYCL1 fusion decreased cell proliferation. These data provide an in-depth view of the spectrum of genomic alterations in SCLC and identify several potential targets for therapeutic intervention.

Subjects

Gene Amplification; Lung Neoplasms/genetics; SOXB1 Transcription Factors/genetics; Small Cell Lung Carcinoma/genetics; Base Sequence; Cell Line, Tumor; DNA Copy Number Variations; DNA Mutational Analysis; Exome; Gene Expression; Genome-Wide Association Study; High-Throughput Nucleotide Sequencing; Humans; Lung Neoplasms/metabolism; Molecular Sequence Data; Mutation; Oncogene Proteins, Fusion/genetics; Protein Kinases/genetics; SOXB1 Transcription Factors/metabolism; Small Cell Lung Carcinoma/metabolism
