1.
Materials (Basel) ; 16(7)2023 Mar 23.
Article in English | MEDLINE | ID: mdl-37048864

ABSTRACT

Nonlinear unloading plays an important role in predicting springback during plastic forming processes. To improve the accuracy of springback prediction, which could provide a guide for precision forming, uniaxial tensile tests and uniaxial loading-unloading-loading tensile tests on SUS304 stainless steel were carried out. The flow stress mathematical model and chord modulus mathematical model were calibrated according to the test results. A constant elastic modulus three-point bending finite element model (E0FEMB) and a constant elastic modulus roll forming finite element model (E0FEMR) were established in MSC.MARC. The chord modulus was output by the PLOTV subroutine to determine the mean modulus of different regions, and the mean modulus three-point bending finite element model (ĒcFEMB) and the mean modulus roll forming finite element model (ĒcFEMR) were defined. The constant modulus finite element model (E0FEM) simulation results and the mean modulus finite element model (ĒcFEM) simulation results were compared with the results of the three-point bending tests and roll forming tests. The difference between the simulation results and the test results was small, indicating that the mean modulus can feasibly be used to predict springback, which verified the suitability of the ĒcFEM.

2.
Genome Biol ; 22(1): 111, 2021 04 16.
Article in English | MEDLINE | ID: mdl-33863366

ABSTRACT

BACKGROUND: Oncopanel genomic testing, which identifies important somatic variants, is increasingly common in medical practice and especially in clinical trials. Currently, there is a paucity of reliable genomic reference samples having a suitably large number of pre-identified variants for properly assessing oncopanel assay analytical quality and performance. The FDA-led Sequencing and Quality Control Phase 2 (SEQC2) consortium analyzed ten diverse cancer cell lines individually and their pool, termed Sample A, to develop a reference sample with suitably large numbers of coding positions with known (variant) positives and negatives for properly evaluating oncopanel analytical performance. RESULTS: In reference Sample A, we identify more than 40,000 variants down to 1% allele frequency, with more than 25,000 variants having less than 20% allele frequency and 1653 variants in COSMIC-related genes. This is 5-100× more than existing commercially available samples. We also identify an unprecedented number of negative positions in coding regions, allowing statistical rigor in assessing limit-of-detection, sensitivity, and precision. Over 300 loci are randomly selected and independently verified via droplet digital PCR with 100% concordance. Agilent normal reference Sample B can be admixed with Sample A to create new samples with a similar number of known variants at much lower allele frequency than what exists in Sample A natively, including known variants having allele frequency of 0.02%, a range suitable for assessing liquid biopsy panels. CONCLUSION: These new reference samples and their admixtures provide superior capability for performing oncopanel quality control, analytical accuracy, and validation for small to large oncopanels and liquid biopsy assays.


Subject(s)
Alleles , Biomarkers, Tumor , Gene Frequency , Genetic Testing/methods , Genetic Variation , Genomics/methods , Neoplasms/genetics , Cell Line, Tumor , DNA Copy Number Variations , Genetic Heterogeneity , Genetic Testing/standards , Genomics/standards , Humans , Neoplasms/diagnosis , Workflow
3.
Sci Data ; 3: 160025, 2016 Jun 07.
Article in English | MEDLINE | ID: mdl-27271295

ABSTRACT

The Genome in a Bottle Consortium, hosted by the National Institute of Standards and Technology (NIST) is creating reference materials and data for human genome sequencing, as well as methods for genome comparison and benchmarking. Here, we describe a large, diverse set of sequencing data for seven human genomes; five are current or candidate NIST Reference Materials. The pilot genome, NA12878, has been released as NIST RM 8398. We also describe data from two Personal Genome Project trios, one of Ashkenazim Jewish ancestry and one of Chinese ancestry. The data come from 12 technologies: BioNano Genomics, Complete Genomics paired-end and LFR, Ion Proton exome, Oxford Nanopore, Pacific Biosciences, SOLiD, 10X Genomics GemCode WGS, and Illumina exome and WGS paired-end, mate-pair, and synthetic long reads. Cell lines, DNA, and data from these individuals are publicly available. Therefore, we expect these data to be useful for revealing novel information about the human genome and improving sequencing technologies, SNP, indel, and structural variant calling, and de novo assembly.


Subject(s)
Benchmarking , Genome, Human , Exome , Genomics , Humans , INDEL Mutation
4.
Nature ; 475(7356): 348-52, 2011 Jul 20.
Article in English | MEDLINE | ID: mdl-21776081

ABSTRACT

The seminal importance of DNA sequencing to the life sciences, biotechnology and medicine has driven the search for more scalable and lower-cost solutions. Here we describe a DNA sequencing technology in which scalable, low-cost semiconductor manufacturing techniques are used to make an integrated circuit able to directly perform non-optical DNA sequencing of genomes. Sequence data are obtained by directly sensing the ions produced by template-directed DNA polymerase synthesis using all-natural nucleotides on this massively parallel semiconductor-sensing device or ion chip. The ion chip contains ion-sensitive, field-effect transistor-based sensors in perfect register with 1.2 million wells, which provide confinement and allow parallel, simultaneous detection of independent sequencing reactions. Use of the most widely used technology for constructing integrated circuits, the complementary metal-oxide semiconductor (CMOS) process, allows for low-cost, large-scale production and scaling of the device to higher densities and larger array sizes. We show the performance of the system by sequencing three bacterial genomes, its robustness and scalability by producing ion chips with up to 10 times as many sensors and sequencing a human genome.


Subject(s)
Genome, Bacterial/genetics , Genome, Human/genetics , Genomics/instrumentation , Genomics/methods , Semiconductors , Sequence Analysis, DNA/instrumentation , Sequence Analysis, DNA/methods , Escherichia coli/genetics , Humans , Light , Male , Rhodopseudomonas/genetics , Vibrio/genetics
5.
PLoS One ; 6(7): e22250, 2011.
Article in English | MEDLINE | ID: mdl-21799804

ABSTRACT

Comprehensive identification of the acquired mutations that cause common cancers will require genomic analyses of large sets of tumor samples. Typically, the tissue material available from tumor specimens is limited, which creates a demand for accurate template amplification. We therefore evaluated whether phi29-mediated whole genome amplification introduces false positive structural mutations by massive mate-pair sequencing of a normal human genome before and after such amplification. Multiple displacement amplification led to a decrease in clone coverage and an increase by two orders of magnitude in the prevalence of inversions, but did not increase the prevalence of translocations. While multiple strand displacement amplification may find uses in translocation analyses, it is likely that alternative amplification strategies need to be developed to meet the demands of cancer genomics.


Subject(s)
Artifacts , Genome, Human/genetics , Mutation/genetics , Nucleic Acid Amplification Techniques/methods , Sequence Analysis, DNA , False Positive Reactions , Female , Gene Rearrangement/genetics , Humans
6.
Nature ; 470(7332): 59-65, 2011 Feb 03.
Article in English | MEDLINE | ID: mdl-21293372

ABSTRACT

Genomic structural variants (SVs) are abundant in humans, differing from other forms of variation in extent, origin and functional impact. Despite progress in SV characterization, the nucleotide resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (that is, copy number variants) based on whole genome DNA sequencing data from 185 human genomes, integrating evidence from complementary SV discovery approaches with extensive experimental validations. Our map encompassed 22,025 deletions and 6,000 additional SVs, including insertions and tandem duplications. Most SVs (53%) were mapped to nucleotide resolution, which facilitated analysing their origin and functional impact. We examined numerous whole and partial gene deletions with a genotyping approach and observed a depletion of gene disruptions amongst high frequency deletions. Furthermore, we observed differences in the size spectra of SVs originating from distinct formation mechanisms, and constructed a map of SV hotspots formed by common mechanisms. Our analytical framework and SV map serve as a resource for sequencing-based association studies.


Subject(s)
DNA Copy Number Variations/genetics , Genetics, Population , Genome, Human/genetics , Genomics , Gene Duplication/genetics , Genetic Predisposition to Disease/genetics , Genotype , Humans , Mutagenesis, Insertional/genetics , Reproducibility of Results , Sequence Analysis, DNA , Sequence Deletion/genetics
7.
Bioinformatics ; 27(8): 1152-4, 2011 Apr 15.
Article in English | MEDLINE | ID: mdl-21349863

ABSTRACT

UNLABELLED: We have implemented aggregation and correlation toolbox (ACT), an efficient, multifaceted toolbox for analyzing continuous signal and discrete region tracks from high-throughput genomic experiments, such as RNA-seq or ChIP-chip signal profiles from the ENCODE and modENCODE projects, or lists of single nucleotide polymorphisms from the 1000 genomes project. It is able to generate aggregate profiles of a given track around a set of specified anchor points, such as transcription start sites. It is also able to correlate related tracks and analyze them for saturation--i.e. how much of a certain feature is covered with each new succeeding experiment. The ACT site contains downloadable code in a variety of formats, interactive web servers (for use on small quantities of data), example datasets, documentation and a gallery of outputs. Here, we explain the components of the toolbox in more detail and apply them in various contexts. AVAILABILITY: ACT is available at http://act.gersteinlab.org CONTACT: pi@gersteinlab.org.


Subject(s)
Genomics/methods , Software , Polymorphism, Single Nucleotide , Transcription Initiation Site
8.
Genome Res ; 20(7): 972-80, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20488932

ABSTRACT

Abnormalities of genomic methylation patterns are lethal or cause disease, but the cues that normally designate CpG dinucleotides for methylation are poorly understood. We have developed a new method of methylation profiling that has single-CpG resolution and can address the methylation status of repeated sequences. We have used this method to determine the methylation status of >275 million CpG sites in human and mouse DNA from breast and brain tissues. Methylation density at most sequences was found to increase linearly with CpG density and to fall sharply at very high CpG densities, but transposons remained densely methylated even at higher CpG densities. The presence of histone H2A.Z and histone H3 di- or trimethylated at lysine 4 correlated strongly with unmethylated DNA and occurred primarily at promoter regions. We conclude that methylation is the default state of most CpG dinucleotides in the mammalian genome and that a combination of local dinucleotide frequencies, the interaction of repeated sequences, and the presence or absence of histone variants or modifications shields a population of CpG sites (most of which are in and around promoters) from DNA methyltransferases that lack intrinsic sequence specificity.


Subject(s)
Base Sequence/physiology , Chromatin/chemistry , Chromatin/physiology , DNA Methylation , Animals , Brain/metabolism , Breast/metabolism , Chromatin/genetics , Chromosome Mapping , CpG Islands/genetics , Female , Genome , Histones/metabolism , Humans , Mice , Sequence Analysis, DNA , Validation Studies as Topic
9.
PLoS One ; 5(2): e9320, 2010 Feb 22.
Article in English | MEDLINE | ID: mdl-20179767

ABSTRACT

Methylation, the addition of methyl groups to cytosine (C), plays an important role in the regulation of gene expression in both normal and dysfunctional cells. During bisulfite conversion and subsequent PCR amplification, unmethylated Cs are converted into thymine (T), while methylated Cs will not be converted. Sequencing of this bisulfite-treated DNA permits the detection of methylation at specific sites. Through the introduction of next-generation sequencing technologies (NGS) simultaneous analysis of methylation motifs in multiple regions provides the opportunity for hypothesis-free study of the entire methylome. Here we present a whole methylome sequencing study that compares two different bisulfite conversion methods (in solution versus in gel), utilizing the high throughput of the SOLiD System. Advantages and disadvantages of the two different bisulfite conversion methods for constructing sequencing libraries are discussed. Furthermore, the application of the SOLiD bisulfite sequencing to larger and more complex genomes is shown with preliminary in silico created bisulfite converted reads.


Subject(s)
DNA Methylation , Genome, Human/genetics , Sequence Analysis, DNA/methods , Base Sequence , Binding Sites/genetics , DNA/chemistry , DNA/genetics , Electrophoresis, Polyacrylamide Gel/methods , Genomic Library , Humans , Molecular Sequence Data , Polymerase Chain Reaction , Sequence Homology, Nucleic Acid , Sulfites/chemistry
10.
Genome Res ; 19(9): 1527-41, 2009 Sep.
Article in English | MEDLINE | ID: mdl-19546169

ABSTRACT

We describe the genome sequencing of an anonymous individual of African origin using a novel ligation-based sequencing assay that enables a unique form of error correction that improves the raw accuracy of the aligned reads to >99.9%, allowing us to accurately call SNPs with as few as two reads per allele. We collected several billion mate-paired reads yielding approximately 18x haploid coverage of aligned sequence and close to 300x clone coverage. Over 98% of the reference genome is covered with at least one uniquely placed read, and 99.65% is spanned by at least one uniquely placed mate-paired clone. We identify over 3.8 million SNPs, 19% of which are novel. Mate-paired data are used to physically resolve haplotype phases of nearly two-thirds of the genotypes obtained and produce phased segments of up to 215 kb. We detect 226,529 intra-read indels, 5590 indels between mate-paired reads, 91 inversions, and four gene fusions. We use a novel approach for detecting indels between mate-paired reads that are smaller than the standard deviation of the insert size of the library and discover deletions in common with those detected with our intra-read approach. Dozens of mutations previously described in OMIM and hundreds of nonsynonymous single-nucleotide and structural variants in genes previously implicated in disease are identified in this individual. There is more genetic variation in the human genome still to be uncovered, and we provide guidance for future surveys in populations and cancer biopsies.


Subject(s)
Base Pairing , Computational Biology/methods , Genetic Variation , Genome, Human , Ligases , Sequence Analysis, DNA/methods , Africa , Base Sequence , Genomics , Genotype , Heterozygote , Homozygote , Humans , Polymorphism, Single Nucleotide , Reference Standards
11.
Physiol Genomics ; 37(3): 199-210, 2009 May 13.
Article in English | MEDLINE | ID: mdl-19258493

ABSTRACT

Caffeine is the most widely consumed psychoactive substance and has complex pharmacological actions in the brain. In this study, we employed a novel drug target validation strategy to uncover the multiple molecular targets of caffeine using combined A(2A) receptor (A(2A)R) knockouts (KO) and microarray profiling. Caffeine (10 mg/kg) elicited a distinct profile of striatal gene expression in WT mice compared with that by A(2A)R gene deletion or by administering caffeine into A(2A)R KO mice. Thus, A(2A)Rs are required but not sufficient to elicit the striatal gene expression by caffeine (10 mg/kg). Caffeine (50 mg/kg) induced complex expression patterns with three distinct sets of striatal genes: 1) one subset overlapped with those elicited by genetic deletion of A(2A)Rs; 2) the second subset elicited by caffeine in WT as well as A(2A)R KO mice; and 3) the third subset elicited by caffeine only in A(2A)R KO mice. Furthermore, striatal gene sets elicited by the phosphodiesterase (PDE) inhibitor rolipram and the GABA(A) receptor antagonist bicuculline overlapped with the distinct subsets of striatal genes elicited by caffeine (50 mg/kg) administered to A(2A)R KO mice. Finally, Gene Set Enrichment Analysis reveals that adipocyte differentiation/insulin signaling is highly enriched in the striatal gene sets elicited by both low and high doses of caffeine. The identification of these distinct striatal gene populations and their corresponding multiple molecular targets, including A(2A)R, non-A(2A)R (possibly A(1)Rs and pathways associated with PDE and GABA(A)R) and their interactions, and the cellular pathways affected by low and high doses of caffeine, provides molecular insights into the acute pharmacological effects of caffeine in the brain.


Subject(s)
Caffeine/pharmacology , Gene Expression Profiling , Oligonucleotide Array Sequence Analysis/methods , Receptor, Adenosine A2A/physiology , Animals , Bicuculline/pharmacology , Central Nervous System Stimulants/pharmacology , Cluster Analysis , Dose-Response Relationship, Drug , Female , GABA Antagonists/pharmacology , Gene Expression Regulation/drug effects , Male , Mice , Mice, Knockout , Neostriatum/drug effects , Neostriatum/metabolism , Phosphodiesterase Inhibitors/pharmacology , Receptor, Adenosine A2A/genetics , Reverse Transcriptase Polymerase Chain Reaction , Rolipram/pharmacology
12.
PLoS Genet ; 4(7): e1000138, 2008 Jul 25.
Article in English | MEDLINE | ID: mdl-18654629

ABSTRACT

Chromatin structure plays an important role in modulating the accessibility of genomic DNA to regulatory proteins in eukaryotic cells. We performed an integrative analysis on dozens of recent datasets generated by deep-sequencing and high-density tiling arrays, and we discovered an array of well-positioned nucleosomes flanking sites occupied by the insulator binding protein CTCF across the human genome. These nucleosomes are highly enriched for the histone variant H2A.Z and 11 histone modifications. The distances between the center positions of the neighboring nucleosomes are largely invariant, and we estimate them to be 185 bp on average. Surprisingly, subsets of nucleosomes that are enriched in different histone modifications vary greatly in the lengths of DNA protected from micrococcal nuclease cleavage (106-164 bp). The nucleosomes enriched in those histone modifications previously implicated to be correlated with active transcription tend to contain less protected DNA, indicating that these modifications are correlated with greater DNA accessibility. Another striking result obtained from our analysis is that nucleosomes flanking CTCF sites are much better positioned than those downstream of transcription start sites, the only genomic feature previously known to position nucleosomes genome-wide. This nucleosome-positioning phenomenon is not observed for other transcriptional factors for which we had genome-wide binding data. We suggest that binding of CTCF provides an anchor point for positioning nucleosomes, and chromatin remodeling is an important component of CTCF function.


Subject(s)
DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Genome, Human , Nucleosomes/genetics , Nucleosomes/metabolism , Repressor Proteins/genetics , Repressor Proteins/metabolism , Binding Sites , CCCTC-Binding Factor , Chromatin Assembly and Disassembly/physiology , Histones/genetics , Histones/metabolism , Humans , Micrococcal Nuclease/pharmacology , Transcription Factors/metabolism
13.
Genome Res ; 18(3): 393-403, 2008 Mar.
Article in English | MEDLINE | ID: mdl-18258921

ABSTRACT

The most widely used method for detecting genome-wide protein-DNA interactions is chromatin immunoprecipitation on tiling microarrays, commonly known as ChIP-chip. Here, we conducted the first objective analysis of tiling array platforms, amplification procedures, and signal detection algorithms in a simulated ChIP-chip experiment. Mixtures of human genomic DNA and "spike-ins" comprised of nearly 100 human sequences at various concentrations were hybridized to four tiling array platforms by eight independent groups. Blind to the number of spike-ins, their locations, and the range of concentrations, each group made predictions of the spike-in locations. We found that microarray platform choice is not the primary determinant of overall performance. In fact, variation in performance between labs, protocols, and algorithms within the same array platform was greater than the variation in performance between array platforms. However, each array platform had unique performance characteristics that varied with tiling resolution and the number of replicates, which have implications for cost versus detection power. Long oligonucleotide arrays were slightly more sensitive at detecting very low enrichment. On all platforms, simple sequence repeats and genome redundancy tended to result in false positives. LM-PCR and WGA, the most popular sample amplification techniques, reproduced relative enrichment levels with high fidelity. Performance among signal detection algorithms was heavily dependent on array platform. The spike-in DNA samples and the data presented here provide a stable benchmark against which future ChIP platforms, protocol improvements, and analysis methods can be evaluated.


Subject(s)
Chromatin Immunoprecipitation/methods , Oligonucleotide Array Sequence Analysis/methods , Algorithms , Chromosome Aberrations , DNA/chemistry , Genome, Human , Humans , Oligonucleotide Probes , Polymerase Chain Reaction , ROC Curve , Reproducibility of Results , Tandem Repeat Sequences
14.
PLoS Genet ; 3(8): e136, 2007 Aug.
Article in English | MEDLINE | ID: mdl-17708682

ABSTRACT

The identification of regulatory elements from different cell types is necessary for understanding the mechanisms controlling cell type-specific and housekeeping gene expression. Mapping DNaseI hypersensitive (HS) sites is an accurate method for identifying the location of functional regulatory elements. We used a high throughput method called DNase-chip to identify 3,904 DNaseI HS sites from six cell types across 1% of the human genome. A significant number (22%) of DNaseI HS sites from each cell type are ubiquitously present among all cell types studied. Surprisingly, nearly all of these ubiquitous DNaseI HS sites correspond to either promoters or insulator elements: 86% of them are located near annotated transcription start sites and 10% are bound by CTCF, a protein with known enhancer-blocking insulator activity. We also identified a large number of DNaseI HS sites that are cell type specific (only present in one cell type); these regions are enriched for enhancer elements and correlate with cell type-specific gene expression as well as cell type-specific histone modifications. Finally, we found that approximately 8% of the genome overlaps a DNaseI HS site in at least one of the six cell lines studied, indicating that a significant percentage of the genome is potentially functional.


Subject(s)
Chromatin/chemistry , Genome, Human , Organ Specificity/genetics , Regulatory Elements, Transcriptional , Base Sequence , Binding Sites , CCCTC-Binding Factor , Cell Lineage/genetics , Cells, Cultured , Chromosome Mapping , Cluster Analysis , CpG Islands/genetics , DNA-Binding Proteins/metabolism , Deoxyribonuclease I/metabolism , HeLa Cells , Humans , Insulator Elements/genetics , K562 Cells , Microarray Analysis , Molecular Sequence Data , Repressor Proteins/metabolism , Research Design , Sequence Analysis, DNA/methods
15.
Genome Res ; 17(8): 1170-7, 2007 Aug.
Article in English | MEDLINE | ID: mdl-17620451

ABSTRACT

Although histones can form nucleosomes on virtually any genomic sequence, DNA sequences show considerable variability in their binding affinity. We have used DNA sequences of Saccharomyces cerevisiae whose nucleosome binding affinities have been experimentally determined (Yuan et al. 2005) to train a support vector machine to identify the nucleosome formation potential of any given sequence of DNA. The DNA sequences whose nucleosome formation potential is most accurately predicted are those that contain strong nucleosome forming or inhibiting signals and are found within nucleosome length stretches of genomic DNA with continuous nucleosome formation or inhibition signals. We have accurately predicted the experimentally determined nucleosome positions across a well-characterized promoter region of S. cerevisiae and identified strong periodicity within 199 center-aligned mononucleosomes studied recently (Segal et al. 2006) despite there being no periodicity information used to train the support vector machine. Our analysis suggests that only a subset of nucleosomes are likely to be positioned by intrinsic sequence signals. This observation is consistent with the available experimental data and is inconsistent with the proposal of a nucleosome positioning code. Finally, we show that intrinsic nucleosome positioning signals are both more inhibitory and more variable in promoter regions than in open reading frames in S. cerevisiae.


Subject(s)
DNA, Fungal/chemistry , Genome, Fungal , Nucleosomes/genetics , DNA, Fungal/metabolism , Markov Chains , Nucleosomes/metabolism , Promoter Regions, Genetic , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism
16.
Genome Res ; 17(6): 787-97, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17567997

ABSTRACT

The comprehensive inventory of functional elements in 44 human genomic regions carried out by the ENCODE Project Consortium enables for the first time a global analysis of the genomic distribution of transcriptional regulatory elements. In this study we developed an intuitive and yet powerful approach to analyze the distribution of regulatory elements found in many different ChIP-chip experiments on a 10-100-kb scale. First, we focus on the overall chromosomal distribution of regulatory elements in the ENCODE regions and show that it is highly nonuniform. We demonstrate, in fact, that regulatory elements are associated with the location of known genes. Further examination on a local, single-gene scale shows an enrichment of regulatory elements near both transcription start and end sites. Our results indicate that overall these elements are clustered into regulatory rich "islands" and poor "deserts." Next, we examine how consistent the nonuniform distribution is between different transcription factors. We perform on all the factors a multivariate analysis in the framework of a biplot, which enhances biological signals in the experiments. This groups transcription factors into sequence-specific and sequence-nonspecific clusters. Moreover, with experimental variation carefully controlled, detailed correlations show that the distribution of sites was generally reproducible for a specific factor between different laboratories and microarray platforms. Data sets associated with histone modifications have particularly strong correlations. Finally, we show how the correlations between factors change when only regulatory elements far from the transcription start sites are considered.


Subject(s)
Chromosomes, Human/genetics , Genome, Human , Regulatory Elements, Transcriptional , Sequence Analysis, DNA , Transcription, Genetic , Databases, Genetic , Humans , Oligonucleotide Array Sequence Analysis
17.
Genome Res ; 17(6): 798-806, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17567998

ABSTRACT

A set of 723 high-quality human core promoter sequences were compiled and analyzed for overrepresented motifs. Besides the two well-characterized core promoter motifs (TATA and Inr), several known motifs (YY1, Sp1, NRF-1, NRF-2, CAAT, and CREB) and one potentially new motif (motif8) were found. Interestingly, YY1 and motif8 mostly reside immediately downstream from the TSS. In particular, the YY1 motif occurs primarily in genes with 5'-UTRs shorter than 40 base pairs (bp) and its locations coincide with the translation start site. We verified that the YY1 motif is bound by YY1 in vitro. We then performed detailed analysis on YY1 chromatin immunoprecipitation data with a whole-genome human promoter microarray (ChIP-chip) and revealed that the thus identified promoters in HeLa cells were highly enriched with the YY1 motif. Moreover, the motif overlapped with the translation start sites on the plus strand of a group of genes, many with short 5'-UTRs, and with the transcription start sites on the minus strand of another distinct group of genes; together, the two groups of genes accounted for the majority of the YY1-bound promoters in the ChIP-chip data. Furthermore, the first group of genes was highly enriched in the functional categories of ribosomal proteins and nuclear-encoded mitochondrial proteins. We suggest that the YY1 motif plays a dual role in both transcription and translation initiation of these genes. We also discuss the evolutionary advantages of housing a transcriptional element inside the transcript in terms of the migration of these genes in the human genome.


Subject(s)
5' Untranslated Regions/genetics , Evolution, Molecular , Genome, Human , Response Elements , Transcription, Genetic , Gene Expression Profiling , Humans , Mitochondrial Proteins/genetics , Oligonucleotide Array Sequence Analysis , Ribosomal Proteins/genetics , TATA Box
18.
Genome Res ; 17(6): 818-27, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17568000

ABSTRACT

Bidirectional promoters have received considerable attention because of their ability to regulate two downstream genes (divergent genes). They are also highly abundant, directing the transcription of approximately 11% of genes in the human genome. We categorized the presence of DNA sequence motifs, binding of transcription factors, and modified histones as overrepresented, shared, or underrepresented in bidirectional promoters with respect to unidirectional promoters. We found that a small set of motifs, including GABPA, MYC, E2F1, E2F4, NRF-1, CCAAT, YY1, and ACTACAnnTCC are overrepresented in bidirectional promoters, while the majority (73%) of known vertebrate motifs are underrepresented. We performed chromatin-immunoprecipitation (ChIP), followed by quantitative PCR for GABPA, on 118 regions in the human genome and showed that it binds to bidirectional promoters more frequently than unidirectional promoters, and its position-specific scoring matrix is highly predictive of binding. Signatures of active transcription, such as occupancy of RNA polymerase II and the modified histones H3K4me2, H3K4me3, and H3ac, are overrepresented in regions around bidirectional promoters, suggesting that a higher fraction of divergent genes are transcribed in a given cell than the fraction of other genes. Accordingly, analysis of whole-genome microarray data indicates that 68% of divergent genes are transcribed compared with 44% of all human genes. By combining the analysis of publicly available ENCODE data and a detailed study of GABPA, we survey bidirectional promoters with breadth and depth, leading to biological insights concerning their motif composition and bidirectional regulatory mode.


Subject(s)
Genome, Human , Histones/genetics , Protein Processing, Post-Translational/physiology , Response Elements , Transcription Factors/genetics , Chromatin Immunoprecipitation , Databases, Genetic , Humans , Protein Binding/genetics , RNA Polymerase II/genetics
19.
Nat Genet ; 39(3): 311-8, 2007 Mar.
Article in English | MEDLINE | ID: mdl-17277777

ABSTRACT

Eukaryotic gene transcription is accompanied by acetylation and methylation of nucleosomes near promoters, but the locations and roles of histone modifications elsewhere in the genome remain unclear. We determined the chromatin modification states in high resolution along 30 Mb of the human genome and found that active promoters are marked by trimethylation of Lys4 of histone H3 (H3K4), whereas enhancers are marked by monomethylation, but not trimethylation, of H3K4. We developed computational algorithms using these distinct chromatin signatures to identify new regulatory elements, predicting over 200 promoters and 400 enhancers within the 30-Mb region. This approach accurately predicted the location and function of independently identified regulatory elements with high sensitivity and specificity and uncovered a novel functional enhancer for the carnitine transporter SLC22A5 (OCTN2). Our results give insight into the connections between chromatin modifications and transcriptional regulatory activity and provide a new tool for the functional annotation of the human genome.


Subject(s)
Algorithms , Chromatin/metabolism , Enhancer Elements, Genetic , Genome, Human , Promoter Regions, Genetic , Genomics , Histones/metabolism , Humans , Models, Genetic , Organic Cation Transport Proteins/genetics , Organic Cation Transport Proteins/metabolism , Solute Carrier Family 22 Member 5
20.
Proc Natl Acad Sci U S A ; 103(47): 17834-9, 2006 Nov 21.
Article in English | MEDLINE | ID: mdl-17093053

ABSTRACT

The protooncogene MYC encodes the c-Myc transcription factor that regulates cell growth, cell proliferation, cell cycle, and apoptosis. Although deregulation of MYC contributes to tumorigenesis, it is still unclear what direct Myc-induced transcriptomes promote cell transformation. Here we provide a snapshot of genome-wide, unbiased characterization of direct Myc binding targets in a model of human B lymphoid tumor using ChIP coupled with pair-end ditag sequencing analysis (ChIP-PET). Myc potentially occupies > 4,000 genomic loci with the majority near proximal promoter regions associated frequently with CpG islands. Using gene expression profiles with ChIP-PET, we identified 668 direct Myc-regulated gene targets, including 48 transcription factors, indicating that Myc is a central transcriptional hub in growth and proliferation control. This first global genomic view of Myc binding sites yields insights of transcriptional circuitries and cis regulatory modules involving Myc and provides a substantial framework for our understanding of mechanisms of Myc-induced tumorigenesis.


Subject(s)
B-Lymphocytes/physiology , Chromosome Mapping , Gene Expression Regulation , Proto-Oncogene Proteins c-myc/metabolism , Binding Sites , Chromatin Immunoprecipitation/methods , CpG Islands , Genome, Human , Humans , MicroRNAs/metabolism , Promoter Regions, Genetic , Sequence Analysis, DNA/methods , Transcription Factors/genetics , Transcription Factors/metabolism