1.
Life (Basel) ; 9(3)2019 Aug 22.
Article in English | MEDLINE | ID: mdl-31443422

ABSTRACT

The early metabolism arising in a Thioester World gave rise to amino acids and their simple peptides. The catalytic activity of these early simple peptides became instrumental in the transition from the Thioester World to a Phosphate World. This transition involved the appearance of sugar phosphates, nucleotides, and polynucleotides. The coupling of the amino acids and peptides to nucleotides and polynucleotides is the origin of the genetic code. Many of the key steps in this transition are seen in the catalytic cores of the nucleotidyltransferases, the class II tRNA synthetases (aaRSs), and the CCA-adding enzyme. These catalytic cores are dominated by simple beta-hairpin structures formed in the Thioester World. The code evolved from a proto-tRNA, a tetramer XCCA, interacting with a proto-aminoacyl-tRNA synthetase (aaRS) activating Glycine and Proline. The initial expanded code is found in the acceptor arm of the tRNA, the operational code. It is the coevolution of the tRNA with the aaRSs that is at the heart of the origin and evolution of the genetic code. There is also a close relationship between the accretion models of the evolving tRNA and that of the ribosome.

2.
Cell ; 168(6): 1126-1134.e9, 2017 03 09.
Article in English | MEDLINE | ID: mdl-28262353

ABSTRACT

Phosphate is essential for all living systems, serving as a building block of genetic and metabolic machinery. However, it is unclear how phosphate could have assumed these central roles on primordial Earth, given its poor geochemical accessibility. We used systems biology approaches to explore the alternative hypothesis that a protometabolism could have emerged prior to the incorporation of phosphate. Surprisingly, we identified a cryptic phosphate-independent core metabolism producible from simple prebiotic compounds. This network is predicted to support the biosynthesis of a broad category of key biomolecules. Its enrichment for enzymes utilizing iron-sulfur clusters, and the fact that thermodynamic bottlenecks are more readily overcome by thioester rather than phosphate couplings, suggest that this network may constitute a "metabolic fossil" of an early phosphate-free nonenzymatic biochemistry. Our results corroborate and expand previous proposals that a putative thioester-based metabolism could have predated the incorporation of phosphate and an RNA-based genetic system.


Subject(s)
Computer Simulation , Metabolic Networks and Pathways , Phosphates/metabolism , Adenine Nucleotides/chemistry , Algorithms , Coenzyme A , Coenzymes , Origin of Life , Phosphates/chemistry , Thermodynamics
3.
FEBS Lett ; 589(23): 3499-507, 2015 Nov 30.
Article in English | MEDLINE | ID: mdl-26472323

ABSTRACT

Class II Aminoacyl-tRNA synthetases are a set of very ancient multi domain proteins. The evolution of the catalytic domain of Class II synthetases can be reconstructed from three peptidyl-hairpins. Further evolution from this primordial catalytic core leads to a split of the Class II synthetases into two divisions potentially associated with the operational code. The earliest form of this code likely coded predominantly Glycine (Gly), Proline (Pro), Alanine (Ala) and "Lysine"/Aspartic acid (Lys/Asp). There is a paradox in these synthetases beginning with a hairpin structure before the Genetic Code existed. A resolution is found in the suggestion that the primordial Aminoacyl synthetases formed in a transition from a Thioester world to a Phosphate ester world.


Subject(s)
Amino Acyl-tRNA Synthetases/metabolism , Evolution, Molecular , Amino Acyl-tRNA Synthetases/chemistry , Catalytic Domain , Models, Molecular
4.
Life (Basel) ; 4(2): 227-49, 2014 May 20.
Article in English | MEDLINE | ID: mdl-25370196

ABSTRACT

The evolution of the genetic code is mapped out starting with the aminoacyl-tRNA synthetases and their interaction with the operational code in the tRNA acceptor arm. Combining this operational code with a metric based on the biosynthesis of amino acids from the citric acid cycle, we come to the conclusion that the earliest genetic code was a Guanine Cytosine (GC) code. This has implications for the likely earliest positively charged amino acids. The progression from this pure GC code to the extant one is traced out in the evolution of the Large Ribosomal Subunit (LSU) and its proteins, in particular those associated with the Peptidyl Transfer Center (PTC) and the nascent peptide exit tunnel. This progression has implications for the earliest encoded peptides and their evolutionary progression into full complex proteins.

5.
J Hum Genet ; 59(5): 288-91, 2014 May.
Article in English | MEDLINE | ID: mdl-24599118

ABSTRACT

Recent reviews discussed the critical roles of apoptosis in human spermatogenesis and infertility. These reviews highlight the FasL-induced caspase cascade in apoptosis lending importance to our discovery of the pseudogene status of the Lfg5 gene in modern humans, Neanderthal and the Denisovan. This gene is a member of the ancient and highly conserved apoptosis Lifeguard family. This pseudogenization is the result of a premature stop codon at the 3'-end of exon 8 not found in any other ortholog. With the current exception of the domesticated bovine and buffalo, Lfg5's expression in mammals is testis-specific. A full analysis of this gene, its phylogenetic context and its recent hominin changes suggest its inactivation was likely under selection in human evolution.


Subject(s)
Evolution, Molecular , Neanderthals/genetics , Organ Specificity/genetics , Testis/metabolism , Animals , Exons , Genomics , Humans , Introns , Male , Multigene Family , Mutation , Phylogeny
6.
PLoS Negl Trop Dis ; 4(8): e782, 2010 Aug 03.
Article in English | MEDLINE | ID: mdl-20689771

ABSTRACT

BACKGROUND: Trichomonas vaginalis has an unusually large genome (approximately 160 Mb) encoding approximately 60,000 proteins. With the goal of beginning to understand why some Trichomonas genes are present in so many copies, we characterized here a family of approximately 123 Trichomonas genes that encode transmembrane adenylyl cyclases (TMACs). METHODOLOGY/PRINCIPAL FINDINGS: The large family of TMAC genes is the result of recent duplications of a small set of ancestral genes that appear to be unique to trichomonads. Duplicated TMAC genes are not closely associated with repetitive elements, and duplications of flanking sequences are rare. However, there is evidence for TMAC gene replacements by homologous recombination. A high percentage of TMAC genes (approximately 46%) are pseudogenes, as they contain stop codons and/or frame shifts, or the genes are truncated. Numerous stop codons present in the genome project G3 strain are not present in orthologous genes of two other Trichomonas strains (S1 and B7RC2). Each TMAC is composed of a series of N-terminal transmembrane helices and a single C-terminal cyclase domain that has adenylyl cyclase activity. Multiple TMAC genes are transcribed by Trichomonas cloned by limiting dilution. CONCLUSIONS/SIGNIFICANCE: We conclude that one reason for the unusually large genome of Trichomonas is the presence of unstable families of genes, such as those encoding TMACs, that are undergoing massive gene duplication and concomitant development of pseudogenes.


Subject(s)
Adenylyl Cyclases/genetics , Evolution, Molecular , Gene Duplication , Membrane Transport Proteins/genetics , Protozoan Proteins/genetics , Trichomonas vaginalis/enzymology , Trichomonas vaginalis/genetics , Genes, Protozoan , Protein Structure, Tertiary , Pseudogenes , Recombination, Genetic
7.
Biol Direct ; 5: 36, 2010 May 20.
Article in English | MEDLINE | ID: mdl-20487556

ABSTRACT

BACKGROUND: This paper is an attempt to trace the evolution of the ribosome through the evolution of the universal P-loop GTPases that are involved with the ribosome in translation and with the attachment of the ribosome to the membrane. The GTPases involved in translation in Bacteria/Archaea are the elongation factors EFTu/EF1, the initiation factors IF2/aeIF5b + aeIF2, and the elongation factors EFG/EF2. All of these GTPases also contain the OB fold, which is also found in the non-GTPase IF1 involved in initiation. The GTPase involved in the signal recognition particle in most Bacteria and Archaea is SRP54. RESULTS: 1) The Elongation Factors of the Archaea, based on structural considerations of the domains, have the following evolutionary path: EF1 --> aeIF2 --> EF2. The evolution of the aeIF5b was a later event; 2) the Elongation Factors of the Bacteria, based on the topological considerations of the GTPase domain, have a similar evolutionary path: EFTu --> IF2 --> EFG. These evolutionary sequences reflect the evolution of the LSU followed by the SSU to form the ribosome; 3) the OB-fold IF1 is a mimic of an ancient tRNA minihelix. CONCLUSION: The evolution of the translational GTPases of both the Archaea and Bacteria points to the evolution of the ribosome. The elongation factors, EFTu/EF1, began as a Ras-like GTPase bringing the activated minihelix tRNA to the Large Subunit (LSU). The initiation factors and elongation factor would then have evolved from the EFTu/EF1 as the small subunit was added to the evolving ribosome. The SRP has an SRP54 GTPase and a specific RNA fold in its RNA component similar to the PTC. We consider the SRP to be a remnant of an ancient form of an LSU bound to a membrane.


Subject(s)
GTP Phosphohydrolases/metabolism , Ribosomes/metabolism , Archaeal Proteins/chemistry , Archaeal Proteins/genetics , Archaeal Proteins/metabolism , Bacterial Proteins/chemistry , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Evolution, Molecular , GTP Phosphohydrolases/chemistry , GTP Phosphohydrolases/genetics , Ribosomes/genetics
8.
Appl Microbiol Biotechnol ; 86(5): 1387-97, 2010 May.
Article in English | MEDLINE | ID: mdl-20094712

ABSTRACT

Surfactants find wide commercial use as foaming agents, emulsifiers, and dispersants. Currently, surfactants are produced from petroleum, or from seed oils such as palm or coconut oil. Due to concerns with CO2 emissions and the need to protect rainforests, there is a growing necessity to manufacture these chemicals using sustainable resources. In this report, we describe the engineering of a native nonribosomal peptide synthetase pathway (i.e., surfactin synthetase) to generate a Bacillus strain that synthesizes a highly water-soluble acyl amino acid surfactant, rather than the water-insoluble lipopeptide surfactin. This novel product has a lower CMC and higher water solubility than myristoyl glutamate, a commercial surfactant. This surfactant is produced by fermentation of cellulosic carbohydrate as feedstock. This method of surfactant production provides an approach to sustainable manufacturing of new surfactants.


Subject(s)
Bacillus subtilis/metabolism , Bacterial Proteins/genetics , Lipopeptides/biosynthesis , Peptide Synthases/genetics , Peptides, Cyclic/biosynthesis , Surface-Active Agents/metabolism , Amino Acid Sequence , Bacillus subtilis/enzymology , Bacillus subtilis/genetics , Bacterial Proteins/metabolism , Cellulose/metabolism , Fermentation , Glutamic Acid/analogs & derivatives , Glutamic Acid/chemistry , Glutamic Acid/metabolism , Lipopeptides/chemistry , Lipopeptides/metabolism , Micelles , Molecular Sequence Data , Peptide Synthases/metabolism , Peptides, Cyclic/chemistry , Protein Engineering , Solubility , Surface-Active Agents/chemistry
9.
Apoptosis ; 14(11): 1255-65, 2009 Nov.
Article in English | MEDLINE | ID: mdl-19784873

ABSTRACT

The expanding wealth of genomic data from humans, model organisms, and other organisms has allowed the identification of a distinct gene family of apoptosis-related genes. Most of these genes are currently unannotated or have been subsumed under two questionably related gene families in the past. For example, the transmembrane Bax inhibitor 1 (BI1) motif family has been reported to play a role in apoptosis and to consist of at least seven mammalian protein genes: GRINA, BI1, Lfg/FAIM2, Ghitm, RESC1/Tmbim1, GAAP/Tmbim4, and Tmbm1b. However, a detailed sequence and phylogenetic analysis shows that only five of these form a clear and unique protein family. This now provides information for understanding and investigating the biological roles of these proteins across a wide range of tissues in model organisms. The evolutionary relationships among these genes provide a powerful perspective for extrapolating to human conditions.


Subject(s)
Apoptosis Regulatory Proteins/genetics , Membrane Proteins/genetics , Amino Acid Sequence , Animals , Consensus Sequence , Evolution, Molecular , Humans , Mice , Molecular Sequence Data , Sequence Alignment , Tissue Distribution
10.
Cell Motil Cytoskeleton ; 66(4): 215-9, 2009 Apr.
Article in English | MEDLINE | ID: mdl-19253335

ABSTRACT

The Cilium, the Nucleus and the Mitochondrion are three important organelles whose evolutionary histories are intimately related to the evolution and origin of the eukaryotic cell. The cilium is involved in motility and sensory transduction, and it is found only in eukaryotic cells. Here we show that eight gene duplications prior to the last common ancestor of all extant eukaryotes account for the expansion of the Heavy Chain Dynein family of motor proteins and the evolution of the complexity of the cilium. The ambiguities in the branching of the phylogenetic tree of the HC-Dyneins were resolved by creating well-defined subtrees and using them to create the full tree. Due to the intimate relationship between the nucleus, the division center, mitosis and the basal body/centriole, the evolution of the cilium can now be related to the evolution of mitosis. In addition, the analysis of the cilium rules out its endosymbiotic origin from the phagocytosis of a bacterium.


Subject(s)
Cilia/genetics , Eukaryotic Cells/physiology , Evolution, Molecular , Animals , Axoneme/genetics , Axoneme/pathology , Chlamydomonas , Cilia/metabolism , Dyneins/genetics , Dyneins/metabolism
11.
Subcell Biochem ; 48: 20-30, 2008.
Article in English | MEDLINE | ID: mdl-18925368

ABSTRACT

The WD-repeat-containing proteins form a very large family that is diverse in both its function and domain structure. Within all these proteins the WD-repeat domains are thought to have two common features: the domain folds into a beta propeller, and the domains form a platform without any catalytic activity on which multiple protein complexes assemble reversibly. The fact that these proteins play such key roles in the formation of protein-protein complexes in nearly all the major pathways and organelles unique to eukaryotic cells has two important implications: it supports both their ancient, proto-eukaryotic origins and a likely association with many genetic diseases.


Subject(s)
Carrier Proteins/chemistry , Catalysis , Models, Molecular , Protein Conformation
12.
Genetics ; 179(3): 1657-80, 2008 Jul.
Article in English | MEDLINE | ID: mdl-18622036

ABSTRACT

The availability of 12 complete genomes of various species of genus Drosophila provides a unique opportunity to analyze genome-scale chromosomal rearrangements among a group of closely related species. This article reports on the comparison of gene order between these 12 species and on the fixed rearrangement events that disrupt gene order. Three major themes are addressed: the conservation of syntenic blocks across species, the disruption of syntenic blocks (via chromosomal inversion events) and its relationship to the phylogenetic distribution of these species, and the rate of rearrangement events over evolutionary time. Comparison of syntenic blocks across this large genomic data set confirms that genetic elements are largely (95%) localized to the same Muller element across genus Drosophila species and paracentric inversions serve as the dominant mechanism for shuffling the order of genes along a chromosome. Gene-order scrambling between species is in accordance with the estimated evolutionary distances between them and we find it to approximate a linear process over time (linear to exponential with alternate divergence time estimates). We find the distribution of synteny segment sizes to be biased by a large number of small segments with comparatively fewer large segments. Our results provide estimated chromosomal evolution rates across this set of species on the basis of whole-genome synteny analysis, which are found to be higher than those previously reported. Identification of conserved syntenic blocks across these genomes suggests a large number of conserved blocks with varying levels of embryonic expression correlation in Drosophila melanogaster. On the other hand, an analysis of the disruption of syntenic blocks between species allowed the identification of fixed inversion breakpoints and estimates of breakpoint reuse and lineage-specific breakpoint event segregation.


Subject(s)
Chromosomes/genetics , Drosophila/genetics , Gene Rearrangement , Genome, Insect/genetics , Animals , Base Sequence , Chromosome Breakage , Chromosome Inversion , Conserved Sequence , Drosophila/embryology , Embryo, Nonmammalian/metabolism , Gene Expression Regulation, Developmental , Genetic Linkage , Heterochromatin/genetics , Phylogeny , Repetitive Sequences, Nucleic Acid/genetics , Synteny/genetics
13.
Genetics ; 179(3): 1601-55, 2008 Jul.
Article in English | MEDLINE | ID: mdl-18622037

ABSTRACT

The sequencing of the 12 genomes of members of the genus Drosophila was taken as an opportunity to reevaluate the genetic and physical maps for 11 of the species, in part to aid in the mapping of assembled scaffolds. Here, we present an overview of the importance of cytogenetic maps to Drosophila biology and to the concepts of chromosomal evolution. Physical and genetic markers were used to anchor the genome assembly scaffolds to the polytene chromosomal maps for each species. In addition, a computational approach was used to anchor smaller scaffolds on the basis of the analysis of syntenic blocks. We present the chromosomal map data from each of the 11 sequenced non-Drosophila melanogaster species as a series of sections. Each section reviews the history of the polytene chromosome maps for each species, presents the new polytene chromosome maps, and anchors the genomic scaffolds to the cytological maps using genetic and physical markers. The mapping data agree with Muller's idea that the majority of Drosophila genes are syntenic. Despite the conservation of genes within homologous chromosome arms across species, the karyotypes of these species have changed through the fusion of chromosomal arms followed by subsequent rearrangement events.


Subject(s)
Chromosomes/genetics , Drosophila/genetics , Genome, Insect/genetics , Physical Chromosome Mapping , Animals , Genetic Markers , Karyotyping , Sequence Alignment , Synteny
14.
Biol Direct ; 3: 16, 2008 Apr 22.
Article in English | MEDLINE | ID: mdl-18430223

ABSTRACT

BACKGROUND: The origin and early evolution of the active site of the ribosome can be elucidated through an analysis of the ribosomal proteins' taxonomic block structures and their RNA interactions. Comparison between the two subunits, exploiting the detailed three-dimensional structures of the bacterial and archaeal ribosomes, is especially informative. RESULTS: The analysis of the differences between these two sites can be summarized as follows: 1) There is no self-folding RNA segment that defines the decoding site of the small subunit; 2) there is one self-folding RNA segment encompassing the entire peptidyl transfer center of the large subunit; 3) the protein contacts with the decoding site are made by a set of universal alignable sequence blocks of the ribosomal proteins; 4) the majority of those peptides contacting the peptidyl transfer center are made by bacterial or archaeal-specific sequence blocks. CONCLUSION: These clear distinctions between the two subunit active sites support an earlier origin for the large subunit's peptidyl transferase center (PTC) with the decoding site of the small subunit being a later addition to the ribosome. The main implications are that a single self-folding RNA, in conjunction with a few short stabilizing peptides, formed the precursor of the modern ribosomal large subunit in association with a membrane.


Subject(s)
Evolution, Molecular , Ribosomal Proteins/chemistry , Ribosomes/chemistry , Animals , Archaeal Proteins/chemistry , Archaeal Proteins/genetics , Archaeal Proteins/metabolism , Bacterial Proteins/chemistry , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Base Sequence , Humans , Molecular Sequence Data , Ribosomal Proteins/genetics , Ribosomal Proteins/metabolism , Ribosomes/genetics , Ribosomes/metabolism
15.
Genome Biol ; 8(11): R236, 2007.
Article in English | MEDLINE | ID: mdl-17996033

ABSTRACT

A simple, fast, and biologically inspired computational approach for inferring genome-scale rearrangement phylogeny and ancestral gene order has been developed. This has been applied to eight Drosophila genomes. Existing techniques are either limited to a few hundred markers or a small number of taxa. This analysis uses over 14,000 genomic loci and employs discrete elements consisting of pairs of homologous genetic elements. The results provide insight into evolutionary chromosomal dynamics and synteny analysis, and inform speciation studies.


Subject(s)
Drosophila/genetics , Gene Order , Genome , Phylogeny , Algorithms , Animals , Likelihood Functions , Species Specificity
16.
Genome Res ; 17(12): 1880-7, 2007 Dec.
Article in English | MEDLINE | ID: mdl-17989252

ABSTRACT

During evolution, genome reorganization includes large-scale events such as inversions, translocations, and segmental or even whole-genome duplications, as well as fine-scale events such as the relocation of individual genes. This latter category, which we will refer to as positionally relocated genes (PRGs), is the subject of this report. Assessment of the magnitude of such PRGs and of possible contributing mechanisms is aided by a comparative analysis of related genomes, where conserved chromosomal organization can aid in identifying genes that have acquired a new location in a lineage of these genomes. Here we utilize two methods to comprehensively identify relocated protein-coding genes in the recently sequenced genomes of 12 species of genus Drosophila. We use exceptions to the general rule of maintenance of chromosome arm (Muller element) association for most Drosophila genes to identify one major class of PRGs. We also identify a partially overlapping set of PRGs among "embedded genes," located within the extents of other surrounding genes. We provide evidence that PRG movements have at least two different origins: Some events occur via retrotransposition of processed RNAs and others via a DNA-based transposition mechanism. Overall, we identify several hundred PRGs that arose within a lineage of the genus Drosophila phylogeny and provide suggestive evidence that a few thousand such events have occurred within the radiation of the insect order Diptera, thereby illustrating the magnitude of the contribution of PRG movement to chromosomal reorganization during evolution.


Subject(s)
Evolution, Molecular , Gene Order/genetics , Genome, Insect , Animals , Chromosome Inversion , DNA Transposable Elements/genetics , Drosophila melanogaster/genetics , Phylogeny , Species Specificity , Synteny/genetics , Translocation, Genetic
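
The first identification method in the abstract above (exceptions to the general rule that a gene stays on the same Muller element across species) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name, the input layout, and the simple majority vote used to define the "expected" element are all assumptions made for illustration.

```python
from collections import Counter


def find_relocated_candidates(assignments):
    """Flag genes whose Muller element differs from the majority in some species.

    `assignments` maps a gene name to a dict of {species: Muller element label}.
    This layout is hypothetical; real ortholog tables will look different.
    Returns {gene: (majority_element, [species that deviate from it])}.
    """
    candidates = {}
    for gene, by_species in assignments.items():
        counts = Counter(by_species.values())
        # Majority vote; ties are broken by first-seen order (a simplification).
        majority_element, _ = counts.most_common(1)[0]
        deviating = sorted(
            species
            for species, element in by_species.items()
            if element != majority_element
        )
        if deviating:
            candidates[gene] = (majority_element, deviating)
    return candidates


# Toy data with made-up gene and species names.
example = {
    "geneA": {"sp1": "B", "sp2": "B", "sp3": "C"},
    "geneB": {"sp1": "A", "sp2": "A", "sp3": "A"},
}
relocated = find_relocated_candidates(example)
```

A majority vote ignores the phylogeny; assigning a relocation to a specific lineage, as the abstract describes, would additionally need the species tree.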
17.
Genome Inform ; 18: 12-21, 2007.
Article in English | MEDLINE | ID: mdl-18546469

ABSTRACT

Exploiting the ortholog/homolog information now available from the complete genomic sequences of twelve species of Drosophila, we have investigated the ability of regulatory site recognition methods to find regulatory changes for orthologs linked to chromosomal rearrangements. This has made use of the wealth of synteny information among these species. By comparing orthologs in multiple species, we found that the breakpoint of chromosomal rearrangements could have had an impact on regulatory changes of genes next to it with respect to the gene function and location. Extensions of our approach could be used to shed light on the role of gene regulation in the evolutionary adaptation to different environmental conditions.


Subject(s)
Biological Evolution , Drosophila/genetics , Gene Expression Regulation , Animals
18.
Genome Inform ; 18: 35-43, 2007.
Article in English | MEDLINE | ID: mdl-18546472

ABSTRACT

The draft genome of Trichomonas vaginalis was recently published, but not much is known about why it has such a large genome. In part this size is due to many gene family expansions. For example, we found over 100 members in the adenylyl cyclase family. About half are complete full-length genes, and nearly half are initially confirmed to be pseudogenes; the remainder are either incomplete or the apparent result of assembly or sequencing problems. The family can be divided into two subgroups by sequence similarity. These can then be divided into functional genes and pseudogenes. Among all four of these sets the cyclase domain is very well conserved. We gave three possible hypotheses for that observation: a) sequencing error or stop-codon read-through; b) recency of duplication and mutation; c) the likelihood of functional pseudogenes.


Subject(s)
Adenylyl Cyclases/genetics , Genes, Protozoan , Membrane Proteins/genetics , Trichomonas vaginalis/genetics , Animals , Codon, Nonsense , Frameshift Mutation , Gene Duplication , Pseudogenes , Trichomonas vaginalis/enzymology
19.
Archaea ; 2(1): 1-9, 2006 Aug.
Article in English | MEDLINE | ID: mdl-16877317

ABSTRACT

Among the 78 eukaryotic ribosomal proteins, eleven are specific to Eukarya, 33 are common only to Archaea and Eukarya and 34 are homologous (at least in part) to those of both Bacteria and Archaea. Several other translational proteins are common only to Eukarya and Archaea (e.g., IF2a, SRP19, etc.), whereas others are shared by the three phyla (e.g., EFTu/EF1A and SRP54). Although this and other analyses strongly support an archaeal origin for a substantial fraction of the eukaryotic translational machinery, especially the ribosomal proteins, there have been numerous unique and ubiquitous additions to the eukaryotic translational system besides the 11 unique eukaryotic ribosomal proteins. These include peptide additions to most of the 67 archaeal homolog proteins, rRNA insertions, the 5.8S RNA and the Alu extension to the SRP RNA. Our comparative analysis of these and other eukaryotic features among the three different cellular phylodomains supports the idea that an archaeal translational system was most likely incorporated by means of endosymbiosis into a host cell that was neither bacterial nor archaeal in any modern sense. Phylogenetic analyses provide support for the timing of this acquisition coinciding with an ancient bottleneck in prokaryotic diversity.


Subject(s)
Archaea/genetics , Eukaryotic Cells/metabolism , Evolution, Molecular , Protein Biosynthesis , Ribosomal Proteins/chemistry , Ribosomal Proteins/genetics
20.
PLoS Comput Biol ; 2(5): e49, 2006 May.
Article in English | MEDLINE | ID: mdl-16733547

ABSTRACT

We address the problem of comparing and characterizing the promoter regions of genes with similar expression patterns. This remains a challenging problem in sequence analysis, because often the promoter regions of co-expressed genes do not show discernible sequence conservation. In our approach, thus, we have not directly compared the nucleotide sequence of promoters. Instead, we have obtained predictions of transcription factor binding sites, annotated the predicted sites with the labels of the corresponding binding factors, and aligned the resulting sequences of labels--to which we refer here as transcription factor maps (TF-maps). To obtain the global pairwise alignment of two TF-maps, we have adapted an algorithm initially developed to align restriction enzyme maps. We have optimized the parameters of the algorithm in a small, but well-curated, collection of human-mouse orthologous gene pairs. Results in this dataset, as well as in an independent much larger dataset from the CISRED database, indicate that TF-map alignments are able to uncover conserved regulatory elements, which cannot be detected by the typical sequence alignments.


Subject(s)
Computational Biology/methods , Promoter Regions, Genetic , Transcription Factors/genetics , Transcription Factors/metabolism , Algorithms , Animals , Chromosome Mapping , Humans , Mice , Models, Statistical , Restriction Mapping/methods , Software , Species Specificity
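
The idea of aligning sequences of transcription factor labels rather than nucleotides can be sketched as follows. This is only an illustration under stated assumptions: it uses a plain Needleman-Wunsch-style global alignment over labels and ignores site positions, whereas the abstract says the authors adapted an algorithm for restriction enzyme maps. The function name, the scoring values, and the example labels are hypothetical.

```python
def align_label_sequences(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two lists of TF labels; return (score, matched index pairs).

    Scoring parameters are arbitrary illustrative values, not optimized ones.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right corner, recording identical-label pairs.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            if a[i - 1] == b[j - 1]:
                pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


# Toy label sequences for two orthologous promoters (labels chosen for illustration).
human_map = ["SP1", "TBP", "CREB"]
mouse_map = ["SP1", "GATA1", "TBP", "CREB"]
total, matched = align_label_sequences(human_map, mouse_map)
```

Because only labels are compared, two promoters with no discernible nucleotide conservation can still align well if the same factors are predicted to bind in the same order.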