Results 1 - 20 of 95
1.
Sci Rep ; 13(1): 6316, 2023 04 18.
Article in English | MEDLINE | ID: mdl-37072456

ABSTRACT

All current categorizations of the human population, such as ethnicity, ancestry and race, are based on various selections and combinations of complex and dynamic common characteristics that are mostly societal and cultural in nature, as perceived by members within or outside the categorized group. During the last decade, a massive amount of a new type of characteristic, exclusively genomic in nature, has become available, allowing us to analyze the inherited whole-genome demographics of extant humans, especially in fields such as human genetics, health sciences and medical practices (e.g., 1,2,3), where health-related characteristics can be related to whole-genome-based categorization. Here we show the feasibility of deriving such whole-genome-based categorization. We observe that, within the genomic data available at present, (a) the study populations form about 14 genomic groups, each consisting of multiple ethnic groups; and (b) at an individual level, approximately 99.8%, on average, of the whole autosomal-genome contents are identical between any two individuals regardless of their genomic or ethnic groups.


Subjects
Ethnicity , Genomics , Humans , Ethnicity/genetics , Genome, Human
3.
Proc Natl Acad Sci U S A ; 117(7): 3678-3686, 2020 02 18.
Article in English | MEDLINE | ID: mdl-32019884

ABSTRACT

An organism tree of life (organism ToL) is a conceptual and metaphorical tree to capture a simplified narrative of the evolutionary course and kinship among the extant organisms. Such a tree cannot be experimentally validated but may be reconstructed based on characteristics associated with the organisms. Since the whole-genome sequence of an organism is, at present, the most comprehensive descriptor of the organism, a whole-genome sequence-based ToL can be an empirically derivable surrogate for the organism ToL. However, experimentally determining the whole-genome sequences of many diverse organisms was practically impossible until recently. We have constructed three types of ToLs for diversely sampled organisms using the sequences of whole genome, of whole transcriptome, and of whole proteome. Of the three, the whole-proteome sequence-based ToL (whole-proteome ToL), constructed by applying the information theory-based feature frequency profile method, an "alignment-free" method, gave the most topologically stable ToL. Here, we describe the main features of a whole-proteome ToL for 4,023 species with known complete or almost complete genome sequences on grouping and kinship among the groups at deep evolutionary levels. The ToL reveals 1) all extant organisms of this study can be grouped into 2 "Supergroups," 6 "Major Groups," or 35+ "Groups"; 2) the order of emergence of the "founders" of all of the groups may be assigned on an evolutionary progression scale; 3) all of the founders of the groups have emerged in a "deep burst" at the very beginning period near the root of the ToL, an explosive birth of life's diversity.


Subjects
Bacteria/genetics , Eukaryota/genetics , Fungi/genetics , Genome , Phylogeny , Plants/genetics , Proteome/genetics , Animals , Bacteria/classification , Bacteria/isolation & purification , Biodiversity , Eukaryota/classification , Fungi/classification , Plants/classification , Proteome/metabolism
4.
Genome Biol ; 20(1): 144, 2019 07 25.
Article in English | MEDLINE | ID: mdl-31345254

ABSTRACT

BACKGROUND: Alignment-free (AF) sequence comparison is attracting persistent interest driven by data-intensive applications. Hence, many AF procedures have been proposed in recent years, but a lack of a clearly defined benchmarking consensus hampers their performance assessment. RESULTS: Here, we present a community resource (http://afproject.org) to establish standards for comparing alignment-free approaches across different areas of sequence-based research. We characterize 74 AF methods available in 24 software tools for five research applications, namely, protein sequence classification, gene tree inference, regulatory element detection, genome-based phylogenetic inference, and reconstruction of species trees under horizontal gene transfer and recombination events. CONCLUSION: The interactive web service allows researchers to explore the performance of alignment-free tools relevant to their data types and analytical goals. It also allows method developers to assess their own algorithms and compare them with current state-of-the-art tools, accelerating the development of new, more accurate AF solutions.


Subjects
Sequence Analysis , Benchmarking , Gene Transfer, Horizontal , Internet , Phylogeny , Regulatory Sequences, Nucleic Acid , Sequence Alignment , Sequence Analysis, Protein , Software
5.
Proc Natl Acad Sci U S A ; 115(6): 1322-1327, 2018 02 06.
Article in English | MEDLINE | ID: mdl-29358382

ABSTRACT

Prevention and early intervention are the most effective ways of avoiding or minimizing psychological, physical, and financial suffering from cancer. However, such proactive action requires the ability to predict the individual's susceptibility to cancer with a measure of probability. Of the triad of cancer-causing factors (inherited genomic susceptibility, environmental factors, and lifestyle factors), the inherited genomic component may be derivable from the recent public availability of a large body of whole-genome variation data. However, genome-wide association studies have so far shown limited success in predicting the inherited susceptibility to common cancers. We present here a multiple classification approach for predicting individuals' inherited genomic susceptibility to acquire the most likely phenotype among a panel of 20 major common cancer types plus 1 "healthy" type by application of a supervised machine-learning method under competing conditions among the cohorts of the 21 types. This approach suggests that, depending on the phenotypes of 5,919 individuals of "white" ethnic population in this study, (i) the portion of the cohort of a cancer type who acquired the observed type due to mostly inherited genomic susceptibility factors ranges from about 33 to 88% (or its corollary: the portion due to mostly environmental and lifestyle factors ranges from 12 to 67%), and (ii) on an individual level, the method also predicts individuals' inherited genomic susceptibility to acquire the other types ranked with associated probabilities. These probabilities may provide practical information for individuals, health professionals, and health policymakers related to prevention and/or early intervention of cancer.


Subjects
Genetic Predisposition to Disease , Machine Learning , Neoplasms/genetics , Polymorphism, Single Nucleotide , Genome, Human , Humans , Life Style , Probability
6.
Proc Natl Acad Sci U S A ; 114(35): 9391-9396, 2017 08 29.
Article in English | MEDLINE | ID: mdl-28808018

ABSTRACT

Fungi belong to one of the largest and most diverse kingdoms of living organisms. The evolutionary kinship within a fungal population has so far been inferred mostly from the gene-information-based trees ("gene trees"), constructed commonly based on the degree of differences of proteins or DNA sequences of a small number of highly conserved genes common among the population by a multiple sequence alignment (MSA) method. Since each gene evolves under different evolutionary pressure and time scale, it has been known that one gene tree for a population may differ from other gene trees for the same population depending on the subjective selection of the genes. Within the last decade, a large number of whole-genome sequences of fungi have become publicly available, which represent, at present, the most fundamental and complete information about each fungal organism. This presents an opportunity to infer kinship among fungi using a whole-genome information-based tree ("genome tree"). The method we used allows comparison of whole-genome information without MSA, and is a variation of a computational algorithm developed to find semantic similarities or plagiarism in two books, where we represent whole-genomic information of an organism as a book of words without spaces. The genome tree reveals several significant and notable differences from the gene trees, and these differences invoke new discussions about alternative narratives for the evolution of some of the currently accepted fungal groups.


Subjects
Fungi/genetics , Genome, Fungal , Phylogeny , DNA, Fungal , Fungal Proteins , Proteome
7.
Proc Natl Acad Sci U S A ; 111(5): 1921-6, 2014 Feb 04.
Article in English | MEDLINE | ID: mdl-24449885

ABSTRACT

An empirical approach is presented for predicting the genomic susceptibility of an individual to the most likely one among nine traits, consisting of eight major cancer classes plus a healthy trait. We use four prediction methods by applying two supervised learning algorithms to two different descriptors of common genomic variations (the profiles of genotypes of SNPs and SNP syntaxes with low P values or low frequencies) of each individual genome from normal cells. All four methods made correct predictions substantially better than random predictions for most cancer classes, but not for some others. A combination of the four results using Bayesian inference better predicted overall than any individual method. The multiclass accuracy of the combined prediction ranges from 33% to 56% depending on cancer classes of testing sets, compared with 11% for a random prediction among nine traits. Despite limited SNP data available and the absence of rare SNPs in public databases, at present, the results suggest that the framework of this approach or its improvement can predict cancer susceptibility with probability estimates useful for making health decisions for individuals or for a population.


Subjects
Genetic Predisposition to Disease , Models, Genetic , Neoplasms/classification , Neoplasms/genetics , Algorithms , Alleles , Confidence Intervals , Genome, Human/genetics , Humans , Polymorphism, Single Nucleotide/genetics
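
The abstract of this record combines four per-method predictions by Bayesian inference and compares the result with an 11% random baseline for nine traits. The abstract does not give the combination rule, so the sketch below is only one possible reading: it assumes the methods are conditionally independent (a naive-Bayes style product) and a uniform prior unless one is supplied. The function names `random_baseline` and `combine_methods` are hypothetical and do not come from the paper.

```python
from math import prod


def random_baseline(n_classes: int) -> float:
    """Expected accuracy of a uniform random guess among n_classes traits."""
    return 1.0 / n_classes


def combine_methods(method_probs, prior=None):
    """Combine per-method class probabilities into one posterior.

    ASSUMPTION (not from the abstract): methods are treated as conditionally
    independent, so posterior(c) is proportional to
    prior(c) * product over methods of p_m(c) / prior(c).
    Every prior entry must be strictly positive.
    """
    n_classes = len(method_probs[0])
    if prior is None:
        prior = [1.0 / n_classes] * n_classes  # uniform prior by default
    scores = [
        prior[c] * prod(probs[c] / prior[c] for probs in method_probs)
        for c in range(n_classes)
    ]
    total = sum(scores)
    if total == 0:
        raise ValueError("all classes have zero combined score")
    return [s / total for s in scores]
```

With nine traits, `random_baseline(9)` is 1/9, which rounds to the 11% quoted in the abstract. Under the independence assumption, two methods that each give a trait probability 0.6 in a two-trait toy case combine to a posterior above 0.6 for that trait, which is the sense in which agreement among methods sharpens the prediction.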
8.
Proc Natl Acad Sci U S A ; 111(5): 1778-83, 2014 Feb 04.
Article in English | MEDLINE | ID: mdl-24434556

ABSTRACT

The potential for pluripotent cells to differentiate into diverse specialized cell types has given much hope to the field of regenerative medicine. Nevertheless, the low efficiency of cell commitment has been a major bottleneck in this field. Here we provide a strategy to enhance the efficiency of early differentiation of pluripotent cells. We hypothesized that the initial phase of differentiation can be enhanced if the transcriptional activity of master regulators of stemness is suppressed, blocking the formation of functional transcriptomes. However, an obstacle is the lack of an efficient strategy to block protein-protein interactions. In this work, we take advantage of the biochemical property of seventeen kilodalton protein (Skp), a bacterial molecular chaperone that binds directly to sex determining region Y-box 2 (Sox2). The small angle X-ray scattering analyses provided a low resolution model of the complex and suggested that the transactivation domain of Sox2 is probably wrapped in a cleft on Skp trimer. Upon the transduction of Skp into pluripotent cells, the transcriptional activity of Sox2 was inhibited and the expression of Sox2 and octamer-binding transcription factor 4 was reduced, which resulted in the expression of early differentiation markers and appearance of early neuronal and cardiac progenitors. These results suggest that the initial stage of differentiation can be accelerated by inhibiting master transcription factors of stemness. This strategy can possibly be applied to increase the efficiency of stem cell differentiation into various cell types and also provides a clue to understanding the mechanism of early differentiation.


Subjects
Cell Differentiation , Pluripotent Stem Cells/cytology , Pluripotent Stem Cells/metabolism , Transcription Factors/metabolism , Animals , Escherichia coli Proteins/metabolism , Mice , Models, Biological , Models, Molecular , SOXB1 Transcription Factors/metabolism , Scattering, Small Angle , Solutions , Transduction, Genetic , X-Ray Diffraction , tat Gene Products, Human Immunodeficiency Virus/metabolism
9.
Proc Natl Acad Sci U S A ; 108(20): 8329-34, 2011 May 17.
Article in English | MEDLINE | ID: mdl-21536867

ABSTRACT

A whole-genome phylogeny of the Escherichia coli/Shigella group was constructed by using the feature frequency profile (FFP) method. This alignment-free approach uses the frequencies of l-mer features of whole genomes to infer phylogenic distances. We present two phylogenies that accentuate different aspects of E. coli/Shigella genomic evolution: (i) one based on the compositions of all possible features of length l = 24 (∼8.4 million features), which are likely to reveal the phenetic grouping and relationship among the organisms and (ii) the other based on the compositions of core features with low frequency and low variability (∼0.56 million features), which account for ∼69% of all commonly shared features among 38 taxa examined and are likely to have genome-wide lineal evolutionary signal. Shigella appears as a single clade when all possible features are used without filtering of noncore features. However, results using core features show that Shigella consists of at least two distantly related subclades, implying that the subclades evolved into a single clade because of a high degree of convergence influenced by mobile genetic elements and niche adaptation. In both FFP trees, the basal group of the E. coli/Shigella phylogeny is the B2 phylogroup, which contains primarily uropathogenic strains, suggesting that the E. coli/Shigella ancestor was likely a facultative or opportunistic pathogen. The extant commensal strains diverged relatively late and appear to be the result of reductive evolution of genomes. We also identify clade distinguishing features and their associated genomic regions within each phylogroup. Such features may provide useful information for understanding evolution of the groups and for quick diagnostic identification of each phylogroup.


Subjects
Escherichia coli/genetics , Genome, Bacterial , Models, Genetic , Phylogeny , Shigella/genetics , Biological Evolution
10.
Proc Natl Acad Sci U S A ; 108(1): 296-301, 2011 Jan 04.
Article in English | MEDLINE | ID: mdl-21173226

ABSTRACT

Despite the safety and feasibility of mesenchymal stem cell (MSC) therapy, an optimal cell type has not yet emerged in terms of electromechanical integration in infarcted myocardium. We found that poor to moderate survival benefits of MSC-implanted rats were caused by incomplete electromechanical integration induced by tissue heterogeneity between myocytes and engrafted MSCs in the infarcted myocardium. Here, we report the development of cardiogenic cells from rat MSCs activated by phorbol myristate acetate, a PKC activator, that exhibited high expression of cardiac-specific markers and Ca²⁺ homeostasis-related proteins and showed adrenergic receptor signaling by norepinephrine. Histological analysis showed high connexin 43 coupling, few inflammatory cells, and low fibrotic markers in myocardium implanted with these phorbol myristate acetate-activated MSCs. Infarcted hearts implanted with these cells exhibited restoration of conduction velocity through decreased tissue heterogeneity and improved myocardial contractility. These findings have major implications for the development of better cell types for electromechanical integration of cell-based treatment for infarcted myocardium.


Subjects
Cell- and Tissue-Based Therapy/methods , Mesenchymal Stem Cells/metabolism , Myocardial Contraction/physiology , Myocardial Infarction/therapy , Myocytes, Cardiac/physiology , Analysis of Variance , Animals , Blotting, Western , Connexin 43/metabolism , Cytokines/immunology , Electrocardiography , Enzyme-Linked Immunosorbent Assay , Fluorescent Antibody Technique , In Situ Nick-End Labeling , Isoproterenol/pharmacology , Male , Mesenchymal Stem Cell Transplantation/methods , Mesenchymal Stem Cells/cytology , Myocardial Contraction/drug effects , Myocytes, Cardiac/cytology , Norepinephrine/metabolism , Rats , Rats, Sprague-Dawley , Receptors, Adrenergic/metabolism , Reverse Transcriptase Polymerase Chain Reaction , Tetradecanoylphorbol Acetate/metabolism
11.
Proc Natl Acad Sci U S A ; 107(1): 133-8, 2010 Jan 05.
Article in English | MEDLINE | ID: mdl-20018669

ABSTRACT

We present a whole-proteome phylogeny of prokaryotes constructed by comparing feature frequency profiles (FFPs) of whole proteomes. Features are l-mers of amino acids, and each organism is represented by a profile of frequencies of all features. The selection of feature length is critical in the FFP method, and we have developed a procedure for identifying the optimal feature lengths for inferring the phylogeny of prokaryotes, strictly speaking, a proteome phylogeny. Our FFP trees are constructed with whole proteomes of 884 prokaryotes, 16 unicellular eukaryotes, and 2 random sequences. To highlight the branching order of major groups, we present a simplified proteome FFP tree of monophyletic class or phylum with branch support. In our whole-proteome FFP trees (i) Archaea, Bacteria, Eukaryota, and a random sequence outgroup are clearly separated; (ii) Archaea and Bacteria form a sister group when rooted with random sequences; (iii) Planctomycetes, which possesses an intracellular membrane compartment, is placed at the basal position of the Bacteria domain; (iv) almost all groups are monophyletic in prokaryotes at most taxonomic levels, but many differences in the branching order of major groups are observed between our proteome FFP tree and trees built with other methods; and (v) previously "unclassified" genomes may be assigned to the most likely taxa. We describe notable similarities and differences between our FFP trees and those based on other methods in grouping and phylogeny of prokaryotes.


Subjects
Phylogeny , Prokaryotic Cells , Proteome/genetics , Proteomics/methods , Sequence Analysis, Protein/methods , Genome , Prokaryotic Cells/classification , Prokaryotic Cells/physiology , Sequence Alignment/methods
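
The abstract of this record stresses that the choice of feature length l is critical when features are l-mers of amino acids. The sketch below is not the authors' selection procedure, which the abstract does not describe. It only illustrates the trade-off under simple assumptions: for short l, nearly every possible l-mer is observed, and for long l, nearly every observed l-mer occurs once. The function name `feature_stats` and the returned keys are hypothetical.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def feature_stats(proteome: str, l: int) -> dict:
    """Summarize the l-mer content of a (concatenated) proteome string.

    Returns the number of possible l-mers (20**l), the number of distinct
    l-mers actually observed, and the fraction of observed l-mers that
    occur exactly once.
    """
    counts = Counter(proteome[i:i + l] for i in range(len(proteome) - l + 1))
    distinct = len(counts)
    singletons = sum(1 for n in counts.values() if n == 1)
    return {
        "l": l,
        "possible": len(AMINO_ACIDS) ** l,
        "distinct": distinct,
        "singleton_fraction": singletons / distinct if distinct else 0.0,
    }
```

Scanning `feature_stats(proteome, l)` over a range of l shows where profiles stop being saturated and start being dominated by unique features; any usable feature length has to lie between those two extremes.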
12.
Proc Natl Acad Sci U S A ; 106(40): 17077-82, 2009 Oct 06.
Article in English | MEDLINE | ID: mdl-19805074

ABSTRACT

Ten complete mammalian genome sequences were compared by using the "feature frequency profile" (FFP) method of alignment-free comparison. This comparison technique reveals that the whole nongenic portion of mammalian genomes contains evolutionary information that is similar to their genic counterparts, the intron and exon regions. We partitioned the complete genomes of mammals (such as human, chimp, horse, and mouse) into their constituent nongenic, intronic, and exonic components. Phylogenic species trees were constructed for each individual component class of genome sequence data as well as the whole genomes by using standard tree-building algorithms with FFP distances. The phylogenies of the whole genomes and each of the component classes (exonic, intronic, and nongenic regions) have similar topologies, within the optimal feature length range, and all agree well with the evolutionary phylogeny based on a recent large dataset, multispecies, and multigene-based alignment. In the strictest sense, the FFP-based trees are genome phylogenies, not species phylogenies. However, the species phylogeny is highly related to the whole-genome phylogeny. Furthermore, our results reveal that the footprints of evolutionary history are spread throughout the entire length of the whole genome of an organism and are not limited to genes, introns, or short, highly conserved, nongenic sequences that can be adversely affected by factors (such as a choice of sequences, homoplasy, and different mutation rates) resulting in inconsistent species phylogenies.


Subjects
Evolution, Molecular , Genome/genetics , Phylogeny , Animals , Computational Biology/methods , Exons , Genomics/methods , Humans , Introns , Mammals/classification , Mammals/genetics
13.
Protein Sci ; 18(7): 1370-6, 2009 Jul.
Article in English | MEDLINE | ID: mdl-19551896

ABSTRACT

We have analyzed the interstitial water (ISW) structures in 1500 protein crystal structures deposited in the Protein Data Bank that have greater than 1.5 Å resolution with less than 90% sequence similarity with each other. We observed varieties of polygonal water structures composed of three to eight water molecules. These polygons may represent the time- and space-averaged structures of "stable" water oligomers present in liquid water, and their presence as well as relative population may be relevant in understanding physical properties of liquid water at a given temperature. On average, 13% of ISWs are localized enough to be visible by X-ray diffraction. Of those, an average of 78% are water molecules in the first water layer on the protein surface. Of the localized ISWs beyond the first layer, almost half form water polygons such as trigons and tetragons, as well as the expected pentagons, hexagons, higher polygons, partial dodecahedrons, and disordered networks. Most of the octagons and nonagons are formed by fusion of smaller polygons. The trigons are most commonly observed. We suggest that our observation provides an experimental basis for including these water polygon structures in correlating and predicting various water properties in the liquid state.


Subjects
Proteins/chemistry , Water/chemistry , X-Ray Diffraction/methods , Databases, Protein , Electrons , Hydrogen Bonding , Models, Statistical , Molecular Conformation , Software
14.
Proc Natl Acad Sci U S A ; 106(31): 12826-31, 2009 Aug 04.
Article in English | MEDLINE | ID: mdl-19553209

ABSTRACT

The vast sequence divergence among different virus groups has presented a great challenge to alignment-based sequence comparison among different virus families. Using an alignment-free comparison method, we construct the whole-proteome phylogeny for a population of viruses from 11 viral families comprising 142 large dsDNA eukaryote viruses. The method is based on the feature frequency profiles (FFP), where the length of the feature (l-mer) is selected to be optimal for phylogenomic inference. We observe that (i) the FFP phylogeny segregates the population into clades, the membership of each has remarkable agreement with current classification by the International Committee on the Taxonomy of Viruses, with one exception that the mimivirus joins the phycodnavirus family; (ii) the FFP tree detects potential evolutionary relationships among some viral families; (iii) the relative position of the 3 herpesvirus subfamilies in the FFP tree differs from gene alignment-based analysis; (iv) the FFP tree suggests the taxonomic positions of certain "unclassified" viruses; and (v) the FFP method identifies candidates for horizontal gene transfer between virus families.


Subjects
DNA Viruses/classification , Phylogeny , Proteome , Baculoviridae/classification , DNA Viruses/genetics , Gene Transfer, Horizontal , Herpesviridae/classification , Phycodnaviridae/classification , Poxviridae/classification , Sequence Alignment
15.
Protein Expr Purif ; 67(2): 164-8, 2009 Oct.
Article in English | MEDLINE | ID: mdl-19427902

ABSTRACT

Nanog and Sox2 are key transcriptional factors involved in self-renewal and pluripotency of stem cells in human and other mammals. Nanog and Sox2 contain a homeodomain (HD) and a high-mobility group (HMG) DNA-binding domain, respectively, for targeting them to their regulatory regions, and other regions with transactivation function that provide sites for recruiting other transcriptional regulators. To gain insights into the biochemical and biophysical characteristics of the other regions of Nanog and Sox2, we have tried to overproduce and purify full-length wild-type human Nanog and Sox2 expressed in Escherichia coli. Interestingly, we found that Nanog and Sox2 were individually stabilized by tight interaction with Skp, an E. coli periplasmic chaperone, thereby enabling stable over-expression and purification of Nanog and Sox2, each in complex with Skp. Purified Skp complexes of Nanog and Sox2 maintained DNA-binding activity toward their cognate DNA sequences. A similar approach may be applicable for some other mammalian proteins that are unstable or difficult to over-express in E. coli.


Subjects
DNA-Binding Proteins/isolation & purification , Escherichia coli Proteins/isolation & purification , Homeodomain Proteins/isolation & purification , Molecular Chaperones/isolation & purification , Multiprotein Complexes/isolation & purification , SOXB1 Transcription Factors/isolation & purification , Base Sequence , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Electrophoretic Mobility Shift Assay , Escherichia coli/genetics , Escherichia coli/metabolism , Escherichia coli Proteins/genetics , Escherichia coli Proteins/metabolism , Homeodomain Proteins/genetics , Homeodomain Proteins/metabolism , Humans , Molecular Chaperones/genetics , Molecular Chaperones/metabolism , Molecular Sequence Data , Multiprotein Complexes/metabolism , Nanog Homeobox Protein , Protein Binding , Protein Stability , Recombinant Proteins/genetics , Recombinant Proteins/isolation & purification , Recombinant Proteins/metabolism , SOXB1 Transcription Factors/genetics , SOXB1 Transcription Factors/metabolism , Solubility
16.
J Struct Biol ; 167(2): 159-65, 2009 Aug.
Article in English | MEDLINE | ID: mdl-19450691

ABSTRACT

The Brn-5 protein, highly expressed in human brain, belongs to the POU family, a class of transcription factors involved in a wide variety of biological processes ranging from programming of embryonic stem cells to cellular housekeeping. This functional diversity is conferred by two DNA-binding subdomains that can assume several configurations due to a bipartite arrangement of POU-specific (POU(S)) and POU-homeo (POU(H)) subdomains separated by a linker region. The crystal structure of human Brn-5 transcription factor in complex with the corticotrophin-releasing hormone (CRH) gene promoter reveals an unexpected recognition mode of the protein to its cognate DNA. Moreover, the structure also shows the role of the linker in allowing diverse configurations that can be assumed by the two subdomains.


Subjects
Corticotropin-Releasing Hormone/genetics , POU Domain Factors/chemistry , Promoter Regions, Genetic , Crystallography, X-Ray , Humans , Protein Binding , Protein Conformation , Transcription Factors
17.
Acta Crystallogr D Biol Crystallogr ; 65(Pt 4): 399-402, 2009 Apr.
Article in English | MEDLINE | ID: mdl-19307724

ABSTRACT

The Protein Data Bank file format is the format most widely used by protein crystallographers and biologists to disseminate and manipulate protein structures. Despite this, there are few user-friendly software packages available to efficiently edit and extract raw information from PDB files. This limitation often leads to many protein crystallographers wasting significant time manually editing PDB files. PDB Editor, written in Java Swing GUI, allows the user to selectively search, select, extract and edit information in parallel. Furthermore, the program is a stand-alone application written in Java which frees users from the hassles associated with platform/operating system-dependent installation and usage. PDB Editor can be downloaded from http://sourceforge.net/projects/pdbeditorjl/.


Subjects
Computer Graphics , Databases, Protein , Programming Languages , User-Computer Interface , Crystallography, X-Ray , Internet , Protein Conformation
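
The abstract of this record is about extracting and editing raw information from PDB files. PDB Editor itself is a Java Swing application, and the sketch below is not its code. It is a minimal Python illustration of what "extracting raw information" involves, assuming the standard fixed-column layout of ATOM and HETATM records in the PDB file format. The names `Atom`, `parse_atom_line`, and `extract_atoms` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Atom:
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_atom_line(line: str) -> Atom:
    """Parse one ATOM/HETATM record using fixed column positions.

    Slices are 0-based; PDB columns are 1-based, e.g. columns 7-11
    (atom serial number) become line[6:11].
    """
    return Atom(
        serial=int(line[6:11]),        # columns 7-11
        name=line[12:16].strip(),      # columns 13-16
        res_name=line[17:20].strip(),  # columns 18-20
        chain_id=line[21],             # column 22
        res_seq=int(line[22:26]),      # columns 23-26
        x=float(line[30:38]),          # columns 31-38
        y=float(line[38:46]),          # columns 39-46
        z=float(line[46:54]),          # columns 47-54
    )


def extract_atoms(pdb_text: str, chain_id=None) -> list:
    """Return all atoms in the text, optionally restricted to one chain."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM  ", "HETATM")):
            atom = parse_atom_line(line)
            if chain_id is None or atom.chain_id == chain_id:
                atoms.append(atom)
    return atoms
```

Because the format is column-based rather than delimiter-based, splitting on whitespace is unreliable (adjacent fields can touch), which is one reason manual editing of these files is error-prone.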
18.
Proc Natl Acad Sci U S A ; 106(8): 2677-82, 2009 Feb 24.
Article in English | MEDLINE | ID: mdl-19188606

ABSTRACT

For comparison of whole-genome (genic + nongenic) sequences, multiple sequence alignment of a few selected genes is not appropriate. One approach is to use an alignment-free method in which feature (or l-mer) frequency profiles (FFP) of whole genomes are used for comparison, a variation of a text or book comparison method that uses word frequency profiles. In this approach it is critical to identify the optimal resolution range of l-mers for the given set of genomes compared. The optimum FFP method is applicable for comparing whole genomes or large genomic regions even when there are no common genes with high homology. We outline the method in 3 stages: (i) We first show how the optimal resolution range can be determined with English books which have been transformed into long character strings by removing all punctuation and spaces. (ii) Next, we test the robustness of the optimized FFP method at the nucleotide level, using a mutation model with a wide range of base substitutions and rearrangements. (iii) Finally, to illustrate the utility of the method, phylogenies are reconstructed from concatenated mammalian intronic genomes; the FFP-derived intronic genome topologies for each l within the optimal range are all very similar. The topology agrees with the established mammalian phylogeny, revealing that intron regions contain a similar level of phylogenic signal as do coding regions.


Subjects
Genome , Introns , Phylogeny
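
Stage (i) of the abstract in this record, in which books are reduced to long character strings and compared through l-mer frequency profiles, can be sketched as follows. This is a minimal illustration, not the authors' implementation. The abstract does not name the distance measure, so the use of the Jensen-Shannon divergence here is an assumption, and the function names `to_feature_string`, `ffp`, and `jensen_shannon` are hypothetical.

```python
import math
import re
from collections import Counter


def to_feature_string(text: str) -> str:
    """Lowercase the text and keep only the letters a-z (no spaces or punctuation)."""
    return re.sub(r"[^a-z]", "", text.lower())


def ffp(sequence: str, l: int) -> dict:
    """Feature frequency profile: relative frequency of every overlapping l-mer."""
    counts = Counter(sequence[i:i + l] for i in range(len(sequence) - l + 1))
    total = sum(counts.values())
    return {feature: n / total for feature, n in counts.items()}


def jensen_shannon(p: dict, q: dict) -> float:
    """Jensen-Shannon divergence in bits between two profiles.

    ASSUMPTION: chosen here as the profile distance; 0 means identical
    profiles and 1 means profiles that share no features.
    """
    js = 0.0
    for feature in p.keys() | q.keys():
        pi = p.get(feature, 0.0)
        qi = q.get(feature, 0.0)
        mi = 0.5 * (pi + qi)
        if pi > 0:
            js += 0.5 * pi * math.log2(pi / mi)
        if qi > 0:
            js += 0.5 * qi * math.log2(qi / mi)
    return js
```

The same two functions apply unchanged to nucleotide or amino-acid strings. Computing `jensen_shannon(ffp(a, l), ffp(b, l))` for every pair in a set of sequences gives a distance matrix that a standard tree-building algorithm could take as input.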
19.
Protein Expr Purif ; 63(1): 58-61, 2009 Jan.
Article in English | MEDLINE | ID: mdl-18824233

ABSTRACT

Immobilized metal ion affinity chromatography (IMAC) has become one of the most popular protein purification methods for recombinant proteins with a hexa-histidine tag (His-tag) placed at the C- or N-terminus of proteins. Nevertheless, there are always difficult proteins that show weak binding to the metal chelating resin and thus low purity. These difficulties are often overcome by increasing the His-tag to 8 or 10 histidines. Despite the success of longer tags, there are only a few expression vectors available to easily clone and test different His-tag lengths. Therefore, we have modified the Escherichia coli T7 expression vector pET21a to accommodate ligation-independent cloning (LIC), which allows easy and efficient parallel cloning of target genes with different His-tag lengths using a single insert. Unlike most LIC vectors available commercially, our vectors do not translate unwanted extra sequences, because the N-terminal linker is engineered to anneal before the open reading frame and the C-terminal linker to anneal as a His-tag.


Subjects
Bacteriophage T7/genetics , Cloning, Molecular , Genetic Vectors , Histidine , Recombinant Fusion Proteins/metabolism , Base Sequence , Chromatography, Affinity , Escherichia coli/genetics , Escherichia coli/metabolism , Molecular Sequence Data , Oligopeptides/genetics , Protein Biosynthesis , Recombinant Fusion Proteins/genetics , Solubility
20.
J Mol Biol ; 384(5): 1287-300, 2008 Dec 31.
Article in English | MEDLINE | ID: mdl-18952098

ABSTRACT

Many environmentally important photo- and chemolithoautotrophic bacteria accumulate globules of polymeric, water-insoluble sulfur as a transient product during oxidation of reduced sulfur compounds. Oxidation of this sulfur requires the concerted action of Dsr proteins. However, individual functions and interplay of these proteins are largely unclear. We proved with a ΔdsrE mutant experiment that the cytoplasmic α2β2γ2-structured protein DsrEFH is absolutely essential for the oxidation of sulfur stored in the intracellular sulfur globules of the purple sulfur bacterial model organism Allochromatium vinosum. The ability to degrade stored sulfur was fully regained upon complementation with dsrEFH in trans. The crystal structure of DsrEFH was determined at 2.5 Å resolution to assist functional assignment in detail. In conjunction with phylogenetic analyses, two different types of putative active sites were identified in DsrE and DsrH and shown to be characteristic for sulfur-oxidizing bacteria. Conserved Cys78 of A. vinosum DsrE corresponds to the active cysteines of Escherichia coli YchN and TusD. TusBCD and the protein TusE are parts of a sulfur relay system involved in thiouridine biosynthesis. DsrEFH interacts with DsrC, a TusE homologue encoded in the same operon. The conserved penultimate cysteine residue in the carboxy-terminus of DsrC is essential for the interaction. Here, we show that Cys78 of DsrE is strictly required for interaction with DsrC while Cys20 in the putative active site of DsrH is dispensable for that reaction. In summary, our findings point to the occurrence of sulfur transfer reactions during sulfur oxidation via the Dsr proteins.


Subjects
Bacterial Proteins/chemistry , Bacterial Proteins/genetics , Chromatiaceae/genetics , Sulfur/metabolism , Amino Acid Sequence , Bayes Theorem , Catalytic Domain , Crystallography, X-Ray , DNA Mutational Analysis , Models, Molecular , Molecular Sequence Data , Oxidation-Reduction , Protein Multimerization , Protein Structure, Secondary , Sequence Homology, Amino Acid , Static Electricity , Sulfates/metabolism