Results 1 - 20 of 25
1.
Science ; 358(6366): 1081-1084, 2017 Nov 24.
Article in English | MEDLINE | ID: mdl-29170238

ABSTRACT

Precise knowledge of the fundamental properties of the proton is essential for our understanding of atomic structure as well as for precise tests of fundamental symmetries. We report on a direct high-precision measurement of the magnetic moment µp of the proton in units of the nuclear magneton µN. The result, µp = 2.79284734462 (±0.00000000082) µN, has a fractional precision of 0.3 parts per billion, improves the previous best measurement by a factor of 11, and is consistent with the currently accepted value. This was achieved with the use of an optimized double-Penning trap technique. Provided a similar measurement of the antiproton magnetic moment can be performed, this result will enable a test of the fundamental symmetry between matter and antimatter in the baryonic sector at the 10^-10 level.
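
The quoted fractional precision can be checked by dividing the stated uncertainty by the central value. The following is a minimal arithmetic sketch that uses only the two numbers given in the abstract; the variable names are illustrative.

```python
# Values quoted in the abstract, in units of the nuclear magneton.
mu_p = 2.79284734462
sigma = 0.00000000082

# Fractional (relative) precision, also expressed in parts per billion (ppb).
fractional = sigma / mu_p
ppb = fractional * 1e9

print(f"fractional precision: {fractional:.2e} ({ppb:.2f} ppb)")
```

The ratio comes out slightly below 3 x 10^-10, which rounds to the 0.3 parts per billion stated in the abstract.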

2.
J Virol ; 88(1): 10-20, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24155369

ABSTRACT

The genome sequences of new viruses often contain many "orphan" or "taxon-specific" proteins apparently lacking homologs. However, because viral proteins evolve very fast, commonly used sequence similarity detection methods such as BLAST may overlook homologs. We analyzed a data set of proteins from RNA viruses characterized as "genus specific" by BLAST. More powerful methods developed recently, such as HHblits or HHpred (available through web-based, user-friendly interfaces), could detect distant homologs of a quarter of these proteins, suggesting that these methods should be used to annotate viral genomes. In-depth manual analyses of a subset of the remaining sequences, guided by contextual information such as taxonomy, gene order, or domain cooccurrence, identified distant homologs of another third. Thus, a combination of powerful automated methods and manual analyses can uncover distant homologs of many proteins thought to be orphans. We expect these methodological results to be also applicable to cellular organisms, since they generally evolve much more slowly than RNA viruses. As an application, we reanalyzed the genome of a bee pathogen, Chronic bee paralysis virus (CBPV). We could identify homologs of most of its proteins thought to be orphans; in each case, identifying homologs provided functional clues. We discovered that CBPV encodes a domain homologous to the Alphavirus methyltransferase-guanylyltransferase; a putative membrane protein, SP24, with homologs in unrelated insect viruses and insect-transmitted plant viruses having different morphologies (cileviruses, higreviruses, blunerviruses, negeviruses); and a putative virion glycoprotein, ORF2, also found in negeviruses. SP24 and ORF2 are probably major structural components of the virions.


Subjects
Viral Proteins/genetics; Amino Acid Sequence; Molecular Sequence Data; Open Reading Frames; Sequence Homology, Amino Acid; Viral Proteins/chemistry
3.
J Allergy Clin Immunol ; 132(5): 1121-9, 2013 Nov.
Article in English | MEDLINE | ID: mdl-24084074

ABSTRACT

BACKGROUND: Atopic dermatitis (AD) is a major inflammatory condition of the skin caused by inherited skin barrier deficiency, with mutations in the filaggrin gene predisposing to development of AD. Support for barrier deficiency initiating AD came from flaky tail mice, which have a frameshift mutation in Flg and also carry an unknown gene, matted, causing a matted hair phenotype. OBJECTIVE: We sought to identify the matted mutant gene in mice and further define whether mutations in the human gene were associated with AD. METHODS: A mouse genetics approach was used to separate the matted and Flg mutations to produce congenic single-mutant strains for genetic and immunologic analysis. Next-generation sequencing was used to identify the matted gene. Five independently recruited AD case collections were analyzed to define associations between single nucleotide polymorphisms (SNPs) in the human gene and AD. RESULTS: The matted phenotype in flaky tail mice is due to a mutation in the Tmem79/Matt gene, with no expression of the encoded protein mattrin in the skin of mutant mice. Matt(ft) mice spontaneously have dermatitis and atopy caused by a defective skin barrier, with mutant mice having systemic sensitization after cutaneous challenge with house dust mite allergens. Meta-analysis of 4,245 AD cases and 10,558 population-matched control subjects showed that a missense SNP, rs6684514, [corrected] in the human MATT gene has a small but significant association with AD. CONCLUSION: In mice mutations in Matt cause a defective skin barrier and spontaneous dermatitis and atopy. A common SNP in MATT has an association with AD in human subjects.


Subjects
Dermatitis, Atopic/genetics; Genetic Predisposition to Disease; Membrane Proteins/genetics; Animals; Dermatitis, Atopic/immunology; Dermatitis, Atopic/pathology; Filaggrin Proteins; Gene Expression; Humans; Male; Mice; Mutation; Phenotype; Physical Chromosome Mapping; Polymorphism, Single Nucleotide; Skin/metabolism; Skin/pathology
4.
Am J Hum Genet ; 92(5): 820-6, 2013 May 02.
Article in English | MEDLINE | ID: mdl-23643385

ABSTRACT

Myopia, or near-sightedness, is an ocular refractive error of unfocused image quality in front of the retinal plane. Individuals with high-grade myopia (dioptric power greater than -6.00) are predisposed to ocular morbidities such as glaucoma, retinal detachment, and myopic maculopathy. Nonsyndromic, high-grade myopia is highly heritable, and to date multiple gene loci have been reported. We performed exome sequencing in 4 individuals from an 11-member family of European descent from the United States. Affected individuals had a mean dioptric spherical equivalent of -22.00 sphere. A premature stop codon mutation c.157C>T (p.Gln53*) cosegregating with disease was discovered within SCO2 that maps to chromosome 22q13.33. Subsequent analyses identified three additional mutations in three highly myopic unrelated individuals (c.341G>A, c.418G>A, and c.776C>T). To determine differential gene expression in a developmental mouse model, we induced myopia by applying a -15.00D lens over one eye. Messenger RNA levels of SCO2 were significantly downregulated in myopic mouse retinae. Immunohistochemistry in mouse eyes confirmed SCO2 protein localization in retina, retinal pigment epithelium, and sclera. SCO2 encodes a copper homeostasis protein influential in mitochondrial cytochrome c oxidase activity. Copper deficiencies have been linked with photoreceptor loss and myopia with increased scleral wall elasticity. Retinal thinning has been reported with an SCO2 variant. Human mutation identification with support from an induced myopic animal provides biological insights of myopic development.


Subjects
Carrier Proteins/genetics; Chromosomes, Human, Pair 22/genetics; Gene Expression Regulation/genetics; Genetic Predisposition to Disease/genetics; Mitochondrial Proteins/genetics; Myopia/genetics; Animals; Base Sequence; Codon, Nonsense/genetics; Copper/metabolism; Exome/genetics; Genes, Dominant/genetics; Humans; Immunohistochemistry; Mice; Molecular Chaperones; Molecular Sequence Data; Myopia/pathology; Point Mutation/genetics; RNA, Messenger/genetics; RNA, Messenger/metabolism; Sequence Analysis, DNA; United States; White People/genetics
5.
J Bioinform Comput Biol ; 10(6): 1250020, 2012 Dec.
Article in English | MEDLINE | ID: mdl-22867629

ABSTRACT

Reference proteomes are generated by increasingly sophisticated annotation pipelines as part of regular genome build releases; yet, the corresponding changes in reference proteomes' content are dramatic. In the history of the NCBI-curated human proteome, the total number of entries has remained roughly constant but approximately half of the proteins from the 2003 build 33 are no longer represented by entries in current releases, while about the same number of new proteins have been added (for sequence identity thresholds 50-90%). Although mostly hypothetical proteins are affected, there are also spectacular cases of entry removal/addition of well studied proteins. The changes between the 2003 and recent human proteomes are in a similar order of magnitude as the differences between recent human and chimpanzee proteome releases. As an application example, we show that the proteome fluctuations affect the interpretation (about 74% of hits) of organelle-specific mass-spectrometry data. Although proteome quality tends to improve with more recent releases as, for example, the fraction of proteins with functional annotation has increased over time, existing evidence implies that, apparently, the proteome content still remains incomplete, not just pertaining to isoforms/sequence variants but also to proteins and their families that are clearly distinct.


Subjects
Proteome/analysis; Proteomics/methods; Animals; Humans; Mass Spectrometry
6.
Nucleic Acids Res ; 40(Web Server issue): W452-7, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22689647

ABSTRACT

The Sorting Intolerant from Tolerant (SIFT) algorithm predicts the effect of coding variants on protein function. It was first introduced in 2001, with a corresponding website that provides users with predictions on their variants. Since its release, SIFT has become one of the standard tools for characterizing missense variation. We have updated SIFT's genome-wide prediction tool since our last publication in 2009, and added new features to the insertion/deletion (indel) tool. We also show accuracy metrics on independent data sets. The original developers have hosted the SIFT web server at FHCRC, JCVI and the web server is currently located at BII. The URL is http://sift-dna.org (24 May 2012, date last accessed).


Subjects
Amino Acid Substitution; Proteins/chemistry; Software; Algorithms; Genetic Variation; Humans; INDEL Mutation; Internet; Proteins/genetics; Proteins/metabolism; Sequence Analysis, Protein
7.
Nucleic Acids Res ; 40(Web Server issue): W370-5, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22564899

ABSTRACT

Transmembrane helical segments (TMs) can be classified into two groups of so-called 'simple' and 'complex' TMs. Whereas the first group represents mere hydrophobic anchors with an overrepresentation of aliphatic hydrophobic residues that are likely attributed to convergent evolution in many cases, the complex ones embody ancestral information and tend to have structural and functional roles beyond just membrane immersion. Hence, the sequence homology concept is not applicable to simple TMs. In practice, these simple TMs can attract statistically significant but evolutionarily unrelated hits during similarity searches (whether through BLAST- or HMM-based approaches). This is especially problematic for membrane proteins that contain both globular segments and TMs. As such, we have developed the transmembrane helix: simple or complex (TMSOC) webserver for the identification of simple and complex TMs. By masking simple TM segments in seed sequences prior to sequence similarity searches, the false-discovery rate decreases without sacrificing sensitivity. Therefore, TMSOC is a novel and necessary sequence analytic tool for both the experimentalists and the computational biology community working on membrane proteins. It is freely accessible at http://tmsoc.bii.a-star.edu.sg or available for download.
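
The masking step described in this abstract can be illustrated with a short sketch. It assumes the simple TM segments are already known as 1-based inclusive coordinates (the classification itself, which TMSOC performs, is not reproduced here). The function name `mask_segments`, the toy sequence, the segment coordinates, and the choice of `X` as the masking character are all illustrative assumptions, not part of TMSOC.

```python
def mask_segments(sequence, segments, mask_char="X"):
    """Replace residues inside each 1-based inclusive (start, end) segment with mask_char."""
    residues = list(sequence)
    for start, end in segments:
        if not (1 <= start <= end <= len(sequence)):
            raise ValueError(f"segment {(start, end)} lies outside the sequence")
        for i in range(start - 1, end):
            residues[i] = mask_char
    return "".join(residues)


# Hypothetical seed: a globular stretch, a 20-residue hydrophobic anchor, another globular stretch.
seed = "MKTAYIAKQR" + "LLLLLVVVVVAAAAAIIIII" + "GGSDEKTW"
simple_tm_segments = [(11, 30)]  # assumed output of a prior simple/complex classification

masked = mask_segments(seed, simple_tm_segments)
print(masked)
```

The masked sequence, rather than the original seed, would then be submitted to the similarity search, so that the hydrophobic anchor cannot by itself attract hits.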


Subjects
Membrane Proteins/chemistry; Software; Algorithms; Hydrophobic and Hydrophilic Interactions; Internet; Protein Structure, Secondary; Receptor, Cholecystokinin A/chemistry; Rhodopsin/chemistry; Sequence Analysis, Protein; User-Computer Interface
8.
Curr Genet ; 58(3): 165-77, 2012 Jun.
Article in English | MEDLINE | ID: mdl-22481122

ABSTRACT

A genome-wide inventory of proteins involved in cell wall synthesis and remodeling has been obtained by taking advantage of the recently released genome sequence of the ectomycorrhizal Tuber melanosporum black truffle. Genes that encode cell wall biosynthetic enzymes, enzymes involved in cell wall polysaccharide synthesis or modification, GPI-anchored proteins and other cell wall proteins were identified in the black truffle genome. As a second step, array data were validated and the symbiotic stage was chosen as the main focus. Quantitative RT-PCR experiments were performed on 29 selected genes to verify their expression during ectomycorrhizal formation. The results confirmed the array data, and this suggests that cell wall-related genes are required for morphogenetic transition from mycelium growth to the ectomycorrhizal branched hyphae. Labeling experiments were also performed on T. melanosporum mycelium and ectomycorrhizae to localize cell wall components.


Subjects
Ascomycota/genetics; Cell Wall/genetics; Genome, Fungal; Ascomycota/classification; Ascomycota/metabolism; Ascomycota/ultrastructure; Cell Wall/metabolism; Cell Wall/ultrastructure; Fungal Proteins/genetics; Fungal Proteins/metabolism; Gene Expression Profiling; Gene Expression Regulation, Fungal; Genomics; Glucans/metabolism; Phylogeny; Reproducibility of Results
9.
Bioinformatics ; 28(12): 1645-6, 2012 Jun 15.
Article in English | MEDLINE | ID: mdl-22531216

ABSTRACT

Current sequence search tools become increasingly slow as databases of protein sequences continue to grow exponentially. Tachyon, a new algorithm that identifies closely related protein sequences ~200 times faster than standard BLAST, circumvents this limitation with a reduced database and oligopeptide matching heuristic. AVAILABILITY AND IMPLEMENTATION: The tool is publicly accessible as a webserver at http://tachyon.bii.a-star.edu.sg and can also be accessed programmatically through SOAP.


Subjects
Algorithms; Databases, Protein; Search Engine; Amino Acid Sequence; Humans; Internet
10.
Methods Mol Biol ; 609: 129-44, 2010.
Article in English | MEDLINE | ID: mdl-20221917

ABSTRACT

From the database point of view, biomolecular pathways are sets of proteins and other biomacromolecules that represent spatio-temporally organized cascades of interactions with the involvement of low-molecular compounds and are responsible for achieving specific phenotypic biological outcomes. A pathway is usually associated with certain subcellular compartments. In this chapter, we analyze the major public biomolecular pathway databases. Special attention is paid to database scope, completeness, issues of annotation reliability, and pathway classification. In addition, systems for information retrieval, tools for mapping user-defined gene sets onto the information in pathway databases, and their typical research applications are reviewed. Whereas today, pathway databases contain almost exclusively qualitative information, the desired trend is toward quantitative description of interactions and reactions in pathways, which will gradually enable predictive modeling and transform the pathway databases into analytical workbenches.


Subjects
Data Mining; Databases, Genetic; Databases, Protein; Metabolic Networks and Pathways; Proteins/metabolism; Systems Biology; Animals; Humans; Information Dissemination; Internet; Metabolic Networks and Pathways/genetics; Proteins/genetics; Software; Systems Integration; Terminology as Topic
11.
Methods Mol Biol ; 609: 145-59, 2010.
Article in English | MEDLINE | ID: mdl-20221918

ABSTRACT

In the current understanding, translation of genomic sequences into proteins is the most important path for realization of genome information. In exercising their intended function, proteins work together through various forms of direct (physical) or indirect interaction mechanisms. For a variety of basic functions, many proteins form a large complex representing a molecular machine or a macromolecular super-structural building block. After several high-throughput techniques for detection of protein-protein interactions had matured, protein interaction data became available in a large scale and curated databases for protein-protein interactions (PPIs) are a new necessity for efficient research. Here, their scope, annotation quality, and retrieval tools are reviewed. In addition, attention is paid to portals that provide unified access to a variety of such databases with added annotation value.


Subjects
Data Mining; Databases, Protein; Protein Interaction Domains and Motifs; Protein Interaction Mapping; Proteins/chemistry; Systems Biology; Animals; Humans; Internet; Multiprotein Complexes; Software; Systems Integration; Terminology as Topic
12.
Methods Mol Biol ; 609: 257-67, 2010.
Article in English | MEDLINE | ID: mdl-20221924

ABSTRACT

Given the amount of sequence data available today, in silico function prediction, which often includes detecting distant evolutionary relationships, requires sophisticated bioinformatic workflows. The algorithms behind these workflows exhibit complex data structures; they need the ability to spawn subtasks and tend to demand large amounts of resources. Performing sequence analytic tasks by manually invoking individual function prediction algorithms having to transform between differing input and output formats has become increasingly obsolete. After a period of linking individual predictors using ad hoc scripts, a number of integrated platforms are finally emerging. We present the ANNOTATOR software environment as an advanced example of such a platform.


Subjects
Computational Biology; Data Mining; Databases, Genetic; Databases, Protein; Sequence Analysis, DNA; Software; Algorithms; Animals; Artificial Intelligence; Humans; Models, Statistical; Sequence Analysis, Protein; Systems Integration
13.
BMC Genomics ; 11 Suppl 1: S13, 2010 Feb 10.
Article in English | MEDLINE | ID: mdl-20158870

ABSTRACT

BACKGROUND: Tandem mass spectrometry (MS/MS) has become a standard method for identification of proteins extracted from biological samples but the huge number and the noise contamination of MS/MS spectra obstruct swift and reliable computer-aided interpretation. Typically, a minor fraction of the spectra per sample (most often, only a few %) and about 10% of the peaks per spectrum contribute to the final result if protein identification is not prevented by the noise at all. RESULTS: Two fast preprocessing screens can substantially reduce the haystack of MS/MS data. (1) Simple sequence ladder rules remove spectra non-interpretable in peptide sequences. (2) Modified Fourier-transform-based criteria clear background in the remaining data. On average, only a remainder of 35% of the MS/MS spectra (each reduced in size by about one quarter) has to be handed over to the interpretation software for reliable protein identification essentially without loss of information, with a trend to improved sequence coverage and with proportional decrease of computer resource consumption. CONCLUSIONS: The search for sequence ladders in tandem MS/MS spectra with subsequent noise suppression is a promising strategy to reduce the number of MS/MS spectra from electro-spray instruments and to enhance the reliability of protein matches. Supplementary material and the software are available from an accompanying WWW-site with the URL http://mendel.bii.a-star.edu.sg/mass-spectrometry/MSCleaner-2.0/.


Subjects
Peptides/analysis; Tandem Mass Spectrometry/methods; Internet; Peptides/chemistry; Time Factors
14.
BMC Genomics ; 11 Suppl 1: S15, 2010 Feb 10.
Article in English | MEDLINE | ID: mdl-20158872

ABSTRACT

BACKGROUND: Algorithms designed to predict protein disorder play an important role in structural and functional genomics, as disordered regions have been reported to participate in important cellular processes. Consequently, several methods with different underlying principles for disorder prediction have been independently developed by various groups. For assessing their usability in automated workflows, we are interested in identifying parameter settings and threshold selections, under which the performance of these predictors becomes directly comparable. RESULTS: First, we derived a new benchmark set that accounts for different flavours of disorder complemented with a similar amount of order annotation derived for the same protein set. We show that, using the recommended default parameters, the programs tested are producing a wide range of predictions at different levels of specificity and sensitivity. We identify settings, in which the different predictors have the same false positive rate. We assess conditions when sets of predictors can be run together to derive consensus or complementary predictions. This is useful in the framework of proteome-wide applications where high specificity is required such as in our in-house sequence analysis pipeline and the ANNIE webserver. CONCLUSIONS: This work identifies parameter settings and thresholds for a selection of disorder predictors to produce comparable results at a desired level of specificity over a newly derived benchmark dataset that accounts equally for ordered and disordered regions of different lengths.
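
The idea of choosing thresholds so that different predictors operate at the same false positive rate can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the paper: it assumes each predictor emits a numeric score per residue, that a residue is called disordered when its score exceeds the threshold, and that scores for known ordered (negative) residues are available. The function names and all score values are hypothetical.

```python
import math


def threshold_for_fpr(negative_scores, target_fpr):
    """Return a threshold t such that the fraction of negatives scoring above t is at most target_fpr."""
    ranked = sorted(negative_scores, reverse=True)
    # Number of negatives that may score above the threshold (small epsilon guards float rounding).
    allowed = math.floor(target_fpr * len(ranked) + 1e-9)
    if allowed >= len(ranked):
        return float("-inf")
    return ranked[allowed]


def false_positive_rate(negative_scores, threshold):
    """Fraction of negatives that would be called positive (score strictly above threshold)."""
    return sum(score > threshold for score in negative_scores) / len(negative_scores)


# Hypothetical scores of two predictors on the same set of ordered residues;
# the predictors use different, non-comparable score scales.
predictor_a_negatives = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.55, 0.60, 0.80, 0.90]
predictor_b_negatives = [1, 2, 2, 3, 4, 5, 6, 7, 9, 12]

target = 0.2
t_a = threshold_for_fpr(predictor_a_negatives, target)
t_b = threshold_for_fpr(predictor_b_negatives, target)
print(t_a, false_positive_rate(predictor_a_negatives, t_a))
print(t_b, false_positive_rate(predictor_b_negatives, t_b))
```

Once both predictors are calibrated to the same false positive rate on the ordered set, their sensitivities on the disordered set can be compared directly.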


Subjects
Databases, Protein; Proteome/analysis; Proteomics/methods; Algorithms; Humans
15.
Nucleic Acids Res ; 37(Web Server issue): W435-40, 2009 Jul.
Article in English | MEDLINE | ID: mdl-19389726

ABSTRACT

Function prediction of proteins with computational sequence analysis requires the use of dozens of prediction tools with a bewildering range of input and output formats. Each of these tools focuses on a narrow aspect and researchers are having difficulty obtaining an integrated picture. ANNIE is the result of years of close interaction between computational biologists and computer scientists and automates an essential part of this sequence analytic process. It brings together over 20 function prediction algorithms that have proven sufficiently reliable and indispensable in daily sequence analytic work and are meant to give scientists a quick overview of possible functional assignments of sequence segments in the query proteins. The results are displayed in an integrated manner using an innovative AJAX-based sequence viewer. ANNIE is available online at: http://annie.bii.a-star.edu.sg. This website is free and open to all users and there is no login requirement.


Subjects
Sequence Analysis, Protein; Software; Algorithms; User-Computer Interface
16.
PLoS Comput Biol ; 3(4): e66, 2007 Apr 06.
Article in English | MEDLINE | ID: mdl-17411337

ABSTRACT

Three different prenyltransferases attach isoprenyl anchors to C-terminal motifs in substrate proteins. These lipid anchors serve for membrane attachment or protein-protein interactions in many pathways. Although well-tolerated selective prenyltransferase inhibitors are clinically available, their mode of action remains unclear since the known substrate sets of the various prenyltransferases are incomplete. The Prenylation Prediction Suite (PrePS) has been applied for large-scale predictions of prenylated proteins. To prioritize targets for experimental verification, we rank the predictions by their functional importance estimated by evolutionary conservation of the prenylation motifs within protein families. The ranked lists of predictions are accessible as PRENbase (http://mendel.imp.univie.ac.at/sat/PrePS/PRENbase) and can be queried for verification status, type of modifying enzymes (anchor type), and taxonomic distribution. Our results highlight a large group of plant metal-binding chaperones as well as several newly predicted proteins involved in ubiquitin-mediated protein degradation, enriching the known functional repertoire of prenylated proteins. Furthermore, we identify two possibly prenylated proteins in Mimivirus. The section HumanPRENbase provides complete lists of predicted prenylated human proteins, for example, the list of farnesyltransferase targets that cannot become substrates of geranylgeranyltransferase 1 and, therefore, are especially affected by farnesyltransferase inhibitors (FTIs) used in cancer and anti-parasite therapy. We report direct experimental evidence verifying the prediction of the human proteins Prickle1, Prickle2, the BRO1 domain-containing FLJ32421 (termed BROFTI), and Rab28 (short isoform) as exclusive farnesyltransferase targets. We introduce PRENbase, a database of large-scale predictions of protein prenylation substrates ranked by evolutionary conservation of the motif. Experimental evidence is presented for the selective farnesylation of targets with an evolutionarily conserved modification site.


Subjects
Evolution, Molecular; Membrane Lipids/chemistry; Membrane Proteins/chemistry; Protein Prenylation; Sequence Alignment/methods; Sequence Analysis, Protein/methods; Conserved Sequence; Membrane Lipids/metabolism; Membrane Proteins/genetics; Membrane Proteins/metabolism; Protein Interaction Mapping; Sequence Homology, Amino Acid
17.
Biol Direct ; 2: 1, 2007 Jan 12.
Article in English | MEDLINE | ID: mdl-17222345

ABSTRACT

BACKGROUND: Protein kinase A (cAMP-dependent kinase, PKA) is a serine/threonine kinase, for which ca. 150 substrate proteins are known. Based on a refinement of the recognition motif using the available experimental data, we wished to apply the simplified substrate protein binding model for accurate prediction of PKA phosphorylation sites, an approach that was previously successful for the prediction of lipid posttranslational modifications and of the PTS1 peroxisomal translocation signal. RESULTS: Approximately 20 sequence positions flanking the phosphorylated residue on both sides have been found to be restricted in their sequence variability (region -18...+23 with the site at position 0). The conserved physical pattern can be rationalized in terms of a qualitative binding model with the catalytic cleft of the protein kinase A. Positions -6...+4 surrounding the phosphorylation site are influenced by direct interaction with the kinase in a varying degree. This sequence stretch is embedded in an intrinsically disordered region composed preferentially of hydrophilic residues with flexible backbone and small side chain. This knowledge has been incorporated into a simplified analytical model of productive binding of substrate proteins with PKA. CONCLUSION: The scoring function of the pkaPS predictor can confidently discriminate PKA phosphorylation sites from serines/threonines with non-permissive sequence environments (sensitivity of approximately 96% at a specificity of approximately 94%). The tool "pkaPS" has been applied to the whole human proteome. Among new predicted PKA targets, there are entirely uncharacterized protein groups as well as apparently well-known families such as those of the ribosomal proteins L21e, L22 and L6. AVAILABILITY: The supplementary data as well as the prediction tool as WWW server are available at http://mendel.imp.univie.ac.at/sat/pkaPS.
REVIEWERS: Erik van Nimwegen (Biozentrum, University of Basel, Switzerland), Sandor Pongor (International Centre for Genetic Engineering and Biotechnology, Trieste, Italy), Igor Zhulin (University of Tennessee, Oak Ridge National Laboratory, USA).

18.
Nucleic Acids Res ; 34(Web Server issue): W214-8, 2006 Jul 01.
Article in English | MEDLINE | ID: mdl-16844996

ABSTRACT

DOUTfinder is a web-based tool facilitating protein domain detection among related protein sequences in the twilight zone of sequence similarity. The sequence set required for this analysis can be provided by the user or will be collected using PSI-BLAST if a single sequence is given as an input. The obtained sequence family is analyzed for known Pfam and SMART domains, and the thereby identified subsignificant domain similarities are evaluated further. Domains with several subthreshold hits in the query set are ranked based on a sum-score function and likely homologous domains are suggested according to established cut-offs. By providing a post-filtering procedure for subsignificant domain hits DOUTfinder allows the detection of non-trivial domain relationships and can thereby lead to new insights into the function and evolution of distantly related sequence families. DOUTfinder is available at http://mendel.imp.ac.at/dout/.
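
Ranking domains that collect several subthreshold hits across a sequence family can be sketched as below. The abstract does not give the sum-score function, so this example assumes, purely for illustration, that each hit contributes the negative base-10 logarithm of its E-value and that a domain needs at least two hits to be ranked. The function name `rank_domains`, the `min_hits` parameter, the domain names, and the E-values are all hypothetical and are not taken from DOUTfinder.

```python
import math
from collections import defaultdict


def rank_domains(hits, min_hits=2):
    """Rank domains by the summed -log10(E-value) of their hits (an assumed scoring scheme)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for domain, evalue in hits:
        totals[domain] += -math.log10(evalue)
        counts[domain] += 1
    # Keep only domains hit by several family members, then sort by descending sum-score.
    ranked = [(domain, totals[domain]) for domain in totals if counts[domain] >= min_hits]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical subsignificant hits (domain, E-value) collected over a sequence family.
family_hits = [
    ("DomA", 0.5), ("DomA", 2.0), ("DomA", 0.1),
    ("DomB", 5.0),
    ("DomC", 1.0), ("DomC", 0.2),
]

ranking = rank_domains(family_hits)
for domain, score in ranking:
    print(f"{domain}\t{score:.3f}")
```

A cut-off on the summed score would then separate likely homologous domains from chance accumulations; where that cut-off lies is a calibration question this sketch does not address.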


Subjects
Protein Structure, Tertiary; Sequence Homology, Amino Acid; Software; Humans; Internet; User-Computer Interface
19.
BMC Bioinformatics ; 7: 164, 2006 Mar 21.
Article in English | MEDLINE | ID: mdl-16551354

ABSTRACT

BACKGROUND: Manually finding subtle yet statistically significant links to distantly related homologues becomes practically impossible for very populated protein families due to the sheer number of similarity searches to be invoked and analyzed. The unclear evolutionary relationship between classical mammalian lipases and the recently discovered human adipose triglyceride lipase (ATGL; a patatin family member) is an exemplary case for such a problem. RESULTS: We describe an unsupervised, sensitive sequence segment collection heuristic suitable for assembling very large protein families. It is based on fan-like expanding, iterative database searches. To prevent inclusion of unrelated hits, additional criteria are introduced: minimal alignment length and overlap with starting sequence segments, finding starting sequences in reciprocal searches, automated filtering for compositional bias and repetitive patterns. This heuristic was implemented as FAMILYSEARCHER in the ANNIE sequence analysis environment and applied to search for protein links between the classical lipase family and the patatin-like group. CONCLUSION: The FAMILYSEARCHER is an efficient tool for tracing distant evolutionary relationships involving large protein families. Although classical lipases and ATGL have no obvious sequence similarity and differ with regard to fold and catalytic mechanism, homology links detected with FAMILYSEARCHER show that they are evolutionarily related. The conserved sequence parts can be narrowed down to an ancestral core module consisting of three beta-strands, one alpha-helix and a turn containing the typical nucleophilic serine. Moreover, this ancestral module also appears in numerous enzymes with various substrate specificities, but that critically rely on nucleophilic attack mechanisms.


Subjects
Adipose Tissue/metabolism; Algorithms; Chromosome Mapping/methods; Evolution, Molecular; Linkage Disequilibrium/genetics; Lipase/genetics; Sequence Alignment/methods; Sequence Analysis, DNA/methods; Animals; Conserved Sequence; Humans; Mammals; Sequence Homology, Nucleic Acid
20.
IEEE Trans Syst Man Cybern B Cybern ; 35(3): 426-37, 2005 Jun.
Article in English | MEDLINE | ID: mdl-15971912

ABSTRACT

A major problem in designing artificial neural networks is the proper choice of the network architecture. Especially for vision networks classifying three-dimensional (3-D) objects this problem is very challenging, as these networks are necessarily large and therefore the search space for defining the needed networks is of a very high dimensionality. This strongly increases the chances of obtaining only suboptimal structures from standard optimization algorithms. We tackle this problem in two ways. First, we use biologically inspired hierarchical vision models to narrow the space of possible architectures and to reduce the dimensionality of the search space. Second, we employ evolutionary optimization techniques to determine optimal features and nonlinearities of the visual hierarchy. Here, we especially focus on higher order complex features in higher hierarchical stages. We compare two different approaches to perform an evolutionary optimization of these features. In the first setting, we directly code the features into the genome. In the second setting, in analogy to an ontogenetical development process, we suggest the new method of an indirect coding of the features via an unsupervised learning process, which is embedded into the evolutionary optimization. In both cases the processing nonlinearities are encoded directly into the genome and are thus subject to optimization. The fitness of the individuals for the evolutionary selection process is computed by measuring the network classification performance on a benchmark image database. Here, we use a nearest-neighbor classification approach, based on the hierarchical feature output. We compare the found solutions with respect to their ability to generalize. We differentiate between a first- and a second-order generalization. 
The first-order generalization denotes how well the vision system, after evolutionary optimization of the features and nonlinearities using a database A, can classify previously unseen test views of objects from this database A. As second-order generalization, we denote the ability of the vision system to perform classification on a database B using the features and nonlinearities optimized on database A. We show that the direct feature coding approach leads to networks with a better first-order generalization, whereas the second-order generalization is on an equally high level for both direct and indirect coding. We also compare the second-order generalization results with other state-of-the-art recognition systems and show that both approaches lead to optimized recognition systems, which are highly competitive with recent recognition algorithms.


Subjects
Algorithms; Artificial Intelligence; Image Enhancement/methods; Image Interpretation, Computer-Assisted/methods; Information Storage and Retrieval/methods; Pattern Recognition, Automated/methods; Cluster Analysis; Computer Simulation; Models, Biological; Models, Statistical; Reproducibility of Results; Sensitivity and Specificity; Signal Processing, Computer-Assisted