Results 1 - 3 of 3
1.
BMC Bioinformatics ; 9: 353, 2008 Aug 27.
Article in English | MEDLINE | ID: mdl-18752676

ABSTRACT

BACKGROUND: Despite significant improvements in computational annotation of genomes, sequences of abnormal, incomplete or incorrectly predicted genes and proteins remain abundant in public databases. Since the majority of incomplete, abnormal or mispredicted entries are not annotated as such, these errors seriously affect the reliability of these databases. Here we describe the MisPred approach that may provide an efficient means for the quality control of databases. The current version of the MisPred approach uses five distinct routines for identifying abnormal, incomplete or mispredicted entries based on the principle that a sequence is likely to be incorrect if some of its features conflict with our current knowledge about protein-coding genes and proteins: (i) conflict between the predicted subcellular localization of proteins and the absence of the corresponding sequence signals; (ii) presence of extracellular and cytoplasmic domains and the absence of transmembrane segments; (iii) co-occurrence of extracellular and nuclear domains; (iv) violation of domain integrity; (v) chimeras encoded by two or more genes located on different chromosomes.
RESULTS: Analyses of predicted EnsEMBL protein sequences of nine deuterostome (Homo sapiens, Mus musculus, Rattus norvegicus, Monodelphis domestica, Gallus gallus, Xenopus tropicalis, Fugu rubripes, Danio rerio and Ciona intestinalis) and two protostome species (Caenorhabditis elegans and Drosophila melanogaster) have revealed that the absence of expected signal peptides and violation of domain integrity account for the majority of mispredictions. Analyses of sequences predicted by NCBI's GNOMON annotation pipeline show that the rates of mispredictions are comparable to those of EnsEMBL. Interestingly, even the manually curated UniProtKB/Swiss-Prot dataset is contaminated with mispredicted or abnormal proteins, although to a much lesser extent than UniProtKB/TrEMBL or the EnsEMBL or GNOMON-predicted entries.
CONCLUSION: MisPred works efficiently in identifying errors in predictions generated by the most reliable gene prediction tools such as the EnsEMBL and NCBI's GNOMON pipelines and also guides the correction of errors. We suggest that application of the MisPred approach will significantly improve the quality of gene predictions and the associated databases.
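
The five routines listed in the abstract can be read as independent rule checks over an annotated sequence record. The following is a minimal Python sketch of that idea only, not the MisPred implementation: the `Domain` and `ProteinRecord` types, their field names, the compartment labels, and the `MIN_DOMAIN_COVERAGE` threshold are all hypothetical and chosen for illustration. It assumes the annotations (signal peptide, transmembrane segments, domain hits, chromosomal origin) have already been computed by other tools.

```python
from dataclasses import dataclass, field


@dataclass
class Domain:
    name: str
    compartment: str  # hypothetical labels: "extracellular", "cytoplasmic", "nuclear", "other"
    coverage: float   # fraction of the domain's typical length that is present


@dataclass
class ProteinRecord:
    identifier: str
    predicted_localization: str
    has_signal_peptide: bool
    transmembrane_segments: int
    domains: list = field(default_factory=list)
    chromosomes: frozenset = frozenset()  # chromosomes the coding exons map to


# Assumed cutoff for "violation of domain integrity"; the abstract gives no value.
MIN_DOMAIN_COVERAGE = 0.6


def flag_conflicts(rec: ProteinRecord) -> list:
    """Return the routine labels (i)-(v) whose condition holds for this record."""
    compartments = {d.compartment for d in rec.domains}
    flags = []
    # (i) predicted localization conflicts with missing sequence signal
    if rec.predicted_localization == "extracellular" and not rec.has_signal_peptide:
        flags.append("i")
    # (ii) extracellular and cytoplasmic domains without a transmembrane segment
    if {"extracellular", "cytoplasmic"} <= compartments and rec.transmembrane_segments == 0:
        flags.append("ii")
    # (iii) extracellular and nuclear domains in the same protein
    if {"extracellular", "nuclear"} <= compartments:
        flags.append("iii")
    # (iv) at least one domain is substantially truncated
    if any(d.coverage < MIN_DOMAIN_COVERAGE for d in rec.domains):
        flags.append("iv")
    # (v) sequence parts map to more than one chromosome (possible chimera)
    if len(rec.chromosomes) > 1:
        flags.append("v")
    return flags


example = ProteinRecord(
    identifier="hypothetical_001",
    predicted_localization="extracellular",
    has_signal_peptide=False,
    transmembrane_segments=0,
    domains=[Domain("dom_a", "extracellular", 0.95), Domain("dom_b", "nuclear", 0.40)],
    chromosomes=frozenset({"1"}),
)
print(flag_conflicts(example))  # ['i', 'iii', 'iv']
```

Keeping each routine as a separate condition mirrors the abstract's framing: a record is suspect when any single feature conflicts with current knowledge, and the returned labels indicate which kind of conflict was found.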


Subject(s)
Database Management Systems , Databases, Protein , Information Storage and Retrieval/methods , Internet , Natural Language Processing , Proteins/classification , Terminology as Topic , Artifacts , Proteins/chemistry , Proteins/metabolism , Quality Control , Sequence Analysis, Protein/methods
2.
Chem Biol Interact ; 143-144: 289-97, 2003 Feb 01.
Article in English | MEDLINE | ID: mdl-12604215

ABSTRACT

The structure of the rat liver aflatoxin dialdehyde reductase (AKR7A1) has been solved to 1.38 Å resolution. The crystal structure reveals details of the ternary complex as one subunit of the dimer contains NADP+ and the inhibitor citrate. The underlying catalytic mechanism appears similar to other aldo-keto reductases (AKR), whilst the substrate-binding pocket contains several positively charged amino acids (Arg-231 and Arg-327) which distinguish it from the well-characterised AKR1 family of enzymes. These differences account for the substrate specificity for 4-carbon acid-aldehydes such as succinic semialdehyde (SSA) and 2-carboxybenzaldehyde, as well as for the idiosyncratic substrate aflatoxin B1 dialdehyde of this subfamily of enzymes. The AKR7 enzymes seem to be subdivided into two subgroups based on their sequence and kinetic properties. Modelling of the rat AKR7A4 highlights important structural differences localised within the active site of the two isoenzymes.


Subject(s)
Aldehyde Reductase/chemistry , Liver/enzymology , Aldehyde Reductase/metabolism , Amino Acid Sequence , Animals , Catalysis , Crystallography, X-Ray , Models, Molecular , Molecular Sequence Data , Protein Conformation , Rats , Sequence Homology, Amino Acid , Substrate Specificity
3.
J Biol Chem ; 277(18): 16285-93, 2002 May 03.
Article in English | MEDLINE | ID: mdl-11839745

ABSTRACT

The structure of the rat liver aflatoxin dialdehyde reductase (AKR7A1) has been solved to 1.38-Å resolution. Although it shares a similar α/β-barrel structure with other members of the aldo-keto reductase superfamily, AKR7A1 is the first dimeric member to be crystallized. The crystal structure also reveals details of the ternary complex as one subunit of the dimer contains NADP+ and the inhibitor citrate. Although the underlying catalytic mechanism appears similar to other aldo-keto reductases, the substrate-binding pocket contains several charged amino acids (Arg-231 and Arg-327) that distinguish it from previously characterized aldo-keto reductases with respect to size and charge. These differences account for the substrate specificity for 4-carbon acid-aldehydes such as succinic semialdehyde and 2-carboxybenzaldehyde as well as for the idiosyncratic substrate aflatoxin B1 dialdehyde of this subfamily of enzymes. Structural differences between the AKR7A1 ternary complex and apoenzyme reveal a significant hinged movement of the enzyme involving not only the loops of the structure but also parts of the α/β-barrel most intimately involved in cofactor binding.


Subject(s)
Aldehyde Reductase/chemistry , Liver/enzymology , Aldehyde Reductase/metabolism , Amino Acid Sequence , Animals , Binding Sites , Catalysis , Crystallography, X-Ray , Dimerization , Models, Molecular , Protein Conformation , Protein Structure, Secondary , Protein Subunits , Rats , Recombinant Proteins/chemistry , Recombinant Proteins/metabolism , Substrate Specificity