1.
Complex Intell Systems ; 8(2): 1561-1577, 2022.
Article in English | MEDLINE | ID: mdl-35535331

ABSTRACT

Graph-based algorithms are known to be effective approaches to semi-supervised learning. However, there has been relatively little work on extending these algorithms to the multi-label classification case. We derive an extension of the Manifold Regularization algorithm to multi-label classification, which is significantly simpler than the general Vector Manifold Regularization approach. We then augment our algorithm with a weighting strategy to allow differential influence on a model between instances having ground-truth vs. induced labels. Experiments on four benchmark multi-label data sets show that the resulting algorithm performs better overall than existing semi-supervised multi-label classification algorithms at various levels of label sparsity. Comparisons with state-of-the-art supervised multi-label approaches (which are trained on fully labeled data) also show that our algorithm outperforms all of them even with a substantial number of unlabeled examples.
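
As a rough illustration of the kind of method this abstract refers to, the sketch below applies the standard closed-form Laplacian regularized least squares solution (a common instance of manifold regularization) independently to each label column. It is not the authors' derivation and it omits their weighting strategy. The RBF kernel, the dense similarity graph, the hyperparameter values, and all function names are assumptions made for the example.

```python
import numpy as np


def rbf_kernel(X, gamma=1.0):
    """Dense RBF (Gaussian) kernel matrix for the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


def lap_rls_multilabel(X, Y, labeled_mask, gamma_a=1e-2, gamma_i=1e-1,
                       kernel_gamma=1.0):
    """Laplacian RLS applied column-wise to a label matrix (illustrative only).

    X            : (n, d) features for labeled and unlabeled points.
    Y            : (n, q) labels in {-1, +1}; rows of unlabeled points are 0.
    labeled_mask : (n,) boolean array, True where the row of Y is observed.
    Returns an (n, q) matrix of real-valued scores; sign() gives the labels.
    """
    n = X.shape[0]
    n_labeled = int(labeled_mask.sum())
    K = rbf_kernel(X, kernel_gamma)

    # Similarity graph reuses the kernel (an assumption), without self-loops.
    W = K.copy()
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W          # unnormalized graph Laplacian

    J = np.diag(labeled_mask.astype(float))  # selects labeled rows
    A = (J @ K
         + gamma_a * n_labeled * np.eye(n)
         + (gamma_i * n_labeled / n ** 2) * (L @ K))
    alpha = np.linalg.solve(A, J @ Y)        # one solve covers all label columns
    return K @ alpha
```

Because every label column shares the same system matrix, a single linear solve handles all labels at once, which is one reason a column-wise formulation stays simple.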

2.
J Bioinform Comput Biol ; 9(1): 67-89, 2011 Feb.
Article in English | MEDLINE | ID: mdl-21328707

ABSTRACT

Accurate identification of strand residues aids prediction and analysis of numerous structural and functional aspects of proteins. We propose a sequence-based predictor, BETArPRED, which improves prediction of strand residues and β-strand segments. BETArPRED uses a novel design that accepts strand residues predicted by SSpro and predicts the remaining positions utilizing a logistic regression classifier with nine custom-designed features. These are derived from the primary sequence, the secondary structure (SS) predicted by SSpro, PSIPRED and SPINE, and residue depth as predicted by RDpred. Our features utilize certain local (window-based) patterns in the predicted SS and combine information about the predicted SS and residue depth. BETArPRED is evaluated on 432 sequences that share low identity with the training chains, and on the CASP8 dataset. We compare BETArPRED with seven modern SS predictors, and the top-performing automated structure predictor in CASP8, the ZHANG-server. BETArPRED provides statistically significant improvements over each of the SS predictors; it improves prediction of strand residues and β-strands, and it finds β-strands that were missed by the other methods. When compared with the ZHANG-server, we improve predictions of strand segments and predict more actual strand residues, while the other predictor achieves a higher rate of correct strand residue predictions when under-predicting them.


Subjects
Proteins/chemistry , Amino Acid Sequence , Artificial Intelligence , Carbohydrate Epimerases/chemistry , Carbohydrate Epimerases/genetics , Caspase 8/chemistry , Caspase 8/genetics , Computational Biology , Computer Simulation , Databases, Protein , Logistic Models , Models, Molecular , Protein Structure, Secondary , Proteins/genetics , Software
3.
BMC Struct Biol ; 9: 50, 2009 Jul 31.
Article in English | MEDLINE | ID: mdl-19646256

ABSTRACT

BACKGROUND: Current protocols yield crystals for <30% of known proteins, indicating that automatically identifying crystallizable proteins may improve high-throughput structural genomics efforts. We introduce CRYSTALP2, a kernel-based method that predicts the propensity of a given protein sequence to produce diffraction-quality crystals. This method utilizes the composition and collocation of amino acids, isoelectric point, and hydrophobicity, as estimated from the primary sequence, to generate predictions. CRYSTALP2 extends its predecessor, CRYSTALP, by enabling predictions for sequences of unrestricted size and provides improved prediction quality. RESULTS: A significant majority of the collocations used by CRYSTALP2 include residues with high conformational entropy, or low entropy and high potential to mediate crystal contacts; notably, such residues are utilized by surface entropy reduction methods. We show that the collocations provide complementary information to the hydrophobicity and isoelectric point. Tests on four datasets show that CRYSTALP2 outperforms several existing sequence-based predictors (CRYSTALP, OB-score, and SECRET). CRYSTALP2's accuracy, MCC, and AROC range between 69.3 and 77.5%, 0.39 and 0.55, and 0.72 and 0.79, respectively. Our predictions are similar in quality and are complementary to the predictions of the most recent ParCrys and XtalPred methods. Our results also suggest that, as work in protein crystallization continues (thereby enlarging the population of proteins with known crystallization propensities), the prediction quality of the CRYSTALP2 method should increase. The prediction model and the datasets used in this contribution can be downloaded from http://biomine.ece.ualberta.ca/CRYSTALP2/CRYSTALP2.html. CONCLUSION: CRYSTALP2 provides relatively accurate crystallization propensity predictions for a given protein chain that either outperform or complement the existing approaches. The proposed method can be used to support current efforts towards improving the success rate in obtaining diffraction-quality crystals.
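
The accuracy and MCC (Matthews correlation coefficient) values quoted in this abstract are standard confusion-matrix statistics. The sketch below shows how they are conventionally computed for a binary predictor; the function names and the convention of returning 0.0 when the MCC denominator is zero are choices made for this example and are not taken from the paper.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, in the range [-1, 1].

    Returns 0.0 when a row or column of the confusion matrix is empty
    (a common convention, assumed here).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom
```

Unlike accuracy, MCC stays near zero for a predictor that guesses the majority class, which is why the two are usually reported together on imbalanced datasets.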


Subjects
Crystallization , Proteins/chemistry , Sequence Analysis, Protein/methods , Artificial Intelligence , Databases, Protein , Software
4.
Bioinform Biol Insights ; 2: 133-44, 2008 Mar 12.
Article in English | MEDLINE | ID: mdl-19812771

ABSTRACT

The exact mechanisms of prion misfolding and factors that predispose an individual to prion diseases are largely unknown. Our approach to identifying candidate factors in silico relies on contrasting the C-terminal domain of PrP(C) sequences from two groups of vertebrate species: those that have been found to suffer from prion diseases, and those that have not. We propose that any significant differences between the two groups are candidate factors that may predispose individuals to develop prion disease, which should be further analyzed by wet-lab investigations. Using an array of computational methods, we identified possible point mutations that could predispose PrP(C) to misfold into PrP(Sc). Our results include confirmatory findings such as the V210I mutation, and new findings including the P137M, G142D, G142N, D144P, K185T, V189I, H187Y and T191P mutations, which could impact structural stability. We also propose new hypotheses that give insights into the stability of helix-2 and helix-3. These include destabilizing effects of histidine and the T188-T193 segment in helix-2 in the disease-prone prions, and a stabilizing effect of leucine on helix-3 in the disease-resistant prions.

5.
Biochem Biophys Res Commun ; 348(3): 981-8, 2006 Sep 29.
Article in English | MEDLINE | ID: mdl-16904630

ABSTRACT

Structural class characterizes the overall folding type of a protein or its domain. A number of computational methods have been proposed to predict structural class based on primary sequences; however, the accuracy of these methods is strongly affected by sequence homology. This paper proposes an ensemble classification method and a compact feature-based sequence representation. This method improves prediction accuracy for the four main structural classes compared to competing methods, and provides highly accurate predictions for sequences of widely varying homologies. The experimental evaluation of the proposed method shows superior results for sequences that span the entire homology spectrum, ranging from 25% to 90% homology. The error rates were reduced by over 20% when compared with using individual prediction methods and the most commonly used composition vector representation of protein sequences. Comparisons with competing methods on three large benchmark datasets consistently show the superiority of the proposed method.
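
For context, the composition vector that this abstract uses as its baseline representation is conventionally the 20-dimensional vector of amino acid frequencies in a sequence. A minimal sketch follows; it illustrates only that baseline, not the compact feature-based representation the paper proposes, and the residue ordering and the decision to skip non-standard characters are assumptions of the example.

```python
# The 20 standard amino acids in alphabetical one-letter order (an assumed ordering).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_vector(sequence):
    """Return the 20 amino acid frequencies of a protein sequence.

    Characters outside the 20 standard residues (e.g. 'X') are skipped.
    """
    counts = {aa: 0 for aa in AMINO_ACIDS}
    for residue in sequence.upper():
        if residue in counts:
            counts[residue] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return [counts[aa] / total for aa in AMINO_ACIDS]
```

The vector discards residue order entirely, which helps explain why richer representations can lower error rates on low-homology sequences.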


Subjects
Amino Acid Sequence , Sequence Homology, Amino Acid , Structural Homology, Protein , Algorithms , Computational Biology , Predictive Value of Tests
6.
Comput Biol Chem ; 30(5): 393-4, 2006 Oct.
Article in English | MEDLINE | ID: mdl-16872905

ABSTRACT

Protein structural class describes the overall folding type of a protein or its domain. A number of methods were developed to predict protein structural class based on its primary sequence. The homology of the predicted sequences with respect to the training sequences is a key attribute for the prediction performance. In this article we investigated the FDOD method developed by Jin et al. [Jin, L., Fang, W., Tang, H., 2003. Prediction of protein structural classes by a new measure of information discrepancy. Comput. Biol. Chem. 27, 373-380], which gave high prediction accuracy on a low homology dataset, and we empirically confirmed that the reported results were an artifact of improper implementation.


Subjects
Computer Simulation , Proteins/chemistry , Databases, Protein , Protein Folding , Proteins/classification
7.
IEEE Trans Syst Man Cybern B Cybern ; 36(1): 32-53, 2006 Feb.
Article in English | MEDLINE | ID: mdl-16468565

ABSTRACT

Business intelligence and bioinformatics applications increasingly require the mining of datasets consisting of millions of data points, or crafting real-time enterprise-level decision support systems for large corporations and drug companies. In all cases, there needs to be an underlying data mining system, and this mining system must be highly scalable. To this end, we describe a new rule learner called DataSqueezer. The learner belongs to the family of inductive supervised rule extraction algorithms. DataSqueezer is a simple, greedy rule builder that generates a set of production rules from labeled input data. In spite of its relative simplicity, DataSqueezer is a very effective learner. The rules generated by the algorithm are compact, comprehensible, and have accuracy comparable to rules generated by other state-of-the-art rule extraction algorithms. The main advantages of DataSqueezer are very high efficiency and resistance to missing data. DataSqueezer exhibits log-linear asymptotic complexity with the number of training examples, and it is faster than other state-of-the-art rule learners. The learner is also robust to large quantities of missing data, as verified by extensive experimental comparison with the other learners. DataSqueezer is thus well suited to modern data mining and business intelligence tasks, which commonly involve huge datasets with a large fraction of missing data.


Subjects
Algorithms , Artificial Intelligence , Database Management Systems , Databases, Factual , Decision Support Techniques , Information Storage and Retrieval/methods
8.
J Biol Chem ; 281(2): 723-32, 2006 Jan 13.
Article in English | MEDLINE | ID: mdl-16286454

ABSTRACT

Helicobacter pylori and Campylobacter jejuni have been shown to modify their flagellins with pseudaminic acid (Pse), via O-linkage, while C. jejuni also possesses a general protein glycosylation pathway (Pgl) responsible for the N-linked modification of at least 30 proteins with a heptasaccharide containing 2,4-diacetamido-2,4,6-trideoxy-alpha-D-glucopyranose, a derivative of bacillosamine. To further define the Pse and bacillosamine biosynthetic pathways, we have undertaken functional characterization of UDP-alpha-D-GlcNAc modifying dehydratase/aminotransferase pairs, in particular the H. pylori and C. jejuni flagellar pairs HP0840/HP0366 and Cj1293/Cj1294, as well as the C. jejuni Pgl pair Cj1120c/Cj1121c using His(6)-tagged purified derivatives. The metabolites produced by these enzymes were identified using NMR spectroscopy at 500 and/or 600 MHz with a cryogenically cooled probe for optimal sensitivity. The metabolites of Cj1293 (PseB) and HP0840 (FlaA1) were found to be labile and could only be characterized by NMR analysis directly in aqueous reaction buffer. The Cj1293 and HP0840 enzymes exhibited C6 dehydratase as well as a newly identified C5 epimerase activity that resulted in the production of both UDP-2-acetamido-2,6-dideoxy-beta-L-arabino-4-hexulose and UDP-2-acetamido-2,6-dideoxy-alpha-D-xylo-4-hexulose. In contrast, the Pgl dehydratase Cj1120c (PglF) was found to possess only C6 dehydratase activity generating UDP-2-acetamido-2,6-dideoxy-alpha-D-xylo-4-hexulose. Substrate-specificity studies demonstrated that the flagellar aminotransferases HP0366 and Cj1294 utilize only UDP-2-acetamido-2,6-dideoxy-beta-L-arabino-4-hexulose as substrate producing UDP-4-amino-4,6-dideoxy-beta-L-AltNAc, a precursor in the Pse biosynthetic pathway. In contrast, the Pgl aminotransferase Cj1121c (PglE) utilizes only UDP-2-acetamido-2,6-dideoxy-alpha-D-xylo-4-hexulose producing UDP-4-amino-4,6-dideoxy-alpha-D-GlcNAc (UDP-2-acetamido-4-amino-2,4,6-trideoxy-alpha-D-glucopyranose), a precursor used in the production of the Pgl glycan component 2,4-diacetamido-2,4,6-trideoxy-alpha-D-glucopyranose.


Subjects
Campylobacter jejuni/enzymology , Helicobacter pylori/enzymology , Hexosamines/chemistry , Hydro-Lyases/chemistry , Transaminases/chemistry , Carbohydrate Sequence , DNA/metabolism , Dose-Response Relationship, Drug , Electrophoresis, Capillary , Electrophoresis, Polyacrylamide Gel , Hexosamines/metabolism , Histidine/chemistry , Kinetics , Magnetic Resonance Spectroscopy , Models, Chemical , Molecular Sequence Data , Mutation , Oligonucleotides/chemistry , Plasmids/metabolism , Substrate Specificity , Time Factors , Uridine Diphosphate N-Acetylglucosamine/chemistry
9.
J Biol Chem ; 280(43): 35922-8, 2005 Oct 28.
Article in English | MEDLINE | ID: mdl-16120604

ABSTRACT

Campylobacter jejuni and Campylobacter coli are the main causes of bacterial diarrhea worldwide, and Helicobacter pylori is known to cause duodenal ulcers. In all of these pathogenic organisms, the flagellin proteins are heavily glycosylated with a 2-keto-3-deoxy acid, pseudaminic acid (5,7-diacetamido-3,5,7,9-tetradeoxy-L-glycero-L-manno-nonulosonic acid). The presence of pseudaminic acid is required for the proper development of the flagella and is thereby necessary for motility in, and invasion of, the host. In this study we report the first characterization of NeuB3 from C. jejuni as a pseudaminic acid synthase, the enzyme directly responsible for the biosynthesis of pseudaminic acid. Pseudaminic acid synthase catalyzes the condensation of phosphoenolpyruvate (PEP) with the hexose, 2,4-diacetamido-2,4,6-trideoxy-L-altrose (6-deoxy-AltdiNAc), to form pseudaminic acid and phosphate. The enzymatic activity was monitored using ¹H and ³¹P NMR spectroscopy, and the product was isolated and characterized. Kinetic analysis reveals that pseudaminic acid synthase requires the presence of a divalent metal ion for catalysis and that optimal catalysis occurs at pH 7.0. A coupled enzymatic assay gave values for k(cat) of 0.65 ± 0.01 s⁻¹, K(m) for PEP of 6.5 ± 0.4 µM, and K(m) for 6-deoxy-AltdiNAc of 9.5 ± 0.7 µM. A mechanistic study on pseudaminic acid synthase, using [2-¹⁸O]PEP, shows that catalysis proceeds through a C-O bond cleavage mechanism similar to other PEP condensing synthases such as sialic acid synthase.
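
To give a feel for the kinetic constants reported in this abstract, the sketch below evaluates the single-substrate Michaelis-Menten rate law using the quoted k(cat) and the K(m) for PEP. Treating the reaction as single-substrate (that is, assuming the second substrate is saturating) is a simplification made for the example; the abstract does not state which rate equation the authors fitted, and the function and constant names are hypothetical.

```python
def turnover_rate(k_cat, k_m, substrate):
    """Michaelis-Menten rate per enzyme molecule, v / [E], in s^-1.

    k_cat     : catalytic constant (s^-1)
    k_m       : Michaelis constant (same concentration units as substrate)
    substrate : substrate concentration
    """
    if substrate < 0 or k_m <= 0:
        raise ValueError("concentrations must be non-negative and k_m positive")
    return k_cat * substrate / (k_m + substrate)


# Values quoted in the abstract (uncertainties omitted).
K_CAT = 0.65        # s^-1
K_M_PEP = 6.5e-6    # M (6.5 µM)

# Catalytic efficiency k_cat / K_m, in M^-1 s^-1 (about 1e5 for these values).
EFFICIENCY_PEP = K_CAT / K_M_PEP
```

By definition of K(m), the rate at a PEP concentration equal to K(m) is half of k(cat), so `turnover_rate(K_CAT, K_M_PEP, K_M_PEP)` should come out near 0.325 s⁻¹ under this simplified model.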


Subjects
Campylobacter jejuni/metabolism , N-Acylneuraminate Cytidylyltransferase/chemistry , Campylobacter coli/metabolism , Catalysis , Cloning, Molecular , Cobalt/chemistry , Dose-Response Relationship, Drug , Electrophoresis, Polyacrylamide Gel , Hydrogen-Ion Concentration , Kinetics , Magnetic Resonance Spectroscopy , Models, Chemical , N-Acylneuraminate Cytidylyltransferase/physiology , Oxo-Acid-Lyases/chemistry , Phosphates/chemistry , Phosphoenolpyruvate/chemistry
10.
Biochim Biophys Acta ; 1594(2): 219-33, 2002 Feb 11.
Article in English | MEDLINE | ID: mdl-11904218

ABSTRACT

Recombinant lysine:N(6)-hydroxylase, rIucD, catalyzes the hydroxylation of L-lysine to its N(6)-hydroxy derivative, with NADPH and FAD serving as cofactors in the reaction. The five cysteine residues present in rIucD can be replaced, individually or in combination, with alanine without effecting a major change in the thermal stability, the affinity for L-lysine and FAD, as well as the k(cat) for mono-oxygenase activity of the protein. However, when the susceptibility to modification by either 5,5'-dithiobis(2-nitrobenzoic acid) (DTNB) or 2,6-dichlorophenol indophenol (DPIP) serves as the criterion for monitoring conformational change(s) in rIucD and its muteins, Cys146→Ala and Cys166→Ala substitutions are found to induce an enhancement in the reactivity of one of the protein's remaining cysteine residues with concomitant diminution of mono-oxygenase function. In addition, the systematic study of cysteine→alanine replacement has led to the identification of rIucD's Cys166 as the exposed residue which is detectable during the reaction of the protein with DTNB but not with iodoacetate. Substitution of Cys51 of rIucD with alanine results in an increase in mono-oxygenase activity (approx. 2-fold). Such replacement, unlike those of other cysteine residues, also enables the covalent DPIP conjugate of the protein to accommodate FAD in its catalytic function. A possible role of rIucD's Cys51 in the modulation of its mono-oxygenase function is discussed.


Subjects
Iopanoic Acid/analogs & derivatives , Mixed Function Oxygenases/chemistry , Alanine/chemistry , Calorimetry, Differential Scanning , Cysteine/chemistry , Dithionitrobenzoic Acid/chemistry , Enzyme Stability , Flavin-Adenine Dinucleotide/chemistry , Hot Temperature , Iopanoic Acid/chemistry , Kinetics , Mixed Function Oxygenases/genetics , Models, Chemical , Mutagenesis, Site-Directed , Oxidation-Reduction , Protein Conformation , Recombinant Proteins/chemistry