1.
J Adv Res ; 2024 Jun 09.
Article in English | MEDLINE | ID: mdl-38862035

ABSTRACT

INTRODUCTION: The Frailty Index (FI) is a common measure of frailty, which many guidelines advocate as a routine clinical test. The genetic and phenotypic relationships of FI with cardiovascular indicators (CIs) and behavioral characteristics (BCs) are unclear, which has hampered the ability to monitor FI using easily collected data. OBJECTIVES: This study was designed to investigate the genetic and phenotypic associations of frailty with CIs and BCs, and to construct a model to predict FI. METHOD: Genetic relationships of FI with 288 CIs and 90 BCs were assessed by cross-trait LD score regression (LDSC) and Mendelian randomization (MR). The phenotypic data of these CIs and BCs were integrated with a machine-learning model to predict the FI of individuals in the UK Biobank. The relationships of the predicted FI with risks of type 2 diabetes (T2D) and neurodegenerative diseases were tested with the Kaplan-Meier estimator and the Cox proportional hazards model. RESULTS: MR revealed putative causal effects of seven CIs and eight BCs on FI. These CIs and BCs were integrated to establish a model for predicting FI. The predicted FI is significantly correlated with the observed FI (Pearson correlation coefficient = 0.660, P-value = 4.96 × 10^-62). The prediction model indicated that "usual walking pace" contributes the most to the prediction. Patients predicted to have a high FI are at significantly higher risk of T2D (HR = 2.635, P < 2 × 10^-16) and neurodegenerative diseases (HR = 2.307, P = 1.62 × 10^-3) than other patients. CONCLUSION: This study supports associations of FI with CIs and BCs from genetic and phenotypic perspectives. The model, developed by integrating easily collected CI and BC data to predict FI, has the potential to monitor disease risk.
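The agreement between predicted and observed FI in the abstract above is reported as a Pearson correlation coefficient. A minimal sketch of that statistic follows; the function name and the sample values are illustrative assumptions, not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical frailty index values, for illustration only.
observed_fi = [0.10, 0.15, 0.22, 0.30, 0.41]
predicted_fi = [0.12, 0.14, 0.25, 0.27, 0.38]
r = pearson_r(observed_fi, predicted_fi)
print(f"Pearson r = {r:.3f}")
```

The P-value quoted in the abstract would come from a significance test on this coefficient, which the sketch does not attempt.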

2.
Molecules ; 29(4)2024 Feb 13.
Article in English | MEDLINE | ID: mdl-38398585

ABSTRACT

The prediction of three-dimensional (3D) protein structure from amino acid sequences has stood as a significant challenge in computational and structural bioinformatics for decades. Recently, the widespread integration of artificial intelligence (AI) algorithms has substantially expedited advancements in protein structure prediction, yielding numerous significant milestones. In particular, the end-to-end deep learning method AlphaFold2 has facilitated the rise of structure prediction performance to new heights, regularly competitive with experimental structures in the 14th Critical Assessment of Protein Structure Prediction (CASP14). To provide a comprehensive understanding and guide future research in the field of protein structure prediction for researchers, this review describes various methodologies, assessments, and databases in protein structure prediction, including traditionally used protein structure prediction methods, such as template-based modeling (TBM) and template-free modeling (FM) approaches; recently developed deep learning-based methods, such as contact/distance-guided methods, end-to-end folding methods, and protein language model (PLM)-based methods; multi-domain protein structure prediction methods; the CASP experiments and related assessments; and the recently released AlphaFold Protein Structure Database (AlphaFold DB). We discuss their advantages, disadvantages, and application scopes, aiming to provide researchers with insights through which to understand the limitations, contexts, and effective selections of protein structure prediction methods in protein-related fields.


Subjects
Artificial Intelligence , Proteins , Protein Conformation , Models, Molecular , Proteins/chemistry , Algorithms , Computational Biology/methods , Databases, Protein , Software , Protein Folding
3.
Bioinformatics ; 39(4)2023 04 03.
Article in English | MEDLINE | ID: mdl-37039829

ABSTRACT

MOTIVATION: Identifying B-cell epitopes is an essential step for guiding rational vaccine development and immunotherapies. Since experimental approaches are expensive and time-consuming, many computational methods have been designed to assist B-cell epitope prediction. However, existing sequence-based methods have limited performance since they only use contextual features of the sequential neighbors while neglecting structural information. RESULTS: Based on the recent breakthrough of AlphaFold2 in protein structure prediction, we propose GraphBepi, a novel graph-based model for accurate B-cell epitope prediction. For one protein, the predicted structure from AlphaFold2 is used to construct the protein graph, where the nodes/residues are encoded by ESM-2 learning representations. The graph is input into the edge-enhanced deep graph neural network (EGNN) to capture the spatial information in the predicted 3D structures. In parallel, a bidirectional long short-term memory neural network (BiLSTM) is employed to capture long-range dependencies in the sequence. The low-dimensional representations learned by EGNN and BiLSTM are then combined into a multilayer perceptron for predicting B-cell epitopes. Through comprehensive tests on the curated epitope dataset, GraphBepi was shown to outperform the state-of-the-art methods by more than 5.5% and 44.0% in terms of AUC and AUPR, respectively. A web server is freely available at http://bio-web1.nscc-gz.cn/app/graphbepi. AVAILABILITY AND IMPLEMENTATION: The datasets, pre-computed features, source code, and the trained model are available at https://github.com/biomed-AI/GraphBepi.


Subjects
Epitopes, B-Lymphocyte , Neural Networks, Computer , Epitopes, B-Lymphocyte/chemistry , Proteins/chemistry , Software , Language
4.
ACS Appl Mater Interfaces ; 15(6): 8783-8793, 2023 Feb 15.
Article in English | MEDLINE | ID: mdl-36723501

ABSTRACT

Wearable, noninvasive, and simultaneous sensing of subtle strains and eccrine molecules on the human body is essential for future health monitoring and personalized medicine. However, there is a huge chasm between biomechanics and bio/chemical molecule detection. Here, a wearable plasmonic bridge sensor with multiple abilities to monitor subtle strains and molecules is developed. Hollow Au-Ag nano-rambutans and carbon nanotubes (CNTs) are adsorbed together in nonwoven fabrics (NWFs), where the gap between the conducting network of CNTs is bridged by the Au-Ag nano-rambutans during subtle strain sensing, and the detection sensitivity for stress is improved by at least 1 order of magnitude compared to that with CNTs alone. To achieve accurate human action recognition, a machine learning algorithm (support vector machines) based on the output biomechanics data is designed. The average accuracy of our plasmonic bridge sensor reaches 89.0% for human action recognition. Moreover, due to the hollow structure and high nanoroughness, a single Au-Ag nano-rambutan particle has a strong localized surface plasmon resonance effect and high surface-enhanced Raman scattering (SERS) activity. Based on the unique SERS spectra introduced by the hollow Au-Ag nano-rambutans adsorbed in the NWFs, noninvasive extraction and "fingerprint" recognition of bio/chemical molecules could be realized during wearable sensing. In sum, the NWFs/CNTs/Au-Ag sensor bridges the barrier between bodily strain detection and molecule recognition during wearable sensing. Such an integrated and multifunctional strategy for sensing both biomechanics and bio/chemical molecules is important as a means of assessing human health.


Subjects
Metal Nanoparticles , Nanotubes, Carbon , Humans , Biomechanical Phenomena , Gold/chemistry , Metal Nanoparticles/chemistry , Silver/chemistry , Spectrum Analysis, Raman
5.
Leuk Res ; 117: 106843, 2022 06.
Article in English | MEDLINE | ID: mdl-35512442

ABSTRACT

Little is known regarding whether the cell of origin differs among different leukemia types. To address this fundamental issue, we determined the cell of origin in five distinct types of acute leukemia induced by N-Myc overexpression in mice. CD150+CD48-CD41-CD34-c-Kit+Sca-1+Lin- (KSL) (HSC1) cells, CD150-CD48-CD41-CD34-KSL (HSC2) cells, CD150+CD41+CD34-KSL (HPC1) cells, CD150+CD41+CD34+KSL (HPC2) cells, and CD150-CD41-CD34+KSL (HPC3) cells were purified from the bone marrow of adult C57BL/6 mice, transduced with the N-Myc retrovirus vector, and transplanted into lethally irradiated mice. B-cell acute lymphoblastic leukemia (B-ALL), T-cell acute lymphoblastic leukemia (T-ALL), acute myeloid leukemia (AML), acute undifferentiated leukemia (AUL), and mixed phenotype acute leukemia (MPAL) developed from five populations. RNA sequencing data supported the phenotypical diagnoses of leukemia, except that AUL appeared transcriptionally close to T-ALL. Whole-genome sequencing revealed that retroviral integration sites were irrelevant to the leukemia types and that T-ALL and AML of MPAL shared the same integration site and many gene mutations, suggesting their common origin. Additionally, leukemic stem cells were identified in the KSL cell population, suggesting that the phenotypes of leukemic stem cells are irrelevant to leukemia types. This study provides experimental evidence for the similar and multiple cells of origin in acute leukemia.


Subjects
Leukemia, Myeloid, Acute , Precursor Cell Lymphoblastic Leukemia-Lymphoma , Precursor T-Cell Lymphoblastic Leukemia-Lymphoma , Animals , Antigens, CD34 , Humans , Leukemia, Myeloid, Acute/genetics , Mice , Mice, Inbred C57BL , Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics
6.
IEEE/ACM Trans Comput Biol Bioinform ; 19(6): 3255-3262, 2022.
Article in English | MEDLINE | ID: mdl-34529570

ABSTRACT

One important task in single-cell analysis is to quantify the differentiation potential of single cells. Though various single-cell potency measures have been proposed, they are based on individual biological sources, thus not robust and reliable. It is still a challenge to combine multiple sources to generate a relatively reliable and robust measure to estimate differentiation. In this paper, we propose a New Centrality measure with Gene ontology information (NCG) to estimate single-cell potency. NCG is designed by combining network topology property with edge clustering coefficient, and gene function information using gene ontology function similarity scores. NCG distinguishes pluripotent cells from non-pluripotent cells with high accuracy, correctly ranks different cell types by their differentiation potency, tracks changes during the differentiation process, and constructs the lineage trajectory from human myoblasts into skeletal muscle cells. These indicate that NCG is a reliable and robust measure to estimate single-cell potency. NCG is anticipated to be a useful tool for identifying novel stem or progenitor cell phenotypes from single-cell RNA-Seq data. The source codes and datasets are available at https://github.com/Xinzhe-Ni/NCG.
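The abstract above names the edge clustering coefficient as one ingredient of NCG. As a rough illustration, the sketch below uses one common definition from the network literature: the number of triangles that contain an edge, divided by the maximum number the edge could belong to given the degrees of its endpoints. Whether NCG uses exactly this form is an assumption, and the toy graph and function name are hypothetical.

```python
def edge_clustering_coefficient(adj, u, v):
    """Edge clustering coefficient of edge (u, v) in an undirected graph.

    `adj` maps each node to the set of its neighbors.
    Assumed definition: triangles through the edge / min(deg(u) - 1, deg(v) - 1).
    """
    if v not in adj[u] or u not in adj[v]:
        raise ValueError(f"({u}, {v}) is not an edge")
    # Every common neighbor of u and v closes one triangle on this edge.
    triangles = len(adj[u] & adj[v])
    max_possible = min(len(adj[u]) - 1, len(adj[v]) - 1)
    # An endpoint of degree 1 cannot take part in any triangle.
    return triangles / max_possible if max_possible > 0 else 0.0


# Toy interaction network: a triangle A-B-C plus a pendant node D attached to B.
toy_graph = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B"},
    "D": {"B"},
}
ecc_ab = edge_clustering_coefficient(toy_graph, "A", "B")
ecc_bd = edge_clustering_coefficient(toy_graph, "B", "D")
```

How NCG combines such edge weights with gene ontology similarity scores is not spelled out in the abstract, so the sketch stops at the topological term.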


Subjects
Algorithms , Software , Humans , Gene Ontology , Cell Differentiation/genetics , Single-Cell Analysis , Gene Expression Profiling , Sequence Analysis, RNA , Cluster Analysis
7.
Bioinformatics ; 37(21): 3752-3759, 2021 11 05.
Article in English | MEDLINE | ID: mdl-34473228

ABSTRACT

MOTIVATION: Protein model quality assessment (QA) is an essential component in protein structure prediction, which aims to estimate the quality of a structure model and/or select the most accurate model from a pool of structure models, without knowing the native structure. QA remains a challenging task in protein structure prediction. RESULTS: Based on the inter-residue distance predicted by the recent deep learning-based structure prediction algorithm trRosetta, we developed QDistance, a new approach to the estimation of both global and local qualities. QDistance works for both single- and multi-model inputs. We designed several distance-based features to assess the agreement between the predicted and model-derived inter-residue distances. Together with a few widely used features, they are fed into a simple yet powerful linear regression model to infer the global QA scores. The local QA scores for each structure model are predicted based on a comparative analysis with a set of selected reference models. For multi-model input, the reference models are selected from the input based on the predicted global QA scores. For single-model input, the reference models are predicted by trRosetta. With the informative distance-based features, QDistance can predict the global quality with satisfactory accuracy. Benchmark tests on the CASP13 and the CAMEO structure models suggested that QDistance was competitive with other methods. Blind tests in the CASP14 experiments showed that QDistance was robust and ranked among the top predictors. In particular, QDistance was a top-3 local QA method and made the most accurate local QA predictions for unreliable local regions. Analysis showed that this superior performance can be attributed to the inclusion of the predicted inter-residue distance. AVAILABILITY AND IMPLEMENTATION: http://yanglab.nankai.edu.cn/QDistance. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
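The abstract above describes features that score the agreement between predicted inter-residue distances and the distances measured in a structure model. The sketch below shows one plausible feature of that kind, a mean absolute difference over residue pairs. It is an illustration under stated assumptions, not QDistance's actual feature set: the function name, the `min_separation` parameter, and the toy data are all hypothetical.

```python
import math


def distance_agreement(model_coords, predicted_dist, min_separation=3):
    """Mean absolute difference (in Å) between model-derived and predicted distances.

    model_coords: list of (x, y, z) tuples, one representative atom per residue.
    predicted_dist: dict mapping (i, j) with i < j to a predicted distance in Å.
    min_separation: skip pairs closer than this in sequence (assumed cutoff).
    """
    n = len(model_coords)
    total, count = 0.0, 0
    for i in range(n):
        for j in range(i + min_separation, n):
            pred = predicted_dist.get((i, j))
            if pred is None:
                continue  # no prediction for this pair
            total += abs(math.dist(model_coords[i], model_coords[j]) - pred)
            count += 1
    if count == 0:
        raise ValueError("no residue pairs with a predicted distance")
    return total / count


# Toy model: five residues on a straight line (coordinates are made up).
toy_model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
             (11.4, 0.0, 0.0), (15.2, 0.0, 0.0)]
toy_prediction = {(0, 3): 11.4, (0, 4): 14.2, (1, 4): 11.4}
score = distance_agreement(toy_model, toy_prediction)
```

A lower score means the model agrees better with the predicted distances. In a QA pipeline, several such features could then be passed to a regression model, as the abstract outlines.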


Subjects
Computational Biology , Proteins , Computational Biology/methods , Proteins/chemistry , Algorithms
8.
Bioinformatics ; 38(1): 94-98, 2021 12 22.
Article in English | MEDLINE | ID: mdl-34450651

ABSTRACT

MOTIVATION: The solvent accessible surface is an essential structural property related to protein structure and protein function. Relative solvent accessible area (RSA) is a standard measure of the degree of residue exposure on the protein surface or in the protein interior. However, this computation will fail when the residue information is missing. RESULTS: In this article, we propose a novel method for estimating RSA using the Cα atom distance matrix with a deep learning method (EAGERER). The new method, EAGERER, achieves Pearson correlation coefficients of 0.921-0.928 on two independent test datasets. We empirically demonstrate that EAGERER can yield better Pearson correlation coefficients than existing RSA estimators, such as coordination number, half sphere exposure and SphereCon. To the best of our knowledge, EAGERER represents the first method to estimate the solvent accessible area using limited information with a deep learning model. It could be useful for protein structure and protein function prediction. AVAILABILITY AND IMPLEMENTATION: The method is freely available at https://github.com/cliffgao/EAGERER. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
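To make the input in the abstract above concrete, the sketch below builds a Cα distance matrix from coordinates and derives the coordination number, one of the baseline exposure estimators the abstract names. It does not reproduce EAGERER's deep learning model. The 10 Å default radius, the function names, and the toy coordinates are assumptions for illustration.

```python
import math


def ca_distance_matrix(coords):
    """Symmetric matrix of pairwise Cα-Cα distances (Å) from (x, y, z) tuples."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def coordination_numbers(dist_matrix, radius=10.0):
    """For each residue, count the other residues whose Cα lies within `radius` Å.

    Buried residues tend to have many neighbors, exposed residues few.
    """
    return [
        sum(1 for j, d in enumerate(row) if j != i and d <= radius)
        for i, row in enumerate(dist_matrix)
    ]


# Toy chain: three closely spaced residues and one far away (made-up values).
toy_ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (30.0, 0.0, 0.0)]
matrix = ca_distance_matrix(toy_ca)
neighbors = coordination_numbers(matrix, radius=5.0)
```

Because only Cα positions are needed, a descriptor like this stays computable when side-chain atoms are missing, which is the situation the abstract targets.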


Subjects
Deep Learning , Membrane Proteins , Solvents/chemistry
9.
Nat Commun ; 12(1): 4438, 2021 07 21.
Article in English | MEDLINE | ID: mdl-34290238

ABSTRACT

Identification of intrinsic disorder in proteins relies in large part on computational predictors, which demands that their accuracy should be high. Since intrinsic disorder carries out a broad range of cellular functions, it is desirable to couple the disorder and disorder function predictions. We report a computational tool, flDPnn, that provides accurate, fast and comprehensive disorder and disorder function predictions from protein sequences. The recent Critical Assessment of protein Intrinsic Disorder prediction (CAID) experiment and results on other test datasets demonstrate that flDPnn offers accurate predictions of disorder, fully disordered proteins and four common disorder functions. These predictions are substantially better than the results of the existing disorder predictors and methods that predict functions of disorder. Ablation tests reveal that the high predictive performance stems from innovative ways used in flDPnn to derive sequence profiles and encode inputs. flDPnn's webserver is available at http://biomine.cs.vcu.edu/servers/flDPnn/.


Subjects
Computational Biology/methods , Intrinsically Disordered Proteins/chemistry , Intrinsically Disordered Proteins/metabolism , Machine Learning , Protein Binding , Sequence Analysis, Protein
10.
Mol Ther Nucleic Acids ; 24: 310-324, 2021 Jun 04.
Article in English | MEDLINE | ID: mdl-33850635

ABSTRACT

Hypoxia induces a series of cellular adaptive responses that enable promotion of inflammation and cancer development. Hypoxia-inducible factor-1α (HIF-1α) is involved in the hypoxia response and cancer promotion, and it accumulates in hypoxia and is degraded under normoxic conditions. Here we identify prostate cancer associated transcript-1 (PCAT-1) as a hypoxia-inducible long non-coding RNA (lncRNA) that regulates HIF-1α stability, crucial for cancer progression. Extensive analyses of clinical data indicate that PCAT-1 is elevated in breast cancer patients and is associated with pathological grade, tumor size, and poor clinical outcomes. Through gain- and loss-of-function experiments, we find that PCAT-1 promotes hypoxia-associated breast cancer progression including growth, migration, invasion, colony formation, and metabolic regulation. Mechanistically, PCAT-1 directly interacts with the receptor of activated protein C kinase-1 (RACK1) protein and prevents RACK1 from binding to HIF-1α, thus protecting HIF-1α from RACK1-induced oxygen-independent degradation. These findings provide new insight into lncRNA-mediated mechanisms for HIF-1α stability and suggest a novel role of PCAT-1 as a potential therapeutic target for breast cancer.

11.
IEEE/ACM Trans Comput Biol Bioinform ; 18(5): 2017-2022, 2021.
Article in English | MEDLINE | ID: mdl-31794403

ABSTRACT

Structural flexibility plays an essential role in many biological processes. B-factor is an important indicator to measure the flexibility of protein or RNA structures. Many methods were developed to predict protein B-factors, but few studies have been done for RNA B-factor prediction. In this paper, we proposed a new method RNAbval to predict RNA B-factors using random forest. The method was developed using a comprehensive set of features, including the sequence profile and predicted solvent accessibility. RNAbval achieved an improvement of 9.2-20.5 percent over the state-of-the-art method on two benchmark test datasets. The proposed method is available at http://yanglab.nankai.edu.cn/RNAbval/.


Subjects
Computational Biology/methods , RNA , Sequence Analysis, RNA/methods , Crystallography, X-Ray , Machine Learning , Pliability , Proteins/chemistry , Proteins/metabolism , RNA/chemistry , RNA/genetics , RNA/metabolism , Solvents/chemistry , Solvents/metabolism
12.
Biomolecules ; 10(6)2020 06 07.
Article in English | MEDLINE | ID: mdl-32517331

ABSTRACT

Computational prediction of ion channels facilitates the identification of putative ion channels from protein sequences. Several predictors of ion channels and their types were developed in the last fifteen years. While they offer reasonably accurate predictions, they also suffer from a few shortcomings, including lack of availability, parallel prediction mode, single-label prediction (inability to predict multiple channel subtypes), and incomplete scope (inability to predict subtypes of the voltage-gated channels). We developed a first-of-its-kind PSIONplusm method that performs sequential multi-label prediction of ion channels and their subtypes for both voltage-gated and ligand-gated channels. PSIONplusm sequentially combines the outputs produced by three support vector machine-based models from the PSIONplus predictor and is available as a webserver. Empirical tests show that PSIONplusm outperforms current methods for the multi-label prediction of the ion channel subtypes. This includes the existing single-label methods that are available to the users, a naïve multi-label predictor that combines results produced by multiple single-label methods, and methods that make predictions based on sequence alignment and domain annotations. We also found that the current methods (including PSIONplusm) fail to accurately predict a few of the least frequently occurring ion channel subtypes. Thus, new predictors should be developed once a larger quantity of annotated ion channels becomes available to train predictive models.


Subjects
Algorithms , Computational Biology , Ion Channels/chemistry , Software
13.
J Theor Biol ; 480: 274-283, 2019 11 07.
Article in English | MEDLINE | ID: mdl-31251944

ABSTRACT

Many computational methods have been proposed to predict essential proteins from protein-protein interaction (PPI) networks. However, it is still challenging to improve the prediction accuracy. In this study, we propose a new method, esPOS (essential proteins Predictor using Order Statistics) to predict essential proteins from PPI networks. Firstly, we refine the networks by using gene expression information and subcellular localization information. Secondly, we design some new features, which combine the protein predicted secondary structure with PPI network. We show that these new features are useful to predict essential proteins. Thirdly, we optimize these features by using a greedy method, and combine the optimized features by order statistic method. Our method achieves the prediction accuracy of 0.76-0.79 on two network datasets. The proposed method is available at https://sourceforge.net/projects/espos/.


Subjects
Algorithms , Computational Biology/methods , Protein Interaction Maps , Statistics as Topic , Databases, Protein , Predictive Value of Tests
14.
Chemosphere ; 228: 398-411, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31048237

ABSTRACT

Endocrine-disrupting chemicals have adverse effects on the development, reproduction and behavior of animals in the environment. We investigated the effects of fluorene-9-bisphenol (BHPF), a substitute for bisphenol A, on the courtship behavior and exploratory behavior of adult zebrafish. A customized apparatus was used to evaluate courtship behavior. The results showed that males spent less time in the region of approaching (ROA) with females treated with BHPF or the anti-oestrogenic fulvestrant (FULV). The courtship index between BHPF-exposed females and males decreased. The body orientation of BHPF- and FULV-exposed females toward males decreased. Furthermore, BHPF exposure downregulated the expression of genes related to estrogen receptors and steroidogenesis and upregulated oxidative stress-related genes. This indicated that BHPF exposure interfered with the preference of males and females in courtship and had detrimental effects on reproduction. BHPF treatment decreased locomotor activity and time spent at the top, increased freezing bouts, and induced anxiety/depression-like behavior. Tyrosine hydroxylase in the brain decreased under BHPF exposure. Here we show the potential adverse effects of BHPF on reproduction and exploratory behaviors.


Subjects
Benzhydryl Compounds/adverse effects , Exploratory Behavior/drug effects , Fluorenes/chemistry , Phenols/adverse effects , Reproduction/drug effects , Animals , Benzhydryl Compounds/chemistry , Female , Phenols/chemistry , Zebrafish
15.
Curr Drug Targets ; 20(5): 579-592, 2019.
Article in English | MEDLINE | ID: mdl-30360734

ABSTRACT

BACKGROUND: Ion channels are a large and growing protein family. Many of them are associated with diseases, and consequently, they are targets for over 700 drugs. Discovery of new ion channels is facilitated with computational methods that predict ion channels and their types from protein sequences. However, these methods were never comprehensively compared and evaluated. OBJECTIVE: We offer first-of-its-kind comprehensive survey of the sequence-based predictors of ion channels. We describe eight predictors that include five methods that predict ion channels, their types, and four classes of the voltage-gated channels. We also develop and use a new benchmark dataset to perform comparative empirical analysis of the three currently available predictors. RESULTS: While several methods that rely on different designs were published, only a few of them are currently available and offer a broad scope of predictions. Support and availability after publication should be required when new methods are considered for publication. Empirical analysis shows strong performance for the prediction of ion channels and modest performance for the prediction of ion channel types and voltage-gated channel classes. We identify a substantial weakness of current methods that cannot accurately predict ion channels that are categorized into multiple classes/types. CONCLUSION: Several predictors of ion channels are available to the end users. They offer practical levels of predictive quality. Methods that rely on a larger and more diverse set of predictive inputs (such as PSIONplus) are more accurate. New tools that address multi-label prediction of ion channels should be developed.


Subjects
Computational Biology/methods , Ion Channels/genetics , Amino Acid Sequence , Animals , Benchmarking , Humans , Ion Channels/classification , Ion Channels/metabolism
16.
BMC Bioinformatics ; 19(1): 29, 2018 02 01.
Article in English | MEDLINE | ID: mdl-29390958

ABSTRACT

BACKGROUND: Protein structure can be described by backbone torsion angles: rotational angles about the N-Cα bond (φ) and the Cα-C bond (ψ), or the angle between Cα(i-1)-Cα(i)-Cα(i+1) (θ) and the rotational angle about the Cα(i)-Cα(i+1) bond (τ). Thus, their accurate prediction is useful for structure prediction and model refinement. Early methods predicted torsion angles in a few discrete bins, whereas most recent methods have focused on prediction of angles in real, continuous values. Real value prediction, however, is unable to provide information on the probabilities of predicted angles. RESULTS: Here, we propose to predict angles in fine grids of 5° by using deep learning neural networks. We found that this grid-based technique can yield 2-6% higher accuracy in predicting angles in the same 5° bin than the existing prediction techniques compared. We further demonstrate the usefulness of predicted probabilities at given angle bins in discrimination of intrinsically disordered regions and in selection of protein models. CONCLUSIONS: The proposed method may be useful for characterizing protein structure and disorder. The method is available at http://sparks-lab.org/server/SPIDER2/ as a part of the SPIDER2 package.
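The 5° grid in the abstract above turns angle prediction into a classification over 72 bins per torsion angle. The sketch below shows the binning arithmetic and how a per-bin probability vector yields both a predicted angle and a confidence. The bin origin at -180°, the function names, and the toy probabilities are assumptions; the published method's conventions may differ.

```python
BIN_WIDTH = 5.0
N_BINS = int(360 / BIN_WIDTH)  # 72 bins cover the full circle


def angle_to_bin(angle_deg):
    """Map a torsion angle in degrees to a bin index; bin 0 covers [-180, -175)."""
    wrapped = (angle_deg + 180.0) % 360.0  # shift so that -180 maps to 0
    # The final modulo guards against floating-point rounding up to 360.0.
    return int(wrapped // BIN_WIDTH) % N_BINS


def bin_center(index):
    """Central angle (degrees) of a bin."""
    return -180.0 + (index + 0.5) * BIN_WIDTH


def most_probable_angle(probs):
    """Return (angle, probability) for the most likely bin of one residue."""
    if len(probs) != N_BINS:
        raise ValueError(f"expected {N_BINS} probabilities, got {len(probs)}")
    best = max(range(N_BINS), key=probs.__getitem__)
    return bin_center(best), probs[best]


# Toy probability vector with a single peak; values are illustrative only.
toy_probs = [0.005] * N_BINS
toy_probs[angle_to_bin(-57.0)] = 0.645
angle, confidence = most_probable_angle(toy_probs)
```

The retained probability is what a real-valued regression cannot supply. A flat distribution across bins is the kind of signal the abstract reports using to discriminate disordered regions.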


Subjects
Proteins/chemistry , User-Computer Interface , Area Under Curve , Neural Networks, Computer , Probability , Protein Domains , Protein Structure, Tertiary , Proteins/metabolism , ROC Curve
17.
Brief Bioinform ; 19(3): 482-494, 2018 05 01.
Article in English | MEDLINE | ID: mdl-28040746

ABSTRACT

Protein secondary structure prediction began in 1951 when Pauling and Corey predicted helical and sheet conformations for the protein polypeptide backbone even before the first protein structure was determined. Sixty-five years later, powerful new methods breathe new life into this field. The highest three-state accuracy without relying on structure templates is now at 82-84%, a number unthinkable just a few years ago. These improvements came from increasingly larger databases of protein sequences and structures for training, the use of template secondary structure information and more powerful deep learning techniques. As we approach the theoretical limit of three-state prediction (88-90%), alternatives to secondary structure prediction (prediction of backbone torsion angles and Cα-atom-based angles and torsion angles) not only have more room for further improvement but also allow direct prediction of three-dimensional fragment structures with constantly improved accuracy. About 20% of all 40-residue fragments in a database of 1199 non-redundant proteins have <6 Å root-mean-squared distance from the native conformations by SPIDER2. More powerful deep learning methods with improved capability of capturing long-range interactions are beginning to emerge as the next generation of techniques for secondary structure prediction. The time has come to finish off the final stretch of the long march towards protein secondary structure prediction.


Subjects
Algorithms , Computational Biology/methods , Models, Theoretical , Neural Networks, Computer , Protein Structure, Secondary , Proteins/chemistry , Databases, Protein , Humans
18.
Curr Protein Pept Sci ; 19(2): 200-210, 2018.
Article in English | MEDLINE | ID: mdl-28933304

ABSTRACT

Selection of proper targets for the X-ray crystallography will benefit biological research community immensely. Several computational models were proposed to predict propensity of successful protein production and diffraction quality crystallization from protein sequences. We reviewed a comprehensive collection of 22 such predictors that were developed in the last decade. We found that almost all of these models are easily accessible as webservers and/or standalone software and we demonstrated that some of them are widely used by the research community. We empirically evaluated and compared the predictive performance of seven representative methods. The analysis suggests that these methods produce quite accurate propensities for the diffraction-quality crystallization. We also summarized results of the first study of the relation between these predictive propensities and the resolution of the crystallizable proteins. We found that the propensities predicted by several methods are significantly higher for proteins that have high resolution structures compared to those with the low resolution structures. Moreover, we tested a new meta-predictor, MetaXXC, which averages the propensities generated by the three most accurate predictors of the diffraction-quality crystallization. MetaXXC generates putative values of resolution that have modest levels of correlation with the experimental resolutions and it offers the lowest mean absolute error when compared to the seven considered methods. We conclude that protein sequences can be used to fairly accurately predict whether their corresponding protein structures can be solved using X-ray crystallography. Moreover, we also ascertain that sequences can be used to reasonably well predict the resolution of the resulting protein crystals.
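The meta-predictor in the abstract above averages the propensities of three base predictors and is evaluated by mean absolute error against experimental resolutions. The sketch below shows only those two arithmetic steps. The abstract does not describe how an averaged propensity is converted into a putative resolution, so that step is not attempted, and all names and numbers here are hypothetical.

```python
def meta_propensity(propensities):
    """Unweighted mean of the propensities produced by several base predictors."""
    if not propensities:
        raise ValueError("at least one propensity is required")
    return sum(propensities) / len(propensities)


def mean_absolute_error(predicted, observed):
    """Average absolute difference between paired predicted and observed values."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)


# Made-up outputs of three base predictors for a single protein sequence.
base_outputs = [0.6, 0.7, 0.8]
combined = meta_propensity(base_outputs)

# Made-up predicted vs. experimental resolutions (Å) for two proteins.
mae = mean_absolute_error([2.0, 3.0], [2.5, 2.0])
```

An unweighted mean is the simplest consensus scheme and lets errors of individual predictors partially cancel.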


Subjects
Proteins/chemistry , Amino Acid Sequence , Computational Biology , Crystallization/methods , Crystallography, X-Ray , Databases, Protein , Protein Conformation , Software , Surveys and Questionnaires
19.
Comb Chem High Throughput Screen ; 20(7): 629-637, 2017.
Article in English | MEDLINE | ID: mdl-28292250

ABSTRACT

AIM AND OBJECTIVE: Lysine acetylation, as one type of post-translational modifications (PTM), plays key roles in cellular regulations and can be involved in a variety of human diseases. However, it is often high-cost and time-consuming to use traditional experimental approaches to identify the lysine acetylation sites. Therefore, effective computational methods should be developed to predict the acetylation sites. In this study, we developed a position-specific method for epsilon lysine acetylation site prediction. MATERIAL AND METHODS: Sequences of acetylated proteins were retrieved from the UniProt database. Various kinds of features such as position specific scoring matrix (PSSM), amino acid factors (AAF), and disorders were incorporated. A feature selection method based on mRMR (Maximum Relevance Minimum Redundancy) and IFS (Incremental Feature Selection) was employed. RESULTS: Finally, 319 optimal features were selected from total 541 features. Using the 319 optimal features to encode peptides, a predictor was constructed based on dagging. As a result, an accuracy of 69.56% with MCC of 0.2792 was achieved. We analyzed the optimal features, which suggested some important factors determining the lysine acetylation sites. CONCLUSION: We developed a position-specific method for epsilon lysine acetylation site prediction. A set of optimal features was selected. Analysis of the optimal features provided insights into the mechanism of lysine acetylation sites, providing guidance of experimental validation.


Subjects
Computational Biology , Lysine/metabolism , Proteins/metabolism , Acetylation , Databases, Protein , Humans , Lysine/chemistry , Protein Processing, Post-Translational , Proteins/chemistry
20.
Comput Biol Chem ; 66: 57-62, 2017 Feb.
Article in English | MEDLINE | ID: mdl-27918921

ABSTRACT

Amidation plays an important role in a variety of pathological processes and serious diseases like neural dysfunction and hypertension. However, identification of protein amidation sites through traditional experimental methods is time consuming and expensive. In this paper, we proposed a novel predictor for Prediction of Amidation Sites (PrAS), which is the first software package for academic users. The method incorporated four representative feature types, which are position-based features, physicochemical and biochemical properties features, predicted structure-based features and evolutionary information features. A novel feature selection method, positive contribution feature selection was proposed to optimize features. PrAS achieved AUC of 0.96, accuracy of 92.1%, sensitivity of 81.2%, specificity of 94.9% and MCC of 0.76 on the independent test set. PrAS is freely available at https://sourceforge.net/p/praspkg.
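The performance figures in the abstract above (accuracy, sensitivity, specificity, MCC) all derive from the four counts of a binary confusion matrix. The sketch below computes them with the standard formulas. The example counts are invented and do not correspond to the PrAS test set.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, accuracy and MCC from confusion-matrix counts."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / total
    # Matthews correlation coefficient; set to 0 when any marginal sum is zero.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts: 100 true sites and 100 non-sites.
metrics = binary_metrics(tp=80, fp=10, tn=90, fn=20)
```

MCC is informative alongside accuracy because site-prediction datasets are often imbalanced, and accuracy alone can look high when most residues are negatives.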


Subjects
Amides/chemistry , Computational Biology , Proteins/chemistry , Algorithms , Amino Acid Sequence , Area Under Curve , Protein Processing, Post-Translational , Support Vector Machine