1.
J Hazard Mater ; 448: 130821, 2023 04 15.
Article in English | MEDLINE | ID: mdl-36709736

ABSTRACT

Lignin, the most abundant source of renewable aromatic compounds derived from natural lignocellulosic biomass, has great potential for various applications as green materials due to its abundant active groups. However, it is still challenging to quickly construct green polymers with a certain crystallinity by utilizing lignin as a building block. Herein, new green lignin-based covalent organic polymers (LIGOPD-COPs) were one-pot fabricated with water as the reaction solvent and natural lignin as the raw material. Furthermore, by using paraformaldehyde as a protector and modulator, the LIGOPD-COPs prepared under optimized conditions displayed better crystallinity than reported lignin-based polymers, demonstrating the feasibility of preparing lignin-based polymers with improved crystallinity. The improved crystallinity confers LIGOPD-COPs with enhanced application performance, which was demonstrated by their excellent performances in sample treatment of non-targeted food safety analysis. Under optimized conditions, phytochromes, the main interfering matrices, were almost completely removed from different phytochromes-rich vegetables by LIGOPD-COPs, accompanied by "full recovery" of 90 chemical hazards. Green, low-cost, and reusable properties, together with improved crystallinity, will accelerate the industrialization and marketization of lignin-based COPs, and promote their applications in many fields.


Subject(s)
Lignin , Polymers , Lignin/chemistry , Polymers/chemistry , Biomass , Water , Solvents
2.
J Org Chem ; 84(9): 5195-5202, 2019 05 03.
Article in English | MEDLINE | ID: mdl-30892044

ABSTRACT

Capitulactones A-C, three unprecedented 9-norlignans featuring a unique 3,5-dihydrofuro[2,3-d]oxepin-7(2H)-one scaffold, were isolated from the roots of Curculigo capitulata. Their structures with absolute configurations were unambiguously established by a combination of spectroscopic data, ECD analysis, and total synthesis. Biomimetic total syntheses of three pairs of the corresponding enantiomers were achieved in 9-10 steps with overall yields of 14.8, 12.7, and 10.3%, respectively. Notably, the unique scaffold of the common western hemisphere of the molecules was constructed by using the oxidation-reduction strategy from benzodihydrofuran.


Subject(s)
Curculigo/chemistry , Lignans/chemistry , Lignans/chemical synthesis , Chemistry Techniques, Synthetic , Models, Molecular , Molecular Conformation , Oxidation-Reduction , Stereoisomerism
3.
Bioinformatics ; 32(7): 1057-64, 2016 04 01.
Article in English | MEDLINE | ID: mdl-26614126

ABSTRACT

MOTIVATION: Identifying drug-target protein interactions is a crucial step in the process of drug research and development. Wet-lab experiments are laborious, time-consuming and expensive. Hence, there is a strong demand for the development of a novel theoretical method to identify potential interactions between drugs and target proteins. RESULTS: We use all known proteins and drugs to construct a node- and edge-weighted, biologically relevant interactome network. On the basis of the 'guilt-by-association' principle, novel network topology features are proposed to characterize interaction pairs, and a random forest algorithm is employed to identify potential drug-protein interactions. The accuracy of 92.53% derived from the 10-fold cross-validation is about 10% higher than that of the existing method. We identify 2272 potential drug-target interactions, some of which are associated with diseases, such as Torg-Winchester syndrome and rhabdomyosarcoma. The proposed method can not only accurately predict the interaction between drug molecule and target protein, but also help disease treatment and drug discovery. CONTACTS: zhanchao8052@gmail.com or ceszxy@mail.sysu.edu.cn SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Drug Delivery Systems , Drug Discovery , Protein Interaction Maps , Algorithms , Humans , Protein Conformation , Proteins
4.
Anal Chim Acta ; 871: 18-27, 2015 Apr 29.
Article in English | MEDLINE | ID: mdl-25847157

ABSTRACT

Identifying potential drug target proteins is a crucial step in the process of drug discovery and plays a key role in the study of the molecular mechanisms of disease. Based on the fact that the majority of proteins exert their functions through interacting with each other, we propose a method to recognize target proteins by using the human protein-protein interaction network and graph theory. In the network, vertices and edges are weighted by using the confidence scores of interactions and descriptors of protein primary structure, respectively. Novel network topological features are defined and employed to characterize proteins using existing databases. The widely used minimum redundancy maximum relevance and random forest algorithms are utilized to select the optimal feature subset and construct a model for the identification of potential drug target proteins at the proteome scale. The accuracies on the training set and test set are 89.55% and 85.23%, respectively. Using the constructed model, 2127 potential drug target proteins have been recognized and 156 drug target proteins have been validated in the database of drug targets. In addition, some new drug target proteins can be considered as targets for treating diseases such as mucopolysaccharidosis, non-arteritic anterior ischemic optic neuropathy, Bernard-Soulier syndrome and pseudo-von Willebrand, etc. It is anticipated that the proposed method may become a powerful high-throughput virtual screening tool for drug targets.


Subject(s)
Protein Interaction Mapping/methods , Protein Interaction Maps , Proteins/chemistry , Algorithms , Databases, Chemical , Databases, Protein , Drug Discovery , Humans , Models, Theoretical , Pharmaceutical Preparations/chemistry , Protein Conformation
5.
Biochim Biophys Acta ; 1844(12): 2214-21, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25183318

ABSTRACT

Identifying and prioritizing disease-related genes are the most important steps for understanding pathogenesis and discovering therapeutic targets. The experimental examination of these genes is very expensive and laborious, and usually has a higher false positive rate. Therefore, it is highly desirable to develop computational methods for the identification and prioritization of disease-related genes. In this study, we develop a powerful method to identify and prioritize candidate disease genes. Novel network topological features with local and global information are proposed and adopted to characterize genes. The performance of these novel features is verified based on the 10-fold cross-validation test and the leave-one-out cross-validation test. The proposed features are compared with the published features, and a fused strategy is investigated by combining the current features with the published features. These combined features are also utilized to identify and prioritize Parkinson's disease-related genes. The results indicate that the identified genes are highly related to some molecular processes and biological functions, which provides new clues for researching the pathogenesis of Parkinson's disease. The Matlab source code is freely available on request from the authors.
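
The two validation schemes named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' Matlab code; the function names, the sample count of 25, and the random seed are assumptions chosen only for illustration. Leave-one-out is treated here as the special case of k-fold in which k equals the number of samples.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Split range(n_samples) into k shuffled folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th index starting at i, so sizes differ by at most 1.
    return [indices[i::k] for i in range(k)]


def train_test_splits(n_samples, k, seed=0):
    """Yield (train, test) index lists; each fold serves as the test set once."""
    folds = k_fold_indices(n_samples, k, seed)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical dataset of 25 samples.
ten_fold = list(train_test_splits(25, 10))
leave_one_out = list(train_test_splits(25, 25))  # k == n_samples
```

In each split, a model would be trained on the `train` indices and scored on the `test` indices, and the scores averaged over all splits.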

6.
Mol Biosyst ; 10(3): 514-25, 2014 Mar 04.
Article in English | MEDLINE | ID: mdl-24389559

ABSTRACT

Elucidating the functions of protein complexes is critical for understanding disease mechanisms, diagnosis and therapy. In this study, based on the concept that protein complexes with similar topology may have similar functions, we firstly model protein complexes as weighted graphs with nodes representing the proteins and edges indicating interaction between proteins. Secondly, we use topology features derived from the graphs to characterize protein complexes based on the graph theory. Finally, we construct a predictor by using random forest and topology features to identify the functions of protein complexes. Effectiveness of the current method is evaluated by identifying the functions of mammalian protein complexes. And then the predictor is also utilized to identify the functions of protein complexes retrieved from human protein-protein interaction networks. We identify some protein complexes with significant roles in the occurrence of tumors, vesicles and retinoblastoma. It is anticipated that the current research has an important impact on pathogenesis and the pharmaceutical industry. The source code of Matlab and the dataset are freely available on request from the authors.


Subject(s)
Models, Biological , Multiprotein Complexes/metabolism , Proteins/metabolism , Algorithms , Animals , Area Under Curve , Humans , Protein Binding , Protein Interaction Mapping/methods , ROC Curve , Reproducibility of Results
7.
Mol Biosyst ; 9(4): 658-67, 2013 Apr 05.
Article in English | MEDLINE | ID: mdl-23429850

ABSTRACT

In the post-genome era, one of the most important and challenging tasks is to identify the subcellular localizations of protein complexes, and further elucidate their functions in human health with applications to understanding disease mechanisms, diagnosis and therapy. Although various experimental approaches have been developed and employed to identify the subcellular localizations of protein complexes, the laboratory technologies fall far behind the rapid accumulation of protein complexes. Therefore, it is highly desirable to develop a computational method to rapidly and reliably identify the subcellular localizations of protein complexes. In this study, a novel method is proposed for predicting subcellular localizations of mammalian protein complexes based on graph theory with a random forest algorithm. Protein complexes are modeled as weighted graphs containing nodes and edges, where nodes represent proteins, edges represent protein-protein interactions and weights are descriptors of protein primary structures. Some topological structure features are proposed and adopted to characterize protein complexes based on graph theory. Random forest is employed to construct a model and predict subcellular localizations of protein complexes. Accuracies on a training set by a 10-fold cross-validation test for predicting plasma membrane/membrane attached, cytoplasm and nucleus are 84.78%, 71.30%, and 82.00%, respectively. The accuracies for the independent test set are 81.31%, 69.95% and 81.00%, respectively. These high prediction accuracies exhibit the state-of-the-art performance of the current method. It is anticipated that the proposed method may become a useful high-throughput tool and play a complementary role to the existing experimental techniques in identifying subcellular localizations of mammalian protein complexes. The Matlab source code and the dataset can be obtained freely on request from the authors.
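
The graph representation described above can be sketched in Python as follows. The abstract does not say which topological features the authors use, so the descriptors below (node count, edge count, density, mean and maximum degree, mean node weight) are generic graph statistics picked as an assumption. The protein names and weight values are hypothetical.

```python
def build_graph(node_weights, edges):
    """Return an undirected adjacency map; every weighted node gets an entry."""
    adjacency = {node: set() for node in node_weights}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency


def topology_features(node_weights, edges):
    """Compute a few generic descriptors of a node-weighted complex graph."""
    adjacency = build_graph(node_weights, edges)
    n_nodes = len(adjacency)
    degrees = [len(neighbors) for neighbors in adjacency.values()]
    n_edges = sum(degrees) // 2  # each undirected edge is counted twice
    max_edges = n_nodes * (n_nodes - 1) / 2
    return {
        "n_nodes": n_nodes,
        "n_edges": n_edges,
        "density": n_edges / max_edges if max_edges else 0.0,
        "mean_degree": sum(degrees) / n_nodes,
        "max_degree": max(degrees),
        "mean_node_weight": sum(node_weights.values()) / n_nodes,
    }


# Hypothetical four-protein complex; weights stand in for sequence descriptors.
weights = {"A": 0.2, "B": 0.5, "C": 0.3, "D": 0.4}
interactions = [("A", "B"), ("B", "C"), ("C", "D"), ("B", "D")]
features = topology_features(weights, interactions)
```

A feature dictionary of this kind, computed per complex, is the sort of fixed-length input a random forest classifier could be trained on.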


Subject(s)
Models, Biological , Multiprotein Complexes/metabolism , Proteins/chemistry , Algorithms , Animals , Humans , Intracellular Space , Protein Transport , ROC Curve , Reproducibility of Results
8.
J Proteomics ; 75(8): 2500-13, 2012 Apr 18.
Article in English | MEDLINE | ID: mdl-22415277

ABSTRACT

A proteome-wide network approach was performed to characterize significant patterns of influenza A virus (IAV)-human interactions, and to further identify potentially valuable targets for prophylactic and therapeutic interventions. Topological analysis demonstrated a strong tendency for IAV to interplay with highly connected and central proteins located in sparsely connected sub-networks. Additionally, functional analysis based on biological process revealed a number of functional groups overrepresented for IAV interactions, in which regulation of cell death and apoptosis, and phosphorus metabolic process are the most highly enriched. In order to investigate whether these topological and biological features are significant enough to distinguish IAV targets from the human proteome, a discrimination model was constructed based on these features using a support vector machine coupled with a genetic algorithm. The average overall prediction accuracy is 71.04% by the leave-one-out cross-validation test. The optimized classifier was then applied to 9706 human proteins. As a result, 1418 novel genes were identified from the human interactome, some of which were experimentally validated by others' works to be important for IAV infection. The findings presented in this study might be important in discovering new drug targets for therapeutic treatments as well as revealing topological features and functional properties specific for viral infection.


Subject(s)
Host-Pathogen Interactions , Influenza A virus/physiology , Influenza, Human/metabolism , Protein Interaction Mapping/methods , Proteins/isolation & purification , Proteome/analysis , Algorithms , Cluster Analysis , Host-Pathogen Interactions/immunology , Host-Pathogen Interactions/physiology , Humans , Influenza A virus/immunology , Influenza, Human/immunology , Metabolic Networks and Pathways/immunology , Metabolic Networks and Pathways/physiology , Proteins/analysis , Proteins/metabolism , Proteome/metabolism , Sequence Analysis, Protein/methods , Support Vector Machine , Validation Studies as Topic
9.
Anal Chim Acta ; 718: 32-41, 2012 Mar 09.
Article in English | MEDLINE | ID: mdl-22305895

ABSTRACT

In the post-genomic era, one of the most important and challenging tasks is to identify protein complexes and further elucidate their molecular mechanisms in specific biological processes. Previous computational approaches usually identify protein complexes from protein interaction networks based on dense sub-graphs and incomplete a priori information. Additionally, these computational approaches pay little attention to the biological properties of proteins, and there is no common evaluation metric to evaluate their performance. So, it is necessary to construct a novel method for identifying protein complexes and elucidating the function of protein complexes. In this study, a novel approach is proposed to identify protein complexes using random forest and topological structure. Each protein complex is represented by a graph of interactions, where a descriptor of the protein primary structure is used to characterize the biological properties of each protein and each vertex is weighted by the descriptor. The topological structure features are developed and used to characterize protein complexes. The random forest algorithm is utilized to build a prediction model and identify protein complexes from local sub-graphs instead of dense sub-graphs. As a demonstration, the proposed approach is applied to protein interaction data in human, and satisfactory results are obtained with accuracy of 80.24%, sensitivity of 81.94%, specificity of 80.07%, and Matthew's correlation coefficient of 0.4087 in the 10-fold cross-validation test. Some new protein complexes are identified, and analysis based on Gene Ontology shows that the complexes are likely to be true complexes and play important roles in the pathogenesis of some diseases. PCI-RFTS, a corresponding executable program for protein complex identification, can be acquired freely on request from the authors.


Subject(s)
Algorithms , Protein Interaction Mapping/methods , Protein Interaction Maps , Proteins/metabolism , Computational Biology/methods , Computer Simulation , Humans , Models, Biological , Models, Molecular , Proteins/genetics
10.
Anal Chim Acta ; 703(2): 163-71, 2011 Oct 10.
Article in English | MEDLINE | ID: mdl-21889630

ABSTRACT

Protein methylation is involved in dozens of biological processes and plays an important role in adjusting protein physicochemical properties, conformation and function. However, with the rapid increase of protein sequences entering databanks, the gap between the number of known sequences and the number of known methylation annotations is widening rapidly. Therefore, it is vitally significant to develop a computational method for quick and accurate identification of methylation sites. In this study, a novel predictor (Methy_SVMIACO) based on support vector machine (SVM) and improved ant colony optimization algorithm (IACO) is developed to identify methylation sites. The IACO is utilized to find the optimal feature subset and parameters of the SVM, while the SVM is employed to perform the identification of methylation sites. Comparison of the IACO with conventional ACO shows that the IACO converges quickly toward the global optimal solution and is a more useful tool for feature selection and SVM parameter optimization. The performance of Methy_SVMIACO is evaluated with a sensitivity of 85.71%, a specificity of 86.67%, an accuracy of 86.19% and a Matthew's correlation coefficient (MCC) of 0.7238 for lysine as well as a sensitivity of 89.08%, a specificity of 94.07%, an accuracy of 91.56% and an MCC of 0.8323 for arginine in the 10-fold cross-validation test. It is shown through the analysis of the optimal feature subset that some upstream and downstream residues play an important role in the methylation of arginine and lysine. Compared with other existing methods, Methy_SVMIACO provides higher accuracy (Acc), sensitivity (Sen) and specificity (Spe), indicating that the current method may serve as a powerful complementary tool to other existing approaches in this area. Methy_SVMIACO can be acquired freely on request from the authors.


Subject(s)
Algorithms , Proteins/metabolism , Support Vector Machine , Arginine/chemistry , Arginine/metabolism , Databases, Protein , Lysine/chemistry , Lysine/metabolism , Methylation , Proteins/chemistry
11.
Amino Acids ; 37(2): 415-25, 2009 Jul.
Article in English | MEDLINE | ID: mdl-18726140

ABSTRACT

Prior knowledge of a protein's structural class can provide useful information about its overall structure, so quick and accurate determination of protein structural class by computational methods is very important in protein science. One of the keys to a computational method is accurate protein sample representation. Here, based on the concept of Chou's pseudo-amino acid composition (AAC, Chou, Proteins: structure, function, and genetics, 43:246-255, 2001), a novel method of feature extraction that combined continuous wavelet transform (CWT) with principal component analysis (PCA) was introduced for the prediction of protein structural classes. Firstly, the digital signal was obtained by mapping each amino acid according to various physicochemical properties. Secondly, CWT was utilized to extract a new feature vector based on the wavelet power spectrum (WPS), which contains more abundant information on sequence order in the frequency domain and time domain, and PCA was then used to reorganize the feature vector to decrease information redundancy and computational complexity. Finally, a pseudo-amino acid composition feature vector was further formed to represent the primary sequence by coupling the AAC vector with a set of new WPS feature vectors in an orthogonal space by PCA. As a showcase, the rigorous jackknife cross-validation test was performed on the working datasets. The results indicated that prediction quality has been improved, and the current approach of protein representation may serve as a useful complementary vehicle in classifying other attributes of proteins, such as enzyme family class, subcellular localization, membrane protein types and protein secondary structure, etc.
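
The AAC vector that the method above couples with wavelet features can be sketched in Python as the plain 20-dimensional amino acid composition. This covers only the conventional composition, not Chou's pseudo-amino acid terms or the CWT/PCA steps. The function name, the handling of non-standard residues, and the example sequence are assumptions made for illustration.

```python
from collections import Counter

# The 20 standard amino acids in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the 20 residue frequencies of a protein sequence.

    Characters outside the 20 standard residues (e.g. 'X') are ignored,
    which is an assumption of this sketch.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical six-residue peptide.
aac = amino_acid_composition("MKKLLA")
```

The composition discards residue order entirely, which is the gap that sequence-order features such as pseudo-amino acid terms or wavelet power spectra are meant to fill.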


Subject(s)
Principal Component Analysis , Protein Conformation , Proteins/chemistry , Sequence Analysis, Protein/methods , Algorithms , Amino Acid Sequence , Amino Acids/chemistry , Databases, Protein , Molecular Sequence Data , Proteins/genetics
12.
J Theor Biol ; 248(3): 546-51, 2007 Oct 07.
Article in English | MEDLINE | ID: mdl-17628605

ABSTRACT

With the rapid increase of protein sequence data, it is indispensable to develop automated and reliable predictive methods for protein function annotation. One approach for facilitating protein function prediction is to classify proteins into functional families from primary sequence. Enzymes being the most important group of all proteins, the accurate prediction of enzyme family classes and subfamily classes is closely related to their biological functions. In this paper, for the prediction of enzyme subfamily classes, Chou's amphiphilic pseudo-amino acid composition [Chou, K.C., 2005. Using amphiphilic pseudo amino acid composition to predict enzyme subfamily classes. Bioinformatics 21, 10-19] has been adopted to represent the protein samples for training the 'one-versus-rest' support vector machine. As a demonstration, the jackknife test was performed on the dataset that contains 2640 oxidoreductase sequences classified into 16 subfamily classes [Chou, K.C., Elrod, D.W., 2003. Prediction of enzyme family classes. J. Proteome Res. 2, 183-190]. The overall accuracy thus obtained was 80.87%. The significant enhancement in the accuracy indicates that the current method might play a complementary role to the existing methods.


Subject(s)
Amino Acid Sequence , Enzymes/classification , Algorithms , Artificial Intelligence , Computational Biology , Proteins/chemistry , Reproducibility of Results , Sequence Analysis, Protein , Software