Results 1 - 17 of 17
1.
BMC Genomics ; 20(Suppl 13): 980, 2019 Dec 27.
Article in English | MEDLINE | ID: mdl-31881832

ABSTRACT

BACKGROUND: The three-dimensional (3D) structure of chromatin plays a significant role in cell differentiation and development. Hi-C and other 3C-based technologies allow us to look deep into chromatin architecture. Many studies have suggested that topologically associating domains (TADs), as structural and functional units, are conserved across different organs. However, our understanding of the mechanism underlying TAD boundary formation is still limited. RESULTS: We developed a computational method, TAD-Lactuca, to infer TAD boundaries from the contextual information of epigenetic modification signals and the primary DNA sequence of the genome. TAD-Lactuca is stable across multiple resolutions and different datasets. It achieves high accuracy and even outperforms state-of-the-art methods when sequence patterns are incorporated. Moreover, several transcription factor binding motifs, besides the well-known CCCTC-binding factor (CTCF) motif, were found to be significantly enriched at the boundaries. CONCLUSIONS: We provide a low-cost, effective method to predict TAD boundaries. The results suggest that incorporating sequence features can significantly improve performance. The sequence motif enrichment analysis indicates several gene regulation motifs around the boundaries, which is consistent with the idea that TADs may serve as functional units of gene regulation and implies that sequence patterns are important in chromatin folding.


Subject(s)
Histones/chemistry , Neural Networks, Computer , Algorithms , Area Under Curve , Chromatin/metabolism , Histone Code , Histones/metabolism , Protein Binding , ROC Curve
2.
Sichuan Da Xue Xue Bao Yi Xue Ban ; 50(1): 55-60, 2019 Jan.
Article in Chinese | MEDLINE | ID: mdl-31037905

ABSTRACT

OBJECTIVE: To investigate the effect of small interfering RNA targeting lactate dehydrogenase A (siLDHA) on the migration and invasion of the epidermal growth factor receptor 2 (ErbB2)-overexpressing breast cancer cell lines SK-BR-3 and MDA-MB-453, and its molecular mechanism. METHODS: SK-BR-3 and MDA-MB-453 cells were transfected with siLDHA to interfere with the expression of LDHA. Transfection with scrambled siRNA was used as the negative control. LDHA protein levels were detected by Western blot (P<0.01). Cell migration and invasion were detected by Transwell assays. Lactate dehydrogenase (LDH) activity was measured with an LDH assay kit. The glucose and lactate concentrations in the culture media were determined with glucose and lactate assay kits, respectively, and glucose uptake and lactate production by the cells were then calculated. RESULTS: siLDHA downregulated LDHA protein levels in SK-BR-3 and MDA-MB-453 cells (P<0.01). Compared with the negative control group, siLDHA significantly decreased the migration and invasion of SK-BR-3 and MDA-MB-453 cells (P<0.001). siLDHA significantly reduced LDH activity in SK-BR-3 cells, and glucose uptake and lactate production in SK-BR-3 and MDA-MB-453 cells (P<0.05). CONCLUSION: Knockdown of LDHA by siRNA inhibits migration and invasion via downregulation of glycolysis in ErbB2-overexpressing breast cancer cell lines.


Subject(s)
Breast Neoplasms , L-Lactate Dehydrogenase/genetics , Cell Line, Tumor , Cell Movement , Cell Proliferation , Humans , MCF-7 Cells , RNA, Small Interfering , Receptor, ErbB-2
3.
4.
Protein Pept Lett ; 19(5): 559-66, 2012 May.
Article in English | MEDLINE | ID: mdl-22316310

ABSTRACT

Nicotinamide adenine dinucleotide (NAD) plays an important role in cellular metabolism and acts as a hydride-accepting and hydride-donating coenzyme in energy production. Identification of NAD-protein interacting sites can significantly aid in understanding NAD-dependent metabolism and pathways, and could further contribute useful information for drug development. In this study, a computational method is proposed to predict NAD-protein interacting sites using sequence information and structure-based information. All models developed in this work are evaluated using 7-fold cross-validation. Results show that using the position-specific scoring matrix (PSSM) as an input feature is quite encouraging for predicting NAD interacting sites. To account for the imbalanced dataset, an ensemble support vector machine (SVM), which is an assembly of many individual SVM classifiers, is developed to predict the NAD interacting sites. The overall accuracy (Acc) thus obtained was 87.31%, with a Matthews correlation coefficient (MCC) of 0.56. In contrast, the single SVM approach achieved only 80.86% accuracy with an MCC of 0.38. These results indicate that prediction accuracy can be remarkably improved via the ensemble SVM classifier approach.
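
The abstract above reports both accuracy and MCC because, on an imbalanced dataset, accuracy alone can look good while the minority class is poorly predicted. The following is a minimal sketch of the two standard formulas; the confusion-matrix counts are hypothetical and are not taken from the paper.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Conventional value when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical imbalanced example: 100 binding sites among 1000 residues.
tp, tn, fp, fn = 30, 850, 50, 70
print(f"Acc = {accuracy(tp, tn, fp, fn):.3f}")
print(f"MCC = {mcc(tp, tn, fp, fn):.3f}")
```

With counts like these, accuracy stays high because the majority class dominates, while MCC is much lower because most true sites are missed.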


Subject(s)
Computational Biology/methods , NAD/chemistry , Protein Interaction Domains and Motifs , Support Vector Machine , Binding Sites , Databases, Protein , NAD/metabolism , Position-Specific Scoring Matrices , Reproducibility of Results , Second Messenger Systems
5.
Protein Pept Lett ; 18(9): 906-11, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21529343

ABSTRACT

Protein-protein interactions (PPIs) are crucial to most biochemical processes in human beings. Although many human PPIs have been identified experimentally, the number is still limited compared to the available human protein sequences. Recently, many computational methods have been proposed to facilitate the recognition of novel human PPIs. However, the existing methods concentrate only on the information of individual PPIs, while the systematic characteristics of protein-protein interaction networks (PINs) are ignored. In this study, a new method is proposed that combines the global information of PINs with protein sequence information. The random forest (RF) algorithm was used to develop the prediction model, and a high accuracy of 91.88% was obtained. Furthermore, the RF model performed well when tested on three independent datasets, suggesting that our method is a useful tool for identifying PPIs and investigating PINs.


Subject(s)
Algorithms , Protein Interaction Mapping/methods , Proteins/metabolism , Databases, Protein , Humans , Metabolic Networks and Pathways , Models, Biological , Sequence Analysis, Protein/methods
6.
Interdiscip Sci ; 3(2): 121-7, 2011 Jun.
Article in English | MEDLINE | ID: mdl-21541841

ABSTRACT

A quantitative structure-activity relationship (QSAR) of a series of tanshinone compounds with cytotoxicity against the murine leukemia cell line P-388 has been studied using the density functional theory (DFT) method combined with statistical analysis. Four main independent factors contributing to the cytotoxicity were selected by stepwise multiple regression: the maximum molecular electrostatic potential at the SAS surface (SAS(max)), the average nucleophilic superdelocalizability (ANS), the dihedral angle between rings A and B (u), and the net atomic charge of C(12) (Q(C(12))). The QSAR equation was then established via multiple linear regression (MLR) analysis. These descriptors accounted for 74.2% of the variation in the in vitro biological activity among the tanshinone analogues. The QSAR equation was used to estimate the cytotoxicity of new compounds of this series by calculating the four descriptors. Based on this model, six new compounds with higher cytotoxicity were theoretically designed.


Subject(s)
Abietanes/pharmacology , Quantitative Structure-Activity Relationship , Abietanes/chemistry , Animals , Cell Death/drug effects , Cell Line, Tumor , Cell Proliferation/drug effects , Drug Screening Assays, Antitumor , Mice , Models, Molecular , Reproducibility of Results
7.
Oral Oncol ; 47(5): 430-5, 2011 May.
Article in English | MEDLINE | ID: mdl-21439894

ABSTRACT

Preoperative diagnosis of neoplasms in the parotid gland is essential for successful surgical treatment. The purpose of this study is to apply Raman spectroscopy to distinguish the spectra of pleomorphic adenoma and Warthin tumor from those of normal parotid gland tissue. Furthermore, we establish a diagnostic model of the Raman spectra of parotid gland neoplasms by employing a support vector machine (SVM) with a Gaussian radial basis function. First, Raman spectra from different histopathological tissues were obtained with a near-infrared Raman microscope. SVM was then employed to analyze the different spectra and establish a discriminating model. The differences in peaks in the region 800-1800 cm(-1) demonstrated the biochemical molecular alterations between the different histopathological tissues. Compared with normal parotid gland tissue, the content of proteins, lipids and DNA increased in pleomorphic adenoma. In Warthin tumor, the content of DNA increased but that of proteins and lipids decreased. SVM performed strongly in classifying the three groups. The sensitivities and specificities of discrimination between the groups were above 95% and 99%, respectively. Raman spectroscopy combined with the SVM algorithm could have great potential as a noninvasive, effective and accurate diagnostic technology for neoplasms in the parotid gland.
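
For readers unfamiliar with the Gaussian radial basis function mentioned above, here is a minimal sketch of the kernel in its usual form, K(x, y) = exp(-gamma * ||x - y||^2). The value of `gamma` and the toy intensity vectors are assumptions for illustration; the abstract does not report the kernel parameters used.

```python
import math


def rbf_kernel(x, y, gamma):
    """Gaussian RBF kernel between two equal-length feature vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Hypothetical spectra reduced to three intensity values each.
spectrum_a = [0.20, 0.55, 0.90]
spectrum_b = [0.25, 0.50, 0.80]
print(rbf_kernel(spectrum_a, spectrum_b, gamma=0.5))
```

The kernel equals 1 for identical spectra and decays toward 0 as spectra diverge, which is what lets an SVM draw non-linear boundaries between tissue classes.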


Subject(s)
Adenolymphoma/diagnosis , Adenoma, Pleomorphic/diagnosis , Parotid Gland , Parotid Neoplasms/diagnosis , Spectrum Analysis, Raman/methods , Adenolymphoma/pathology , Adenoma, Pleomorphic/pathology , Adolescent , Adult , Aged , Algorithms , Diagnosis, Differential , Female , Humans , Male , Middle Aged , Parotid Neoplasms/pathology , Sensitivity and Specificity , Young Adult
8.
Protein Pept Lett ; 18(5): 450-6, 2011 May.
Article in English | MEDLINE | ID: mdl-21171945

ABSTRACT

The B-factor from an X-ray crystal structure is a good measure of protein structural flexibility, which plays an important role in different biological processes, such as catalysis, binding and molecular recognition. Understanding the essence of flexibility can be helpful for the further study of protein function. In this study, we attempted to correlate the flexibility of a residue with its interactions with other residues by representing the protein structure as a residue contact network. Several well-established network topological parameters were employed to characterize such interactions. A prediction model for the B-factor of a residue was constructed using support vector regression (SVR). The Pearson correlation coefficient (CC) was used as the performance measure. CC values were 0.63 and 0.62 for single amino acids and for the whole sequence, respectively. Our results revealed good correlations between B-factors and network topological parameters. This suggests that protein structural flexibility can be well characterized by the inter-amino acid interactions in a protein.
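
The performance measure named above, the Pearson correlation coefficient between predicted and observed B-factors, can be sketched as follows. The B-factor values are made up for illustration and do not come from the study.

```python
import math


def pearson_cc(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical observed and predicted B-factors for five residues.
observed = [12.1, 18.4, 25.0, 30.2, 16.7]
predicted = [14.0, 17.5, 22.3, 27.9, 19.1]
print(f"CC = {pearson_cc(observed, predicted):.3f}")
```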


Subject(s)
Pliability , Sequence Analysis, Protein/methods , Statistics as Topic/methods , Computational Biology , Crystallography, X-Ray , Models, Molecular , Protein Conformation , Protein Interaction Mapping/methods , Reproducibility of Results
9.
Interdiscip Sci ; 2(3): 241-6, 2010 Sep.
Article in English | MEDLINE | ID: mdl-20658336

ABSTRACT

A neurotoxin is a toxin that acts on nerve cells by interacting with membrane proteins. Different neurotoxins have different functions and sources. Greater knowledge of neurotoxins would be of great help in drug design. A support vector machine (SVM) was used to predict neurotoxins based on multiple feature descriptors, including amino acid composition, protein sequence length, protein weight and the evolutionary information described by the position-specific scoring matrix (PSSM). Under five-fold cross-validation, the method achieved an accuracy of 100% in discriminating neurotoxins from non-toxins. For classifying neurotoxins by source and by function, the accuracies were 99.50% and 99.38%, respectively. Finally, the method performed well in the sub-classification of ion channel inhibitors, with a total accuracy of 87.27%. These results indicate that this method outperforms the previously described NTXpred method.
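
As an illustration of the simplest descriptor listed above, amino acid composition, the sketch below turns a sequence into a 20-element frequency vector. The residue ordering and the handling of non-standard letters are assumptions of this sketch, and the example sequence is hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, alphabetical


def amino_acid_composition(sequence):
    """Return the fraction of each standard residue in the sequence."""
    counts = {aa: 0 for aa in AMINO_ACIDS}
    valid = 0
    for residue in sequence.upper():
        if residue in counts:  # skip non-standard letters such as X or B
            counts[residue] += 1
            valid += 1
    if valid == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / valid for aa in AMINO_ACIDS]


# Hypothetical short peptide.
features = amino_acid_composition("MKCLLYLAFLFIGVNC")
print(len(features), round(sum(features), 6))
```

A vector like this can be concatenated with sequence length, molecular weight and PSSM-derived values before being passed to a classifier.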


Subject(s)
Amino Acid Sequence , Amino Acids , Neurotoxins/chemistry , Support Vector Machine , Molecular Weight , Neurotoxins/classification , Reproducibility of Results
10.
Hua Xi Kou Qiang Yi Xue Za Zhi ; 28(1): 61-4, 2010 Feb.
Article in Chinese | MEDLINE | ID: mdl-20337078

ABSTRACT

OBJECTIVE: To evaluate the value of near-infrared Raman spectroscopy in diagnosing oral squamous cell carcinoma (OSCC). METHODS: Near-infrared Raman spectra of ten normal mucosa, twenty OSCC and thirty oral leukoplakia (OLK) cases were collected. Based on previous research, information was extracted from the characteristic bands in the subtracted spectra of the compared groups. A Gaussian radial basis function support vector machine was used to classify the spectra and establish the diagnostic models. The efficacy and validity of the algorithm were evaluated. RESULTS: Analysis of the subtracted mean spectra showed increased peak intensity in the wavenumber range 500-2200 cm(-1), suggesting high contents of DNA, protein and lipid in OSCC and indicating high proliferative activity; the difference between OLK and OSCC was smaller than that between normal mucosa and OSCC. The Gaussian radial basis function support vector machine showed a strong ability to group and model normal mucosa and OSCC, with specificity, sensitivity and accuracy of 100%, 97.44% and 98.81%, respectively. The algorithm showed a good ability to group and model OLK and OSCC, with specificity, sensitivity and accuracy of 95.00%, 86.36% and 96.30%, respectively. CONCLUSION: Combined with support vector machines, near-infrared Raman spectroscopy can detect the biochemical variations among normal oral mucosa, OLK and OSCC, and establish accurate diagnostic models.
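
The three diagnostic figures reported above follow from a two-class confusion matrix. A minimal sketch of the standard definitions is given here; the counts are hypothetical, since the abstract does not list the underlying numbers of spectra per outcome.

```python
def sensitivity(tp, fn):
    """True positive rate: diseased spectra correctly flagged."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: non-diseased spectra correctly cleared."""
    return tn / (tn + fp)


def accuracy(tp, tn, fp, fn):
    """Overall fraction of correctly classified spectra."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts, with the positive class taken to be OSCC.
tp, fn, tn, fp = 18, 2, 9, 1
print(f"sensitivity = {sensitivity(tp, fn):.2%}")
print(f"specificity = {specificity(tn, fp):.2%}")
print(f"accuracy    = {accuracy(tp, tn, fp, fn):.2%}")
```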


Subject(s)
Leukoplakia, Oral , Mouth Mucosa , Carcinoma, Squamous Cell , Humans , Sensitivity and Specificity , Spectrum Analysis, Raman
11.
Protein J ; 29(1): 62-7, 2010 Jan.
Article in English | MEDLINE | ID: mdl-20049515

ABSTRACT

The purpose of this article is to identify protein structural classes using a support vector machine (SVM) ensemble classifier, which is very efficient in enhancing prediction performance. First, auto covariance (AC) and pseudo-amino acid composition (PseAAC) were used for protein representation. AC focuses on adjacent effects, and PseAAC takes sequence-order patterns into account. Second, SVMs were trained on the datasets represented by the different descriptors. Finally, an ensemble classifier, constructed from the individual classifiers through a voting strategy, gave the final prediction results. A very promising prediction accuracy of 93.14% was obtained by the jackknife test. The experimental results showed that the ensemble system can greatly improve prediction performance and generate more stable and safer predictors. The current method, which fuses the protein primary sequence information captured by AC and by PseAAC, may play an important complementary role in other related applications.
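
The voting step described above can be sketched as a simple majority vote over the labels produced by the individual classifiers. The abstract does not specify the voting rule or how ties are broken, so plain majority voting with a first-seen tie-break is an assumption here, and the class labels are placeholders.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the most classifiers.

    Ties are broken in favor of the label that appears first in the list
    (an arbitrary choice made for this sketch).
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    counts = Counter(predictions)
    top = max(counts.values())
    for label in predictions:
        if counts[label] == top:
            return label


# Hypothetical outputs of three SVMs trained on different descriptors.
votes = ["all-alpha", "alpha/beta", "all-alpha"]
print(majority_vote(votes))
```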


Subject(s)
Amino Acids/analysis , Computational Biology/methods , Proteins/chemistry , Software , Protein Conformation
12.
J Theor Biol ; 259(2): 366-72, 2009 Jul 21.
Article in English | MEDLINE | ID: mdl-19341746

ABSTRACT

The submitochondria location of a mitochondrial protein is very important for further understanding the structure and function of the protein. Hence, it is of great practical significance to develop an automated and reliable method for the timely identification of the submitochondria locations of novel mitochondrial proteins. In this study, a sequence-based algorithm combining the augmented Chou's pseudo amino acid composition (Chou's PseAA) with auto covariance (AC) is developed to predict protein submitochondria locations and membrane protein types in the mitochondrial inner membrane. Through AC combined with eight representative descriptors, the model fully considers the sequence-order effects between residues a certain distance apart in the sequence, for both common proteins and membrane proteins. In jackknife cross-validation tests, the method for submitochondria location prediction yields accuracies of 91.8%, 96.4% and 66.1% for inner membrane, matrix and outer membrane, respectively, with a total accuracy of 89.7%. When predicting membrane protein types in the mitochondrial inner membrane, the method achieves accuracies of 98.4%, 64.3% and 86.7% for multi-pass inner membrane, single-pass inner membrane and matrix side inner membrane, respectively, with a total accuracy of 93.6%. The overall performance of our method is better than that of previous studies. Our method can therefore be an effective supplementary tool for future proteomics studies. The prediction software and all data sets used in this article are freely available at http://chemlab.scu.edu.cn/Predict_subMITO/index.htm.
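
The jackknife test mentioned above is leave-one-out cross-validation: each protein is held out once and predicted by a model trained on all the others. The sketch below shows the loop only. A 1-nearest-neighbor rule stands in for the paper's classifier purely to keep the example self-contained, and the feature vectors and labels are hypothetical.

```python
def nearest_neighbor_predict(train_x, train_y, query):
    """Stand-in classifier: label of the closest training vector."""
    def squared_distance(vec):
        return sum((a - b) ** 2 for a, b in zip(vec, query))

    best = min(range(len(train_x)), key=lambda i: squared_distance(train_x[i]))
    return train_y[best]


def jackknife_accuracy(samples, labels, predict):
    """Leave-one-out accuracy for any predict(train_x, train_y, query)."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(train_x, train_y, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)


# Hypothetical 2-D descriptors for six proteins in two locations.
samples = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.25],
           [0.9, 0.8], [0.8, 0.9], [0.85, 0.75]]
labels = ["matrix", "matrix", "matrix", "inner", "inner", "inner"]
print(jackknife_accuracy(samples, labels, nearest_neighbor_predict))
```

Because every sample is tested exactly once and the split is deterministic, the jackknife gives a single, reproducible accuracy for a given dataset.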


Subject(s)
Amino Acids/analysis , Mitochondrial Proteins/analysis , Models, Chemical , Animals , Chemistry, Physical , Membrane Proteins/analysis , Pattern Recognition, Automated
13.
Interdiscip Sci ; 1(4): 315-9, 2009 Dec.
Article in English | MEDLINE | ID: mdl-20640811

ABSTRACT

Machine learning methods play a very important role in protein secondary structure prediction and related work. For a given learning approach, prediction quality depends mostly on how protein sequences are represented as numeric features. In this paper, two support vector machine (SVM) multi-classification strategies, "one-against-one" (1-a-1) and "one-against-all" (1-a-a), were used to identify protein structural classes. Auto covariance (AC), which transforms the physicochemical properties of the amino acids of a protein into a data matrix, focuses on the neighboring effects and the interactions between residues in protein sequences. The "1-a-1" approach was used with SVM to predict protein structural classes and obtained a very promising overall accuracy of 90.69% by the jackknife test. This was more than 10% higher than the accuracy obtained using "1-a-a". The experimental results show that the SVM predictor constructed by "1-a-1" can avoid biased prediction accuracy. The current method, using the protein primary sequence information described by AC together with the "1-a-1" approach on SVM, should play an important complementary role in other related applications.
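
To make the auto covariance descriptor more concrete, the sketch below uses a commonly cited form, AC(lag) = 1/(N - lag) * sum over i of (P_i - mean)(P_{i+lag} - mean), applied to one physicochemical property along the sequence. The abstract does not give the exact formula, the property scales or the maximum lag, so all of those are assumptions, and the property values are hypothetical.

```python
def auto_covariance(values, max_lag):
    """Auto covariance of one per-residue property for lags 1..max_lag."""
    n = len(values)
    if max_lag < 1 or max_lag >= n:
        raise ValueError("max_lag must satisfy 1 <= max_lag < len(values)")
    mean = sum(values) / n
    result = []
    for lag in range(1, max_lag + 1):
        # Average product of deviations for residues `lag` positions apart.
        total = sum(
            (values[i] - mean) * (values[i + lag] - mean)
            for i in range(n - lag)
        )
        result.append(total / (n - lag))
    return result


# Hypothetical per-residue property values for a short sequence.
property_values = [1.8, -3.5, 4.5, 2.8, -0.4, 3.8, -3.9, 1.9]
print(auto_covariance(property_values, max_lag=3))
```

Repeating this for several properties and concatenating the results gives a fixed-length vector regardless of sequence length, which is what an SVM requires as input.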


Subject(s)
Artificial Intelligence , Computational Biology/methods , Proteins/chemistry , Proteins/classification , Algorithms , Computer Simulation , Genetic Vectors , Pattern Recognition, Automated/methods , Protein Structure, Secondary , Reproducibility of Results , Sequence Analysis, Protein/methods , Software
14.
Interdiscip Sci ; 1(2): 151-5, 2009 Jun.
Article in English | MEDLINE | ID: mdl-20640829

ABSTRACT

Pattern recognition methods can be of great help in disease diagnosis. In this study, a semi-supervised learning method, the Laplacian support vector machine (LapSVM), was used for diabetes prediction. The dataset used in this article is the Pima Indians diabetes dataset obtained from the UCI Repository of Machine Learning Databases; all patients in the dataset are females at least 21 years old of Pima Indian heritage. First, LapSVM was trained as a fully supervised classifier on the diabetes dataset, and an accuracy of 79.17% was obtained. Then, it was trained as a semi-supervised classifier, and a prediction accuracy of 82.29% was obtained, which is higher than in previous reports. The experiments led to the finding that LapSVM offers a very promising application: it can be used to solve a fully supervised learning problem by solving a semi-supervised learning problem. The result suggests that LapSVM can be of great help to physicians in diagnosing diabetes, and that it could be a very promising method in situations where much of the data is not class-labeled.


Subject(s)
Artificial Intelligence , Decision Support Techniques , Diabetes Mellitus/diagnosis , Algorithms , Computer Simulation , Computers , Databases, Factual , Diabetes Mellitus/ethnology , Female , Humans , Indians, North American , Models, Statistical , Models, Theoretical , Reproducibility of Results
15.
Acta Biochim Biophys Sin (Shanghai) ; 38(6): 363-71, 2006 Jun.
Article in English | MEDLINE | ID: mdl-16761093

ABSTRACT

In our previous work, we developed a computational tool, PreK-ClassK-ClassKv, to predict and classify potassium (K+) channels. For K+ channel prediction (PreK) and classification at family level (ClassK), this method performs well. However, it does not perform so well in classifying voltage-gated potassium (Kv) channels (ClassKv). In this paper, a new method based on the local sequence information of Kv channels is introduced to classify Kv channels. Six transmembrane domains of a Kv channel protein are used to define a protein, and the dipeptide composition technique is used to transform an amino acid sequence to a numerical sequence. A Kv channel protein is represented by a vector with 2000 elements, and a support vector machine algorithm is applied to classify Kv channels. This method shows good performance with averages of total accuracy (Acc), sensitivity (SE), specificity (SP), reliability (R) and Matthews correlation coefficient (MCC) of 98.0%, 89.9%, 100%, 0.95 and 0.94 respectively. The results indicate that the local sequence information-based method is better than the global sequence information-based method to classify Kv channels.
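
The dipeptide composition technique named above maps a sequence to the frequencies of all 400 ordered residue pairs. The sketch below computes that 400-element vector for a single sequence segment. How the paper combines segments into its 2000-element vector is not described in the abstract, so that step is deliberately left out; the example segment is hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs: AA, AC, AD, ..., YY.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return the frequency of each of the 400 dipeptides in the sequence."""
    sequence = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(sequence) - 1):
        pair = sequence[i:i + 2]
        if pair in counts:  # ignore pairs containing non-standard residues
            counts[pair] += 1
            total += 1
    if total == 0:
        raise ValueError("sequence contains no standard dipeptides")
    return [counts[dp] / total for dp in DIPEPTIDES]


# Hypothetical transmembrane segment.
vector = dipeptide_composition("LLIVFAILGLVLA")
print(len(vector), round(sum(vector), 6))
```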


Subject(s)
Potassium Channels, Voltage-Gated/genetics , Algorithms , Animals , Artificial Intelligence , Computational Biology/methods , Humans , Models, Biological , Models, Statistical , Peptides/chemistry , Potassium Channels, Voltage-Gated/classification , Reproducibility of Results , Sensitivity and Specificity , Sequence Alignment , Sequence Analysis, Protein/methods
16.
Acta Biochim Biophys Sin (Shanghai) ; 37(11): 759-66, 2005 Nov.
Article in English | MEDLINE | ID: mdl-16270155

ABSTRACT

Although the sequence information on G-protein coupled receptors (GPCRs) continues to grow, many GPCRs remain orphaned (i.e. ligand specificity unknown) or poorly characterized with little structural information available, so an automated and reliable method is badly needed to facilitate the identification of novel receptors. In this study, a fast Fourier transform-based support vector machine method has been developed for predicting GPCR subfamilies according to protein hydrophobicity. In classifying Class B, C, D and F subfamilies, the method achieved an overall Matthews correlation coefficient and accuracy of 0.95 and 93.3%, respectively, when evaluated using the jackknife test. The method achieved an accuracy of 100% on the Class B independent dataset. The results show that this method can classify GPCR subfamilies as well as their functional classification with high accuracy. A web server implementing the prediction is available at http://chem.scu.edu.cn/blast/Pred-GPCR.
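
The idea of feeding a Fourier-transformed hydrophobicity signal to a classifier can be sketched as follows. The abstract does not say which hydrophobicity scale was used, so the Kyte-Doolittle scale here is an assumption, as is the example sequence. A plain discrete Fourier transform is used instead of an FFT library to keep the sketch dependency-free; it yields the same spectrum, only more slowly.

```python
import cmath

# Kyte-Doolittle hydropathy values (assumed scale; not stated in the abstract).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobicity_profile(sequence):
    """Map each standard residue to its hydropathy value."""
    return [KYTE_DOOLITTLE[r] for r in sequence.upper() if r in KYTE_DOOLITTLE]


def power_spectrum(signal):
    """Squared magnitude of each discrete Fourier coefficient."""
    n = len(signal)
    spectrum = []
    for k in range(n):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        spectrum.append(abs(coeff) ** 2)
    return spectrum


# Hypothetical receptor fragment.
profile = hydrophobicity_profile("MGLLAVIFCWTRKDE")
print([round(p, 2) for p in power_spectrum(profile)[:4]])
```

Periodic patterns in hydrophobicity, such as those produced by repeated transmembrane helices, show up as peaks in this spectrum, which is the kind of feature a classifier can exploit.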


Subject(s)
Algorithms , Artificial Intelligence , Models, Chemical , Receptors, G-Protein-Coupled/chemistry , Receptors, G-Protein-Coupled/classification , Sequence Alignment/methods , Sequence Analysis, Protein/methods , Amino Acid Sequence , Computer Simulation , Fourier Analysis , Internet , Molecular Sequence Data , Pattern Recognition, Automated/methods , Receptors, G-Protein-Coupled/analysis , Sequence Homology, Amino Acid
17.
Comput Biol Chem ; 29(3): 220-8, 2005 Jun.
Article in English | MEDLINE | ID: mdl-15979042

ABSTRACT

This paper applies discrete wavelet transform (DWT) with various protein substitution models to find functional similarity of proteins with low identity. A new metric, 'S' function, based on the DWT is proposed to measure the pair-wise similarity. We also develop a segmentation technique, combined with DWT, to handle long protein sequences. The results are compared with those using the pair-wise alignment and PSI-BLAST.
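
For orientation, one level of a discrete wavelet transform can be sketched with the Haar wavelet, which splits a numeric signal into a coarse approximation and a detail part. The abstract does not state which wavelet family, how many decomposition levels, or how the 'S' function is defined, so this sketch covers only the generic transform step, and the input values are hypothetical.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar wavelet transform.

    Returns (approximation, detail) coefficient lists, each half the
    length of the input. The input length must be even.
    """
    if len(signal) == 0 or len(signal) % 2 != 0:
        raise ValueError("signal length must be a positive even number")
    scale = math.sqrt(2.0)
    approximation = [
        (signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)
    ]
    detail = [
        (signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)
    ]
    return approximation, detail


# Hypothetical numeric encoding of an eight-residue protein segment.
encoded = [1.8, 4.5, -3.5, -0.4, 3.8, 2.8, -3.9, -1.6]
approx, detail = haar_dwt(encoded)
print([round(a, 3) for a in approx])
print([round(d, 3) for d in detail])
```

Applying the transform again to the approximation coefficients gives progressively coarser views of the sequence, which is one way long sequences can be compared at matching scales.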


Subject(s)
Amino Acid Sequence , Amino Acid Substitution , Structural Homology, Protein , Computer Simulation