ABSTRACT
BACKGROUND: Predicting the subcellular localization of proteins is an important task in bioinformatics, with significant applications in drug design and other areas. Many computational tools for predicting protein subcellular location have been developed in recent decades; however, existing methods differ in the protein sequence representation techniques and classification algorithms they adopt. RESULTS: In this paper, we first introduce two protein sequence encoding schemes: dipeptide information with space and Gapped k-mer information. We then introduce a quad-tree-based method for calculating the Gapped k-mer information. CONCLUSIONS: The prediction results show that this method not only reduces the feature dimension but also improves the prediction precision of protein subcellular localization.
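As an illustration of the kind of encoding described above, the following minimal sketch counts residue pairs separated by a fixed gap and returns a 400-dimensional frequency vector. It assumes that "dipeptide information with space" means pairs of residues with a fixed number of intervening positions; the function name `gapped_dipeptide_features` and the `gap` parameter are hypothetical, and the quad-tree calculation mentioned in the abstract is not reproduced here.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def gapped_dipeptide_features(seq, gap):
    """Return a 400-dim frequency vector of residue pairs `gap` positions apart.

    gap=0 gives ordinary (adjacent) dipeptide composition.
    This is an illustrative sketch, not the encoding used in the paper.
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    total = 0
    for i in range(len(seq) - gap - 1):
        pair = seq[i] + seq[i + gap + 1]
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
            total += 1
    # Normalise counts to frequencies; an all-zero vector if nothing was counted
    return [counts[p] / total if total else 0.0 for p in pairs]
```

Concatenating the vectors for several gap values (for example 0, 1, and 2) is one common way to build a fixed-length input for a classifier, at the cost of 400 extra dimensions per gap.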
Subjects
Algorithms; Computational Biology/methods; Information Storage and Retrieval/methods; Proteins/chemistry; Subcellular Fractions/metabolism; Amino Acid Sequence; Databases, Protein; Dipeptides/chemistry; Support Vector Machine
ABSTRACT
In this article, we propose a 3-dimensional graphical representation of protein sequences based on 10 physicochemical properties of the 20 amino acids and the BLOSUM62 matrix. It contains evolutionary information and provides intuitive visualization. To further analyze the similarity of proteins, we extract a specific vector from the graphical representation curve. The vector is used to calculate the similarity distance between two protein sequences. To demonstrate the effectiveness of our approach, we apply it to three real data sets. The results are consistent with known evolutionary facts and show that our method is effective in phylogenetic analysis.
ABSTRACT
Hydroxylation of proline or lysine residues in proteins is a common post-translational modification event, and such modifications are found in many physiological and pathological processes. Nonetheless, the exact molecular mechanism of hydroxylation remains under investigation. Because experimental identification of hydroxylation is time-consuming and expensive, bioinformatics tools with high accuracy represent desirable alternatives for large-scale rapid identification of protein hydroxylation sites. In view of this, we developed a support vector machine-based tool, OH-PRED, for the prediction of protein hydroxylation sites using the adapted normal distribution bi-profile Bayes feature extraction in combination with the physicochemical property indexes of the amino acids. In a jackknife cross-validation, OH-PRED yields an accuracy of 91.88% and a Matthews correlation coefficient (MCC) of 0.838 for the prediction of hydroxyproline sites, and an accuracy of 97.42% and an MCC of 0.949 for the prediction of hydroxylysine sites. These results demonstrate that OH-PRED significantly increased the prediction accuracy for hydroxyproline and hydroxylysine sites, by 7.37% and 14.09%, respectively, compared with the latest predictor, PredHydroxy. In independent tests, OH-PRED also outperforms previously published methods.
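For readers unfamiliar with the two metrics reported above, the following sketch shows how accuracy and the Matthews correlation coefficient are computed from a binary confusion matrix. The counts in the usage note are made up for illustration and are not taken from the OH-PRED evaluation.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when any marginal total is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

With hypothetical counts of 45 true positives, 45 true negatives, 5 false positives, and 5 false negatives, `accuracy(45, 45, 5, 5)` is 0.9 and `mcc(45, 45, 5, 5)` is 0.8. Unlike accuracy, the MCC stays near zero for a classifier that simply predicts the majority class on an unbalanced data set.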
Subjects
Amino Acids/chemistry; Bayes Theorem; Computational Biology; Proteins/chemistry; Software; Algorithms; Humans; Hydroxylation; Normal Distribution; Position-Specific Scoring Matrices; Support Vector Machine; Web Browser
ABSTRACT
Based on the chaos game representation, a 2D graphical representation of protein sequences is introduced in which the 20 amino acids are arranged in a cyclic order according to their physicochemical properties. The Euclidean distances between corresponding amino acids in the 2D graphical representations are computed to find matching (or conserved) fragments of amino acids between two proteins. In addition, the cumulative distance between the 2D graphical representations is defined to compare the similarity of proteins. An examination of the similarity among the sequences of the ND5 proteins of nine species shows the utility of our approach.
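The general idea of a chaos game representation on a 20-vertex polygon can be sketched as follows. This is a minimal illustration under stated assumptions: the cyclic order `ORDER` is a plain alphabetical placeholder (the paper's physicochemical ordering is not given in the abstract), the contraction ratio of 1/2 is assumed, and the function names are hypothetical.

```python
import math

# Placeholder cyclic order; the paper orders residues by physicochemical
# properties, which the abstract does not specify.
ORDER = "ACDEFGHIKLMNPQRSTVWY"
VERTEX = {
    aa: (math.cos(2 * math.pi * k / 20), math.sin(2 * math.pi * k / 20))
    for k, aa in enumerate(ORDER)
}


def cgr_points(seq):
    """Map a protein sequence to a list of 2D points by the chaos game."""
    x, y = 0.0, 0.0  # start at the centre of the polygon
    points = []
    for aa in seq:
        vx, vy = VERTEX[aa]
        # Move halfway towards the vertex of the current residue
        x, y = (x + vx) / 2, (y + vy) / 2
        points.append((x, y))
    return points


def pointwise_distances(seq_a, seq_b):
    """Euclidean distance between corresponding points (truncated to the shorter sequence)."""
    return [math.dist(p, q) for p, q in zip(cgr_points(seq_a), cgr_points(seq_b))]


def cumulative_distance(seq_a, seq_b):
    """Sum of the pointwise distances, used here as a dissimilarity score."""
    return sum(pointwise_distances(seq_a, seq_b))
```

Runs of near-zero values in `pointwise_distances` suggest fragments where the two sequences agree, while a smaller `cumulative_distance` suggests greater overall similarity under this sketch.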
Subjects
Amino Acids/analysis; Amino Acids/chemistry; Computational Biology/methods; Computer Graphics; NADH Dehydrogenase/chemistry; Algorithms; Animals; Chemical Phenomena; Computer Simulation; Humans; Nonlinear Dynamics; Phylogeny; Sequence Analysis, Protein; Sequence Homology, Amino Acid
ABSTRACT
A two-dimensional (2D) graphical representation of protein sequences based on six physicochemical properties of amino acids is outlined. A numerical characterization of the protein graphs is given to serve as descriptors of protein sequences. It is useful not only for the comparative study of proteins but also for encoding innate information about the structure of proteins. The coefficient of determination is proposed as a new similarity/dissimilarity measure. Finally, a simple example highlights the behavior of the new similarity/dissimilarity measure on protein sequences taken from the ND6 (NADH dehydrogenase subunit 6) proteins of eight different species. The results demonstrate that the approach is convenient, fast, and efficient.
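A minimal sketch of the coefficient of determination as a similarity score between two descriptor vectors is given below. It assumes the simple linear regression setting, where the coefficient equals the squared Pearson correlation; how the paper builds its descriptor vectors is not stated in the abstract, and the function name is hypothetical.

```python
def coefficient_of_determination(u, v):
    """Squared Pearson correlation between two equal-length descriptor vectors.

    Values near 1 indicate a strong linear relationship (high similarity
    under this sketch); values near 0 indicate none.
    """
    n = len(u)
    if n != len(v) or n < 2:
        raise ValueError("vectors must have the same length (at least 2)")
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    if s_uu == 0 or s_vv == 0:
        raise ValueError("a constant vector has no defined correlation")
    return s_uv * s_uv / (s_uu * s_vv)
```

Because the value is squared, it does not distinguish positive from negative correlation, which is worth keeping in mind when interpreting it as a similarity.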
Subjects
Amino Acids/chemistry; NADH Dehydrogenase/chemistry; Sequence Analysis, Protein; Amino Acid Sequence; Animals; Computer Graphics; Humans; Molecular Sequence Data; Protein Subunits/chemistry; Sequence Alignment
ABSTRACT
On the basis of a selected pair of physicochemical properties of amino acids, we introduce a dynamic 2D graphical representation of protein sequences. Then, we introduce and compare two numerical characterizations of protein graphs as descriptors to analyze the nine ND5 proteins. The approach is simple, convenient, and fast.
Subjects
Proteins/chemistry; Sequence Homology, Amino Acid; Amino Acid Sequence; Amino Acids/chemistry; Animals; Humans; Models, Molecular; NADH Dehydrogenase/chemistry; Sequence Alignment
ABSTRACT
On the basis of a class of 2D graphical representations of DNA sequences, a sensitivity analysis has been performed, showing the high capability of the proposed representations to take into account small modifications of the DNA sequences. The sensitivity analysis also indicates that the absolute differences of the leading eigenvalues of the L/L matrices associated with DNA increase with the number of base mutations. In addition, we conclude that the similarity analysis method based on correlation angles eliminates the effects of DNA sequence length better than the method using Euclidean distances. As an application, the similarities/dissimilarities among the coding sequences of the first exon of the beta-globin gene of different species have been examined with our method, and the reasonable results verify its validity.
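The contrast between the two similarity measures mentioned above can be sketched as follows. The example assumes each sequence has already been reduced to a numeric descriptor vector (the construction from L/L matrices is not reproduced), and it treats the correlation angle as the angle between the two vectors; the function names are hypothetical.

```python
import math


def euclidean_distance(u, v):
    """Distance between the end points of two descriptor vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def correlation_angle(u, v):
    """Angle in radians between two descriptor vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0:
        raise ValueError("angle is undefined for a zero vector")
    # Clamp to [-1, 1] to guard against floating-point rounding
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.acos(cosine)
```

If descriptor magnitudes grow with sequence length, two vectors pointing in the same direction, such as `[1, 2]` and `[2, 4]`, have an angle of essentially zero but a Euclidean distance of about 2.24. This illustrates why an angle-based measure can be less sensitive to length than a distance-based one.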
Subjects
Base Sequence; DNA/chemistry; DNA/genetics; Animals; Computer Simulation; Exons; Globins/genetics; Humans; Models, Chemical; Models, Genetic; Nucleic Acid Conformation; Quantum Theory
ABSTRACT
In an earlier work [Cheng and Cui, Phys. Rev. B 72, 113112 (2005)], we showed theoretically that extremely high power densities can be generated and transmitted in a super waveguide filled with homogeneous bilayers of right- and left-handed materials. In this paper, we realize such a super waveguide using right-handed transmission-line (RHTL) and left-handed transmission-line (LHTL) circuits. After a rigorous design of the RHTL-LHTL structure, we observe the generation and transmission of high power densities in the super circuit waveguide in accurate simulation results. Both lossless and lossy cases have been studied for the LHTL circuit. From the simulation results and a rigorous analysis of energy speeds, we show that high-power flows with opposite directions are excited in the RHTL and LHTL parts of the super waveguide, forming energy vortices in the waveguide cross section.
ABSTRACT
Based on the concepts of cell and system of graphical representation, a class of 2D graphical representations of RNA secondary structures is given in terms of classifications of the bases of nucleic acids. The representations completely avoid the loss of information associated with crossing and overlapping of the corresponding curve. As an application, we make quantitative comparisons for a set of RNA secondary structures at the 3'-terminus of different viruses based on the graphical representations. The examination of similarities/dissimilarities illustrates the utility of the approach.
Subjects
Nucleic Acid Conformation; RNA, Viral/chemistry; Algorithms; Base Sequence; Computer Simulation; Sequence Analysis, RNA
ABSTRACT
OBJECTIVE: To quantify images of the microtubules in fetal rat cardiac myocytes under simulated microgravity using characteristic parameters of the image gray levels, and to study their morphological change. METHOD: The gray-level characteristics of the microtubules in fetal rat cardiac myocytes were quantified under both simulated microgravity and control conditions by variance, skewness, and kurtosis. RESULT: Feature analysis of 24 images showed the selected characteristic parameters to be effective. Multivariate analysis with these parameters discriminated well between the simulated microgravity group and the control group, with a total misclassification rate of 16.7%. CONCLUSION: The morphology of the microtubules in the cardiac myocyte cytoskeleton became diffuse under simulated microgravity, and the quantitative analysis of gray-level parameters (variance, skewness, kurtosis) described the variation satisfactorily.
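The three gray-level parameters named above can be computed from the central moments of the pixel intensities, as in the sketch below. It assumes population moments (division by the pixel count) and the non-excess form of kurtosis; the abstract does not state which definitions the authors used, and the function name is hypothetical.

```python
def gray_moments(pixels):
    """Return (variance, skewness, kurtosis) of a flat list of gray levels.

    Uses population central moments; kurtosis is the non-excess form,
    so a normal distribution would give a value near 3.
    """
    n = len(pixels)
    if n == 0:
        raise ValueError("no pixels given")
    mean = sum(pixels) / n
    m2 = sum((p - mean) ** 2 for p in pixels) / n
    m3 = sum((p - mean) ** 3 for p in pixels) / n
    m4 = sum((p - mean) ** 4 for p in pixels) / n
    if m2 == 0:
        raise ValueError("skewness and kurtosis are undefined for a constant image")
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return m2, skewness, kurtosis
```

For a 2D image stored as a list of rows, the pixels can be flattened first, for example with `[p for row in image for p in row]`. The three values per image could then serve as the feature vector for a multivariate discriminant analysis.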
Subjects
Image Processing, Computer-Assisted; Microtubules/ultrastructure; Myocytes, Cardiac/ultrastructure; Weightlessness Simulation; Animals; Cytoskeleton/ultrastructure; Data Interpretation, Statistical; Normal Distribution; Rats; Rats, Wistar; Rotation
ABSTRACT
OBJECTIVE: To study morphological changes of the cytoskeletal microtubules (MT) of fetal rat cardiac myocytes under simulated microgravity, and to quantify their images using gray level co-occurrence matrix (GLCM) parameters. METHOD: Cytoskeleton images, including cellular microphotographs taken under normal or microgravity (clinostat) conditions, were quantified by gray level co-occurrence matrix parameters, and the pharmacological countereffect of quercetin against the influence of microgravity was estimated with these parameters. RESULT: The texture of the microtubules in the images deteriorated under the simulated microgravity environment, and quercetin showed a certain countereffect against the influence of microgravity. CONCLUSION: The microtubules of the cardiac myocyte cytoskeleton become diffuse under microgravity, and the GLCM parameters describe these variations well. Quercetin has a certain countereffect against the influence of microgravity.
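A gray level co-occurrence matrix and a few texture parameters commonly derived from it can be sketched as follows. The abstract does not say which GLCM parameters, pixel offsets, or gray-level quantization the authors used, so the choices here (a single offset, a non-symmetric matrix, and contrast, energy, homogeneity, and entropy) are illustrative assumptions, as are the function names.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Normalised co-occurrence matrix for pixel pairs offset by (dx, dy).

    `image` is a list of rows of integer gray levels in range(levels).
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    if total == 0:
        raise ValueError("image too small for this offset")
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """Return contrast, energy, homogeneity, and entropy of a normalised GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
    }
```

In practice, images are usually quantized to a small number of gray levels first, and features are averaged over several offsets or directions to reduce sensitivity to orientation.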