Results 1 - 18 of 18
1.
J Biomol Struct Dyn ; : 1-9, 2023 Dec 11.
Article in English | MEDLINE | ID: mdl-38079339

ABSTRACT

The discovery of intrinsically disordered proteins (IDPs) and of protein hybrids that contain both intrinsically disordered protein regions (IDPRs) and ordered regions has changed the sequence-structure-function paradigm of proteins. These proteins, which lack a persistently fixed structure, are found in all organisms and play vital roles in various biological processes. Some of them are considered potential drug targets owing to their overrepresentation in pathophysiological processes. The major bottlenecks in characterizing such proteins are their occasional overexpression, the difficulty of obtaining a purified homogeneous form and the challenge of investigating them experimentally. Sequence-based prediction of intrinsic disorder remains a useful strategy, especially for large-scale proteomic investigations. However, accuracy remains worst for short disordered regions of fewer than ten residues, for residues close to order-disorder boundaries, for regions that undergo coupled folding and binding in the presence of a partner, and for the prediction of fully disordered proteins. Annotation of fully disordered proteins mostly relies on far-UV circular dichroism experiments, which give the overall secondary structure composition without residue-level resolution. Current methods, including those using secondary structure information, failed to predict half of the target IDPs correctly in the recent Critical Assessment of protein Intrinsic Disorder prediction (CAID) experiment. This study utilized profiles of the random sequential appearance of physicochemical properties of amino acids and of the random sequential appearance of order- and disorder-promoting amino acids in a protein, together with the existing CIDER feature, to predict IDPs from sequence input. Our method was found to significantly outperform the existing predictors across different datasets. Communicated by Ramaswamy H. Sarma.

2.
IEEE/ACM Trans Comput Biol Bioinform ; 20(3): 2112-2121, 2023.
Article in English | MEDLINE | ID: mdl-37018272

ABSTRACT

Among new protein structure predictors, the recently developed AlphaFold predictor relies on a contact map, in line with the contact-map-potential-based threading model, which basically relies on fold recognition. In parallel, sequence-similarity-based homology modelling relies on homologue recognition. Both of these methods rely on sequence-structure or sequence-sequence similarity with a protein of known structure, in the absence of which, as argued in the development of AlphaFold, structure prediction becomes quite challenging. However, the term "known structure" depends on the similarity method adopted to identify it, for example, a sequence match yielding a homologue or a sequence-structure match yielding a fold. Also, quite often, AlphaFold structures are found to be unacceptable by gold-standard structure evaluation parameters. In this context, this work utilized the concept of ordered local physicochemical property, ProtPCV, by Pal et al. (2020), which provides a new similarity criterion for identifying a template protein of known structure. Finally, a template search engine, TemPred, was developed using the ProtPCV similarity criterion. It was intriguing to find that quite often the templates generated by TemPred were better than those produced by conventional search engines. This pointed to the need for a combined approach to obtain a better structural model for a protein.


Subjects
Algorithms , Search Engine , Models, Molecular , Proteins/chemistry , Software
3.
BMC Bioinformatics ; 24(1): 148, 2023 Apr 17.
Article in English | MEDLINE | ID: mdl-37069509

ABSTRACT

BACKGROUND: The concurrent existence of lncRNA and circular RNA in both the nucleus and the cytosol of a cell, in different proportions, is well reported. Previous studies showed that circular RNAs are synthesized in the nucleus and then transported across the nuclear membrane, and that their export is primarily determined by their length. lncRNAs primarily originate through inefficient splicing and seem to use NXF1 for cytoplasmic export. However, it is not clear whether circularization of lncRNA happens only in the nucleus or also occurs in the cytoplasm. Studies indicate that circular RNAs arise when the splicing apparatus undergoes back-splicing. Minor spliceosome (U12 type) mediated splicing occurs in the cytoplasm and is responsible for the splicing of 0.5% of the introns of human cells. Therefore, the possibility of circular RNA biogenesis mediated by the minor spliceosome in the cytoplasm cannot be ruled out. Secondly, information on genes transcribing both circular RNAs and lncRNAs, along with the total number of RBP binding sites for both of these RNA types, is extractable from databases. This study showed how these apparently unconnected reports could be put together to build a model for exploring the biogenesis of circular RNA. RESULTS: A model was built on the premise that sequences with special semantics are molecular precursors in the biogenesis of circular RNA, which occurs through the catalytic role of some specific RBPs. The model outcome was further strengthened by the fulfillment of three logical lemmas, which were extracted and assimilated in this work using a novel data analytic approach, Integrated Cellular Geography. The results of the study were found to be in good agreement with the proposed model. Furthermore, this study also indicated that the biogenesis of circular RNA is a post-transcriptional event.
CONCLUSIONS: Overall, this study provides a novel systems-biology-based model under the paradigm of Integrated Cellular Geography, which can assimilate independently performed experimental results and data published by researchers worldwide on RNA biology to provide important information on the biogenesis of circular RNAs, considering lncRNAs as precursor molecules. This study also suggests possible RBP-mediated circularization of RNA in the cytoplasm through back-splicing using the minor spliceosome.


Subjects
RNA, Circular , RNA, Long Noncoding , Humans , RNA, Circular/metabolism , RNA, Long Noncoding/genetics , RNA, Long Noncoding/metabolism , Semantics , RNA/chemistry , RNA Splicing , Introns , RNA Precursors/genetics
4.
IEEE/ACM Trans Comput Biol Bioinform ; 18(5): 1864-1874, 2021.
Article in English | MEDLINE | ID: mdl-31825870

ABSTRACT

Among the currently available semi-automatic tools for detecting diagnostic probes relevant to a pathophysiological condition, ArrayMining and GEO2R of NCBI are the most popular. A shortcoming of both tools is that they list probes ordered by their individual levels of statistical significance, differing only in the statistical methods used. While the more recent tool, GEO2R, outputs either the top 250 or all genes according to its own ranking mechanism, ArrayMining requires the user to input the number of probes. This study provided a way to select a probe set automatically by voting on the outputs of three statistical methods: the t-test, the Mann-Whitney test and the empirical Bayes moderated t-test. It was also intriguing to find that the parameters of these statistical methods can be represented as a mathematical function of the group Fisher's discriminant ratio of a disease-control expression data pair. This fully automatic method, APT, showed 88.97 percent success in including reported probes, compared with 80.40 and 87.60 percent for ArrayMining and GEO2R, respectively. Furthermore, in 10-fold cross-validation and 5 new test cases, APT performed better than both ArrayMining and GEO2R with regard to sensitivity and specificity.


Subjects
Computational Biology/methods , Gene Expression Profiling/methods , Models, Statistical , Bayes Theorem , Pattern Recognition, Automated/methods
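The voting step described in this abstract can be illustrated with a minimal sketch. It assumes each statistical test has already produced its own list of significant probes; the function name `vote_probes`, the `min_votes` threshold and the probe IDs are hypothetical and are not taken from APT itself.

```python
from collections import Counter


def vote_probes(probe_lists, min_votes=2):
    """Return the probes selected by at least `min_votes` of the methods."""
    votes = Counter()
    for probes in probe_lists:
        votes.update(set(probes))  # each method votes at most once per probe
    return sorted(p for p, n in votes.items() if n >= min_votes)


# Hypothetical probe IDs flagged as significant by three tests.
t_test = ["p1", "p2", "p3"]
mann_whitney = ["p2", "p3", "p4"]
moderated_t = ["p3", "p5"]

selected = vote_probes([t_test, mann_whitney, moderated_t])
print(selected)
```

A majority threshold of two out of three is only one possible choice; the abstract does not state which voting rule the authors used.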
5.
Interdiscip Sci ; 12(3): 276-287, 2020 Sep.
Article in English | MEDLINE | ID: mdl-32524529

ABSTRACT

A protein sequence holds a wealth of experimental information that is yet to be exploited to extract information on protein homologues. Published work shows that dynamic programming, heuristic and HMM profile-based alignment techniques, along with alignment-free techniques, do not directly utilize the ordered profile of physicochemical properties of a protein to identify its homologues. Also, these works are found to lack crucial benchmarking or validation, in the absence of which their incorporation in search engines may appear questionable. In this direction, this research offers a fixed-dimensional numerical representation of protein sequences, extending the concept of the periodicity count value of nucleotide types (2017), so that Euclidean distance can serve as a direct similarity measure between two proteins. Instead of benchmarking against BLAST and PSI-BLAST only, this new similarity measure was also compared with Needleman-Wunsch and Smith-Waterman. To strengthen the comparison, this work introduces for the first time two novel benchmarking methods based on the correlation of "similarity scores" and of "proximity of ranked outputs from a standard sequence alignment method" between all possible pairs of search techniques, including the new one presented in this paper. It is found that the novel numerical representation of a protein can reduce the computational complexity of protein sequence search to O(log(n)). It may also enable various other similarity-based operations, such as clustering, phylogenetic analysis and classification of proteins on the basis of the properties used to build this numerical representation.


Subjects
Software , Cluster Analysis , Computational Biology/methods , Phylogeny , Sequence Analysis, Protein/methods
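The idea of using Euclidean distance as a direct similarity measure between fixed-dimensional protein representations can be sketched as follows. The vectors, protein names and helper functions here are illustrative assumptions; the abstract does not give the actual construction of the numerical representation.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two vectors of equal dimension."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def rank_by_similarity(query, database):
    """Return database keys ordered from nearest to farthest from `query`."""
    return sorted(database, key=lambda name: euclidean(query, database[name]))


# Hypothetical 3-dimensional representations (real ones would be longer).
database = {
    "protA": [1.0, 0.0, 2.0],
    "protB": [4.0, 4.0, 4.0],
    "protC": [1.0, 1.0, 2.0],
}
ranking = rank_by_similarity([1.0, 0.0, 2.0], database)
print(ranking)
```

This sketch scans the whole database, which is linear in the number of proteins. Reaching logarithmic search time, as the abstract reports, would require a spatial index such as a search tree over the vectors, which is not shown here.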
6.
BMC Struct Biol ; 18(1): 16, 2018 Dec 12.
Article in English | MEDLINE | ID: mdl-30541545

ABSTRACT

BACKGROUND: Given the challenge of obtaining a protein structure under the known limitations of both experimental and theoretical techniques, there is still a need for a fast and accurate protein structure evaluation method to substantially reduce the huge gap between the number of known sequences and the number of known structures. Among currently practiced theoretical techniques, homology modelling backed by molecular-dynamics-based optimization appears to be the most popular. However, it suffers from contradictory indications from the different validation parameters generated for a set of protein models predicted for a particular target protein. For example, the Ramachandran Score of a model may be quite high, making it acceptable, whereas its potential energy may not be very low, making it unacceptable, and vice versa. To resolve this problem, the main objective of this study was to utilize a simple experimentally derived output, the Surface Roughness Index of the protein of unknown structure, as an intervening agent that can be obtained from ordinary microscopic images of heat-denatured aggregates of the same protein. RESULT: It was intriguing to observe that direct experimental knowledge of the protein concerned, however simple, might give insight into the acceptability of a particular structural model out of a confusion set of models generated by a database-driven comparative technique for structure prediction. The results obtained from proteins of widely varying structural classes indicated that the speed of protein structure evaluation can be further enhanced without compromising accuracy by recruiting a simple experimental output. CONCLUSION: In this work, a semi-empirical methodological approach was provided for improving protein structure evaluation.
It showed that, once structure models of a protein are obtained through a homology technique, the problem of selecting the best model out of a confusion set of Pareto-optimal structures can be resolved by employing a structural agent directly obtainable through experiment on the same protein. Overall, where a reasonably accurate protein structure is needed for pathogens causing epidemics or used in biological warfare, such an approach could be a plausible solution for fast drug design.


Subjects
Models, Molecular , Proteins/chemistry , Cytochromes c/chemistry , Hemoglobins/chemistry , Protein Conformation , Serum Albumin/chemistry
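The "confusion set of Pareto-optimal structures" mentioned in this abstract can be illustrated with a small sketch. It assumes two objectives per model, a Ramachandran score to maximize and a potential energy to minimize; the field names `rama` and `energy` and all values are hypothetical.

```python
def dominates(a, b):
    """True if model `a` is no worse than `b` on both objectives and better on one."""
    no_worse = a["rama"] >= b["rama"] and a["energy"] <= b["energy"]
    better = a["rama"] > b["rama"] or a["energy"] < b["energy"]
    return no_worse and better


def pareto_front(models):
    """Keep the models that no other model dominates."""
    return [m for m in models if not any(dominates(o, m) for o in models)]


# Hypothetical candidate models for one target protein.
models = [
    {"name": "A", "rama": 95.0, "energy": -100.0},
    {"name": "B", "rama": 90.0, "energy": -150.0},
    {"name": "C", "rama": 88.0, "energy": -90.0},
]
front = [m["name"] for m in pareto_front(models)]
print(front)
```

Models A and B remain because each is better on one objective, which is exactly the situation where an additional, experimentally derived parameter such as the Surface Roughness Index could serve as a tie-breaker.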
7.
Comput Biol Chem ; 77: 28-35, 2018 Dec.
Article in English | MEDLINE | ID: mdl-30205354

ABSTRACT

Circular RNAs are a new class of stable non-coding RNAs whose expression is specific to tissues as well as developmental stages and which are reported to act as gene regulators. The conspicuous presence of some of them as biomarkers for cancers, aging, etc. is well reported. Biogenesis of circular RNA competes with pre-mRNA splicing, using the same splicing machinery and gene loci. Also, some circular RNAs are reported to have open reading frames and an internal ribosome entry site for ribosome binding, which increases the chance of overlapping features between circular and mRNA transcripts. Therefore, discriminating exonic circular RNAs from mRNAs solely through sequence properties is challenging. However, possible discriminating factors have been cited, such as reports of non-canonical arrangement of exons in circular RNAs. This study was dedicated to classifying circular RNAs from mRNAs by using features extracted from sequences as well as from predicted secondary structures, with ANN classifier models for all these feature types. The features were statistics of the di-nucleotide index, emission probability of RNA sequences and entropy of di-nucleotides. Finally, simple decision voting was applied to combine the decisions obtained from multiple classifiers. After performing 10-fold cross-validation we obtained average values of efficiency, sensitivity, specificity and Matthews correlation coefficient of 0.8374, 0.8544, 0.8203 and 0.6753, respectively. Against the backdrop of a few reports on distinguishing circular RNAs from constitutive exons and other long non-coding RNAs, this is the first report of discriminating exonic circular RNAs from mRNAs using sequence and sequence-derived properties.


Subjects
Computational Biology , Exons , RNA/chemistry , Base Sequence , Humans , Nucleic Acid Conformation , RNA/genetics , RNA/metabolism , RNA, Circular , ROC Curve
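One of the features named in this abstract, the entropy of di-nucleotides, can be sketched as a Shannon entropy over di-nucleotide frequencies. This is a minimal illustration that assumes overlapping di-nucleotides and a base-2 logarithm; the abstract does not specify either choice, and the function name is hypothetical.

```python
import math
from collections import Counter


def dinucleotide_entropy(seq):
    """Shannon entropy (in bits) of overlapping di-nucleotide frequencies."""
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    if not pairs:
        return 0.0  # sequences shorter than two bases carry no di-nucleotides
    counts = Counter(pairs)
    total = len(pairs)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# A repetitive sequence has zero entropy; a varied one scores higher.
print(dinucleotide_entropy("AAAA"))
print(dinucleotide_entropy("ACGU"))
```

A value like this would be one entry in a feature vector passed to a classifier, alongside the other sequence and structure features the abstract lists.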
8.
Comput Biol Chem ; 68: 6-11, 2017 Jun.
Article in English | MEDLINE | ID: mdl-28213309

ABSTRACT

The common practice in almost all ligand-binding site (LBS) prediction methods is to reduce the search space considerably, down to a small fraction of the whole protein. This practice assumes that LBS are mostly localized within a search subspace of cavities, which topologically appear as valleys in the protein surface. Therefore, extraction of cavities is considered the most important preprocessing step for predicting LBS. However, prediction of LBS based on the cavity search subspace is found to fail for some proteins. To solve this problem, a new search subspace was introduced, which successfully localized LBS in most of the proteins used in this work for which the cavity-based method MetaPocket 2.0 failed. This work therefore appears to complement the existing binding site prediction methods through its applicability to the set of proteins for which cavity-based methods might fail. Also, to decide for which proteins the new subspace should be explored instead of the cavity subspace, a decision framework based on a simple heuristic was built, using geometric parameters of cavities extracted through MetaPocket 2.0. It is found that the choice between the new and the cavity search subspace can be predicted correctly for nearly 87.5% of test proteins.


Subjects
Ligands , Proteins/chemistry , Binding Sites
9.
Interdiscip Sci ; 9(2): 173-183, 2017 Jun.
Article in English | MEDLINE | ID: mdl-26825665

ABSTRACT

Online retrieval of homologous nucleotide sequences from a given database of sequences through existing alignment techniques is common practice. The salient point of these techniques is their dependence on local alignment and scoring matrices, the reliability of which is limited by computational complexity and accuracy. In this direction, this work offers a novel numerical representation of genes that can further help in dividing the data space into smaller partitions, supporting the formation of a search tree. In this context, this paper introduces a 36-dimensional Periodicity Count Value (PCV), which is representative of a particular nucleotide sequence and is adapted from the stochastic model concept of Kolekar et al. (American Institute of Physics 1298:307-312, 2010. doi: 10.1063/1.3516320). The PCV construct uses information on the physicochemical properties of nucleotides and their positional distribution pattern within a gene. It is observed that the PCV representation of a gene reduces the computational cost of calculating distances between a pair of genes while being consistent with the existing methods. The validity of the PCV-based method was further tested through its use in molecular phylogeny constructs, in comparison with those built using existing sequence alignment methods.


Subjects
Computational Biology/methods , Evolution, Molecular , Phylogeny , Algorithms , Base Sequence , Sequence Alignment , Sequence Analysis, DNA/methods
10.
Interdiscip Sci ; 9(1): 65-71, 2017 Mar.
Article in English | MEDLINE | ID: mdl-27878456

ABSTRACT

To extract information on the binding sites of a protein, the commonly known geometry-based methods first use the corresponding PDB file to extract the protein surface. The surface is then used to find the binding site atoms. As shown in this paper, none of the most widely used surface extraction methods can retrieve a sizeable percentage of the binding site atoms, so there remains scope for developing a better method. In this direction, this paper presents a new benchmarking criterion, based on binding site information, for comparing the performance of these surface extraction methods. Also, a new surface extraction method is introduced, based on a rotating cylinder probe adapted from the work of Weisel et al. (Chem Cent J 1:7-23, 2007. doi: 10.1186/1752-153X-1-7). The new method shows a significant improvement in performance over the existing methods.


Subjects
Proteins/chemistry , Proteins/isolation & purification , Binding Sites , Models, Molecular , Protein Conformation
11.
Bioinformation ; 8(20): 984-7, 2012.
Article in English | MEDLINE | ID: mdl-23275692

ABSTRACT

The current practice of validating a predicted protein structural model is knowledge-based: scoring parameters are derived from already known structures, and the validation decision is made from this structural information. For example, the scoring parameter Ramachandran Score gives the percentage conformity with steric properties, a higher value of which implies higher acceptability. On the other hand, the Force-Field Energy Score gives conformity with energetic stability, a higher value of which implies lower acceptability. Naturally, setting these two scoring parameters as target objectives sometimes yields a set of multiple models for the same protein, for which acceptance based on one parameter, say the Ramachandran score, may not agree with acceptance of the same model based on the other parameter, say the energy score. The confusion set of such models can be further resolved by introducing parameters whose values are easily obtainable through experiment on the same protein. In this work it was found that the confusion regarding final acceptance of one model out of multiple models of the same protein can be removed using a parameter, the Surface Roughness Index, which can be obtained through a semi-empirical method from an ordinary microscopic image of the heat-denatured protein.

12.
Bioinformation ; 6(6): 240-3, 2011.
Article in English | MEDLINE | ID: mdl-21887014

ABSTRACT

Identification of promoter regions is an important part of gene annotation. Identification of promoters in eukaryotes is important because promoters modulate various metabolic functions and cellular stress responses. In this work, a novel approach utilizing intensity values of tiling microarray data for the model eukaryotic plant Arabidopsis thaliana was used to distinguish promoter regions from non-promoter regions. A feed-forward back-propagation neural network model supported by a genetic algorithm was employed to predict the class of the data with a window size of 41. The dataset comprised 2992 data vectors representing both promoter and non-promoter regions, chosen randomly from probe intensity vectors for the whole genome of Arabidopsis thaliana generated through the tiling microarray technique. The classifier model shows prediction accuracies of 69.73% and 65.36% on the training and validation sets, respectively. Further, a concept of distance-based class membership was used to validate the reliability of the classifier, which showed promising results. The study shows the usability of microarray probe intensities for predicting promoter regions in eukaryotic genomes.
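The windowing of probe intensities into fixed-length input vectors can be sketched as below. This assumes that a "window size of 41" means 41 consecutive probe intensities per vector with a stride of one; the abstract does not confirm this reading, and the function name and data are hypothetical.

```python
def sliding_windows(values, size=41):
    """Return every run of `size` consecutive values, moving one step at a time."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [values[i:i + size] for i in range(len(values) - size + 1)]


# Hypothetical probe intensities along a stretch of genome.
intensities = [float(i) for i in range(45)]
windows = sliding_windows(intensities)
print(len(windows), len(windows[0]))
```

Each window would then be labeled as promoter or non-promoter and passed to the classifier as one data vector.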

13.
Bioinformation ; 6(4): 158-61, 2011 May 07.
Article in English | MEDLINE | ID: mdl-21572883

ABSTRACT

The current work aimed to predict a parametric relationship between the aggregate and individual properties of a protein. In this approach, the individual property of a protein was taken to be its Surface Roughness Index (SRI), which has been shown to have the potential to classify SCOP protein families. The bulk property was taken to be the Intensity Level based Multi-fractal Dimension (ILMFD) of ordinary microscopic images of heat-denatured protein aggregates, which is known to have the potential to serve as a protein marker. The protocol used multiple ILMFD inputs obtained for a protein to produce a set of mapped outputs as possible SRI candidates. The outputs were then clustered, and the centre of the largest cluster, after normalization, was found to be a close approximation of the expected SRI calculated from the known PDB structure. The outcome showed that faster derivation of an individual protein's surface property might be possible using its bulk form, heat-denatured aggregates.

14.
Bioinformation ; 7(7): 320-3, 2011.
Article in English | MEDLINE | ID: mdl-22355230

ABSTRACT

The study of the geometric properties of nanoparticles and their relation to biomolecular activities, especially those of proteins, is quite a new field to explore. This work was carried out in this direction: images of gold nanoparticles obtained from transmission electron microscopy were processed to extract their size and area profiles under different experimental conditions, with and without a protein, citrate synthase. Since the images were ill-posed, the texture of a context window around each pixel was used as input to a back-propagation network architecture to decide on the pixel's membership as nanoparticle. The segmented images were further analysed by k-means clustering to derive the geometric properties of individual nanoparticles, even from their assembled form. The extracted geometric information was found to be crucial for a model featuring a porous cage-like configuration of the nanoparticle assembly, with which the chaperone-like activity of gold nanoparticles can be explained.

15.
Bioinformation ; 3(9): 384-8, 2009 Jun 13.
Article in English | MEDLINE | ID: mdl-19707563

ABSTRACT

Screening of "drug-like" molecules from molecular databases produced through high-throughput techniques, and from their large repositories, requires robust classification. In our work, a set of nine heuristically chosen molecular descriptors, including four from Lipinski's rule, was used as classification parameters for screening "drug-like" molecules. The robustness of classification was compared with that obtained using the four fundamental descriptors of Lipinski. A back-propagation neural network based classifier was applied to a database of 60000 molecules to classify "drug-like" and "non drug-like" molecules. Classification using nine descriptors showed a high accuracy of 96.1%, compared with an accuracy of 82.48% using the four Lipinski descriptors. Also, a significant decrease in false positives resulted from using nine descriptors, causing a sharp 18% increase in the specificity of classification. From this study it appeared that Lipinski's descriptors, which mainly deal with the pharmacokinetic properties of molecules, form the basis for identification of "drug-like" molecules, and that this identification can be substantially improved by adding more descriptors representing the pharmacodynamic properties of molecules.
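For context, the four Lipinski descriptors referred to in this abstract can be sketched as a simple threshold rule using the standard rule-of-five limits (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors). This is not the neural network classifier used in the study; the class name, function names and example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Molecule:
    mol_weight: float  # molecular weight in daltons
    log_p: float       # octanol-water partition coefficient
    h_donors: int      # hydrogen-bond donors
    h_acceptors: int   # hydrogen-bond acceptors


def lipinski_violations(m):
    """Count how many of the four rule-of-five limits the molecule exceeds."""
    return sum([
        m.mol_weight > 500,
        m.log_p > 5,
        m.h_donors > 5,
        m.h_acceptors > 10,
    ])


def passes_rule_of_five(m, max_violations=1):
    """The usual convention tolerates at most one violation."""
    return lipinski_violations(m) <= max_violations


# Hypothetical descriptor values for two molecules.
small = Molecule(mol_weight=180.2, log_p=1.2, h_donors=1, h_acceptors=4)
large = Molecule(mol_weight=820.0, log_p=6.3, h_donors=7, h_acceptors=12)
print(passes_rule_of_five(small), passes_rule_of_five(large))
```

In the study, these four values, together with five further descriptors, were inputs to a trained classifier rather than fixed cut-offs.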

16.
Bioinformation ; 3(6): 268-74, 2009.
Article in English | MEDLINE | ID: mdl-19255647

ABSTRACT

The use of knowledge-based scoring functions (KBSF) for virtual screening and molecular docking has become an established method in drug discovery. The lack of a precise and reliable free energy function that describes the several interactions involved, including water-mediated atomic interactions between amino acid residues and the ligand, makes distance-based statistical measures the only alternative. Until now, all distance-based scoring functions in the KBSF arena have used the atom singularity concept, which neglects the environment of the atom under consideration. We have developed a novel knowledge-based statistical energy function for protein-ligand complexes that takes the atomic environment into account and hence treats a functional group as a singular entity. The proposed knowledge-based scoring function is fast, simple to construct and easy to use; moreover, it tackles the existing problem of handling molecular orientation in the active site pocket. We have designed and used a Functional group based Ligand Retrieval (FBLR) system, which can identify and detect the orientation of functional groups in a ligand. This decoy searching was used to build the above KBSF to quantify the activity and affinity of high-resolution protein-ligand complexes. We have proposed the probable use of these decoys in molecular build-up as a de novo drug design approach. We have also discussed the possible use of the said KBSF in pharmacophore fragment detection and in a pseudo-center-based fragment alignment procedure.

17.
Bioinformation ; 2(9): 379-83, 2008 Jun 23.
Article in English | MEDLINE | ID: mdl-18795110

ABSTRACT

The multi-fractal property of heat-denatured protein aggregates (HDPA) is characteristic of the protein's individual form. The basis of the study is the visual similarity between digitally generated microscopic images of HDPA and surface images of the corresponding individual X-ray structures in the Protein Data Bank (PDB), displayed using the Visual Molecular Dynamics (VMD) viewer. We designed experiments to view the fractal nature of proteins at different aggregate scales. Intensity level based multi-fractal dimensions (ILMFD) extracted from various planes of digital microscopic images of protein aggregates were used to characterize HDPA into different classes. Moreover, the ILMFD parameters extracted from aggregates show a classification pattern similar to that of digital images of the protein surface displayed by the VMD viewer from the PDB entry. We discuss the use of irregular patterns of heat-denatured protein aggregates to understand various surface properties of native proteins.

18.
Bioinformation ; 2(3): 113-8, 2007 Dec 05.
Article in English | MEDLINE | ID: mdl-18288335

ABSTRACT

Small molecules play a crucial role in the modulation of biological functions by interacting with specific macromolecules. Hence, small molecule interactions are captured by a variety of experimental methods to estimate and propose correlations between molecular structures and their biological activities. The tremendous expansion in publicly available small molecules is also driving new efforts to better understand interactions involving small molecules, particularly in the areas of drug docking and pharmacogenomics. We have studied and designed a functional group identification system with an associated ontology. The functional group identification system can detect the functional group components of a given ligand structure with specific coordinate information. The functional group ontology (FGO) we propose is a structured classification of chemical functional groups, which acts as an important source of prior knowledge that may be automatically integrated to support identification, categorization and predictive data analysis tasks. We have used a new annotation method that can reconstruct the original structure from a given ontological expression using exact coordinate information. Here, we also discuss an ontology-driven similarity measure for functional groups and uses of this novel ontology for pharmacophore searching and de novo ligand design.
