Results 1 - 8 of 8
1.
PLoS One ; 5(9)2010 Sep 29.
Article in English | MEDLINE | ID: mdl-20927376

ABSTRACT

BACKGROUND: The investigation of the interconnections between the molecular and genetic events that govern biological systems is essential if we are to understand the development of disease and design effective novel treatments. Microarray and next-generation sequencing technologies have the potential to provide this information. However, taking full advantage of these approaches requires that biological connections be made across large quantities of highly heterogeneous genomic datasets. Leveraging the increasingly huge quantities of genomic data in the public domain is fast becoming one of the key challenges in the research community today. METHODOLOGY/RESULTS: We have developed a novel data mining framework that enables researchers to use this growing collection of public high-throughput data to investigate any set of genes or proteins. The connectivity between molecular states across thousands of heterogeneous datasets from microarrays and other genomic platforms is determined through a combination of rank-based enrichment statistics, meta-analyses, and biomedical ontologies. We address data quality concerns through dataset replication and meta-analysis and ensure that the majority of the findings are derived using multiple lines of evidence. As an example of our strategy and the utility of this framework, we apply our data mining approach to explore the biology of brown fat within the context of the thousands of publicly available gene expression datasets. CONCLUSIONS: Our work presents a practical strategy for organizing, mining, and correlating global collections of large-scale genomic data to explore normal and disease biology. Using a hypothesis-free approach, we demonstrate how a data-driven analysis across very large collections of genomic data can reveal novel discoveries and evidence to support existing hypotheses.


Subject(s)
Data Mining , Databases, Genetic , Animals , Database Management Systems , Gene Expression Profiling , Humans , Meta-Analysis as Topic
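
The abstract of entry 1 mentions rank-based enrichment statistics without saying which statistic is used. As a minimal sketch of the general idea only, the following computes a Wilcoxon rank-sum z-score (normal approximation, no tie handling) for a gene set within a ranked gene list. The function name `rank_sum_enrichment` and its inputs are hypothetical and are not taken from the paper.

```python
from math import erf, sqrt


def rank_sum_enrichment(ranked_genes, gene_set):
    """Illustrative rank-sum enrichment score (not the paper's statistic).

    ranked_genes: gene identifiers ordered from most to least up-regulated,
                  assumed to contain no duplicates or ties.
    gene_set:     set of gene identifiers to test for enrichment near the top.
    Returns (z, p) where z > 0 means the set sits nearer the top than chance
    and p is the one-sided normal-approximation p-value.
    """
    n = len(ranked_genes)
    # 1-based ranks of the genes that belong to the set.
    set_ranks = [i + 1 for i, g in enumerate(ranked_genes) if g in gene_set]
    m = len(set_ranks)
    if m == 0 or m == n:
        raise ValueError("gene_set must cover some, but not all, ranked genes")

    observed = sum(set_ranks)
    expected = m * (n + 1) / 2              # mean rank sum under the null
    variance = m * (n - m) * (n + 1) / 12   # variance under the null, no ties
    z = (expected - observed) / sqrt(variance)
    p = 0.5 * (1 - erf(z / sqrt(2)))        # upper-tail probability of z
    return z, p


if __name__ == "__main__":
    genes = [f"g{i}" for i in range(10)]
    print(rank_sum_enrichment(genes, {"g0", "g1", "g2"}))
```

Repeating such a score over many datasets and combining the results is where a meta-analysis step would come in; that step is not sketched here.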
2.
BMC Genomics ; 9 Suppl 2: S2, 2008 Sep 16.
Article in English | MEDLINE | ID: mdl-18831785

ABSTRACT

Structural genomics efforts contribute new protein structures that often lack significant sequence and fold similarity to known proteins. Traditional sequence and structure-based methods may not be sufficient to annotate the molecular functions of these structures. Techniques that combine structural and functional modeling can be valuable for functional annotation. FEATURE is a flexible framework for modeling and recognition of functional sites in macromolecular structures. Here, we present an overview of the main components of the FEATURE framework, and describe the recent developments in its use. These include automating training sets selection to increase functional coverage, coupling FEATURE to structural diversity generating methods such as molecular dynamics simulations and loop modeling methods to improve performance, and using FEATURE in large-scale modeling and structure determination efforts.


Subject(s)
Computational Biology/methods , Genomics/methods , Models, Molecular , Proteins/chemistry , Proteins/metabolism , Algorithms , Artificial Intelligence , Databases, Protein , Protein Conformation , Structure-Activity Relationship
3.
Proteins ; 63(4): 832-45, 2006 Jun 01.
Article in English | MEDLINE | ID: mdl-16508975

ABSTRACT

Correlated mutations have been repeatedly exploited for intramolecular contact map prediction. Over the last decade these efforts yielded several methods for measuring correlated mutations. Nevertheless, the application of correlated mutations for the prediction of intermolecular interactions has not yet been explored. This gap is due to several obstacles, such as the availability of 3D complexes, paralog discrimination, and the availability of sequence pairs that are required for inter- but not intramolecular analyses. Here we selected for analysis fusion protein families that bypass some of these obstacles. We find that several correlated mutation measurements yield reasonable accuracy for intramolecular contact map prediction on the fusion dataset. However, the accuracy level drops sharply in intermolecular contact prediction. This drop in accuracy does not always occur. In the Cohesin-Dockerin family, reasonable accuracy is achieved in the prediction of both intra- and intermolecular contacts. The Cohesin-Dockerin family is well suited for correlated mutation analysis. Because, however, this family constitutes a special case (it has radical mutations, has domain repeats, within each species each Dockerin domain interacts with each Cohesin domain, see below), the successful prediction in this family does not point to a general potential in using correlated mutations for predicting intermolecular contacts. Overall, the results of our study indicate that current methodologies of correlated mutations analysis are not suitable for large-scale intermolecular contact prediction, and thus cannot assist in docking. With current measurements, sequence availability, sequence annotations, and underdeveloped sequence pairing methods, correlated mutations can yield reasonable accuracy only for a handful of families.


Subject(s)
Cell Cycle Proteins/chemistry , Cell Cycle Proteins/classification , Chromosomal Proteins, Non-Histone/chemistry , Chromosomal Proteins, Non-Histone/classification , Mutation/genetics , Nuclear Proteins/chemistry , Nuclear Proteins/classification , Recombinant Fusion Proteins/metabolism , Amino Acids/chemistry , Cell Cycle Proteins/genetics , Cell Cycle Proteins/metabolism , Chemical Phenomena , Chemistry, Physical , Chromosomal Proteins, Non-Histone/genetics , Chromosomal Proteins, Non-Histone/metabolism , Entropy , Nuclear Proteins/genetics , Nuclear Proteins/metabolism , Protein Subunits/chemistry , Protein Subunits/genetics , Protein Subunits/metabolism , Recombinant Fusion Proteins/chemistry , Recombinant Fusion Proteins/genetics , Cohesins
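
The abstract of entry 3 refers to several correlated-mutation measures but does not name them. One widely used measure of this kind is the mutual information between two columns of a multiple sequence alignment. The sketch below shows that measure only as an illustration; the function name `column_mutual_information` and the toy alignment are assumptions, and no gap handling or background correction is applied.

```python
from collections import Counter
from math import log2


def column_mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    msa: list of equal-length aligned sequences (strings).
    High values mean the residues at the two positions co-vary, which is the
    signal used when predicting contacts from correlated mutations.
    """
    n = len(msa)
    count_i = Counter(seq[i] for seq in msa)            # residue counts, column i
    count_j = Counter(seq[j] for seq in msa)            # residue counts, column j
    count_ij = Counter((seq[i], seq[j]) for seq in msa)  # joint pair counts

    mi = 0.0
    for (a, b), c in count_ij.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_i[a] * count_j[b]))
    return mi


if __name__ == "__main__":
    toy_msa = ["AKL", "AKL", "GEL", "GEL"]
    print(column_mutual_information(toy_msa, 0, 1))  # columns that co-vary
    print(column_mutual_information(toy_msa, 0, 2))  # column 2 is invariant
```

For an intermolecular analysis, columns i and j would come from the two interacting partners in a paired alignment, which is where the sequence-pairing obstacle described in the abstract arises.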
4.
Proteins ; 60(2): 217-23, 2005 Aug 01.
Article in English | MEDLINE | ID: mdl-15981251

ABSTRACT

The last 3 rounds (3-5) of CAPRI included a wide range of docking targets. Several targets were especially challenging, since they involved large-scale movements and symmetric rearrangement, while others were based on homology models. We have approached the targets with a variety of geometry-based docking algorithms that include rigid docking, symmetric docking, and flexible docking with symmetry constraints. For all but 1 docking target, we were able to submit at least 1 acceptable quality prediction. Here, we detail for each target the prediction methods used and the specific biological data employed, and supply a retrospective analysis of the results. We highlight the advantages of our techniques, which efficiently exploit the geometric shape complementarity properties of the interaction. These enable them to run in only a few minutes on a standard PC even for flexible docking, thus proving their scalability toward computational genomic scale experiments. We also outline the major required enhancements, such as the introduction of side-chain position refinement and the introduction of flexibility for both docking partners.


Subject(s)
Computational Biology/methods , Protein Interaction Mapping/methods , Proteomics/methods , Algorithms , Computer Simulation , Crystallography, X-Ray , Databases, Protein , Dimerization , Internet , Macromolecular Substances , Models, Molecular , Models, Statistical , Models, Theoretical , Molecular Conformation , Mutation , Protein Conformation , Protein Folding , Protein Structure, Tertiary , Reproducibility of Results , Software , Static Electricity , Structural Homology, Protein
5.
Structure ; 12(6): 1027-38, 2004 Jun.
Article in English | MEDLINE | ID: mdl-15274922

ABSTRACT

Hot spot residues contribute dominantly to protein-protein interactions. Statistically, conserved residues correlate with hot spots, and their occurrence can distinguish between binding sites and the remainder of the protein surface. The hot spot and conservation analyses have been carried out on one side of the interface. Here, we show that both experimental hot spots and conserved residues tend to couple across two-chain interfaces. Intriguingly, the local packing density around both hot spots and conserved residues is higher than expected. We further observe a correlation between local packing density and experimental ΔΔG. Favorable conserved pairs include Gly coupled with aromatics, charged and polar residues, as well as aromatic residue coupling. Remarkably, charged residue couples are underrepresented. Overall, protein-protein interactions appear to consist of regions of high and low packing density, with the hot spots organized in the former. The high local packing density in binding interfaces is reminiscent of protein cores.


Subject(s)
Protein Binding , Alanine/chemistry , Amino Acid Motifs , Amino Acids/chemistry , Animals , Databases as Topic , Glycine/chemistry , Glycoproteins/chemistry , Mice , Models, Molecular , Multigene Family , Mutation , Protein Conformation , Thermodynamics
6.
Protein Sci ; 12(7): 1344-59, 2003 Jul.
Article in English | MEDLINE | ID: mdl-12824481

ABSTRACT

Phage display enables the presentation of a large number of peptides on the surface of phage particles. Such libraries can be tested for binding to target molecules of interest by means of affinity selection. Here we present SiteLight, a novel computational tool for binding site prediction using phage display libraries. SiteLight is an algorithm that maps the 1D peptide library onto a three-dimensional (3D) protein surface. It is applicable to complexes made up of a protein Template and any type of molecule termed Target. Given the three-dimensional structure of a Template and a collection of sequences derived from biopanning against the Target, the Template interaction site with the Target is predicted. We have created a large diverse data set for assessing the ability of SiteLight to correctly predict binding sites. SiteLight predictive mapping enables discrimination between the binding and nonbinding parts of the surface. This prediction can be used to effectively reduce the surface by 75% without excluding the binding site. In 63% of the cases we have tested, there is at least one binding site prediction that overlaps the interface by at least 50%. These results suggest the applicability of phage display libraries for automated binding site prediction on three-dimensional structures. For most effective binding site prediction we propose using a random phage display library twice, to scan both binding partners of a given complex. The derived peptides are mapped to the other binding partner (now used as a Template). Here, the surface of each partner is reduced by 75%, focusing their relative positions with respect to each other significantly. Such information can be utilized to improve docking algorithms and scoring functions.


Subject(s)
Bacteriophages/genetics , Peptide Library , Proteins/genetics , Algorithms , Animals , Bacteriophages/chemistry , Binding Sites , Carrier Proteins/chemistry , Combinatorial Chemistry Techniques , DNA-Binding Proteins , Databases, Protein , HSC70 Heat-Shock Proteins , HSP70 Heat-Shock Proteins/chemistry , HSP70 Heat-Shock Proteins/genetics , Humans , Peptide Fragments/chemistry , Peptide Fragments/isolation & purification , Protein Conformation , Proteins/chemistry , Proteins/classification , Recombinant Fusion Proteins/chemistry , Recombinant Fusion Proteins/genetics , Reproducibility of Results , Transcription Factors
7.
Proteins ; 52(1): 107-12, 2003 Jul 01.
Article in English | MEDLINE | ID: mdl-12784375

ABSTRACT

We present a very efficient rigid "unbound" soft docking methodology, which is based on detection of geometric shape complementarity, allowing liberal steric clash at the interface. The method is based on local shape feature matching, avoiding the exhaustive search of the 6D transformation space. Our experiments at CAPRI rounds 1 and 2 show that although the method does not perform an exhaustive search of the 6D transformation space, the "correct" solution is never lost. However, such a solution might rank low for large proteins, because there are alternatives with significantly larger geometrically compatible interfaces. In many cases this problem can be resolved by successful a priori focusing on the vicinity of potential binding sites as well as the extension of the technique to flexible (hinge-bent) docking. This is demonstrated in the experiments performed as a lesson from our CAPRI experience.


Subject(s)
Algorithms , Antigens, Viral , Models, Molecular , Proteins/chemistry , Proteins/metabolism , Antibodies/chemistry , Antibodies/immunology , Bacterial Proteins/chemistry , Bacterial Proteins/metabolism , Binding Sites , Capsid Proteins/chemistry , Capsid Proteins/immunology , Exotoxins/chemistry , Exotoxins/metabolism , Hemagglutinin Glycoproteins, Influenza Virus/chemistry , Hemagglutinin Glycoproteins, Influenza Virus/immunology , Macromolecular Substances , Membrane Proteins/chemistry , Membrane Proteins/metabolism , Phosphoenolpyruvate Sugar Phosphotransferase System/chemistry , Phosphoenolpyruvate Sugar Phosphotransferase System/metabolism , Protein Interaction Mapping , Protein Serine-Threonine Kinases/chemistry , Protein Serine-Threonine Kinases/metabolism , Receptors, Antigen, T-Cell, alpha-beta/chemistry , Receptors, Antigen, T-Cell, alpha-beta/metabolism , alpha-Amylases/chemistry , alpha-Amylases/metabolism
8.
Proteins ; 47(4): 409-43, 2002 Jun 01.
Article in English | MEDLINE | ID: mdl-12001221

ABSTRACT

The docking field has come of age. The time is ripe to present the principles of docking, reviewing the current state of the field. Two reasons are largely responsible for the maturity of the computational docking area. First, the early optimism that the very presence of the "correct" native conformation within the list of predicted docked conformations signals a near solution to the docking problem, has been replaced by the stark realization of the extreme difficulty of the next scoring/ranking step. Second, in the last couple of years more realistic approaches to handling molecular flexibility in docking schemes have emerged. As in folding, these derive from concepts abstracted from statistical mechanics, namely, populations. Docking and folding are interrelated. From the purely physical standpoint, binding and folding are analogous processes, with similar underlying principles. Computationally, the tools developed for docking will be tremendously useful for folding. For large, multidomain proteins, domain docking is probably the only rational way, mimicking the hierarchical nature of protein folding. The complexity of the problem is huge. Here we divide the computational docking problem into its two separate components. As in folding, solving the docking problem involves efficient search (and matching) algorithms, which cover the relevant conformational space, and selective scoring functions, which are both efficient and effectively discriminate between native and non-native solutions. It is universally recognized that docking of drugs is immensely important. However, protein-protein docking is equally so, relating to recognition, cellular pathways, and macromolecular assemblies. Proteins function when they are bound to other molecules. Consequently, we present the review from both the computational and the biological points of view. Although large, it covers only partially the extensive body of literature, relating to small (drug) and to large protein-protein molecule docking, to rigid and to flexible. Unfortunately, when reviewing these, a major difficulty in assessing the results is the non-uniformity in the formats in which they are presented in the literature. Consequently, we further propose a way to rectify it here.


Subject(s)
Algorithms , Computational Biology/methods , Proteins/chemistry , Proteins/metabolism , Animals , Binding Sites , DNA/metabolism , Drug Design , Hydrogen Bonding , Ligands , Macromolecular Substances , Models, Theoretical , Protein Conformation , Static Electricity