Results 1 - 14 of 14
1.
ArXiv ; 2021 May 07.
Article in English | MEDLINE | ID: mdl-33972927

ABSTRACT

With recent advances in sequencing technology, it has become affordable and practical to sequence genomes to very high depth of coverage, allowing researchers to discover low-frequency variants in the genome. However, because sequencing introduces errors, developing algorithms that can separate noise from true variants remains an active area of research. LoFreq is a state-of-the-art algorithm for low-frequency variant detection but has a relatively long runtime compared to other tools. In addition, its interface for running in parallel could be simplified to allow multithreading as well as distributing jobs to a cluster. In this work we describe specific contributions to LoFreq that remedy these issues.
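
The abstract does not say how the parallel work is divided. One common pattern, assumed here purely for illustration, is to split a contig into regions and process each region independently. In this minimal sketch `call_region` is a hypothetical placeholder that returns toy positions; it is not LoFreq's interface, and `split_regions` and `call_in_parallel` are illustrative names as well.

```python
from concurrent.futures import ThreadPoolExecutor


def split_regions(contig_length, chunk_size):
    """Split [0, contig_length) into half-open chunks of at most chunk_size bases."""
    return [(start, min(start + chunk_size, contig_length))
            for start in range(0, contig_length, chunk_size)]


def call_region(region):
    """Hypothetical stand-in for a per-region variant call (NOT a LoFreq function).

    It reports every position divisible by 1000 as a toy "variant".
    """
    start, end = region
    return [pos for pos in range(start, end) if pos % 1000 == 0]


def call_in_parallel(contig_length, chunk_size, workers=4):
    """Process all regions with a thread pool and merge results in region order."""
    regions = split_regions(contig_length, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_region = list(pool.map(call_region, regions))  # map preserves input order
    return [variant for calls in per_region for variant in calls]
```

Because the regions are independent, the same region list could just as well be submitted as separate jobs to a cluster scheduler instead of a local thread pool.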

2.
BMC Syst Biol ; 10 Suppl 2: 49, 2016 08 01.
Article in English | MEDLINE | ID: mdl-27490494

ABSTRACT

BACKGROUND: Simulating protein folding motions is an important problem in computational biology. Motion planning algorithms, such as Probabilistic Roadmap Methods, have been successful in modeling the folding landscape. Probabilistic Roadmap Methods and variants contain several phases (i.e., sampling, connection, and path extraction). Most of the time is spent in the connection phase, and selecting which variant to employ is a difficult task. Global machine learning has been applied to the connection phase but is inefficient in situations with varying topology, such as those typical of folding landscapes. RESULTS: We develop a local learning algorithm that exploits the past performance of methods within the neighborhood of the current connection attempts as a basis for learning. It is sensitive not only to different types of landscapes but also to differing regions in the landscape itself, removing the need to explicitly partition the landscape. We perform experiments on 23 proteins of varying secondary structure makeup with 52-114 residues. We compare the success rate when using our methods and other methods. We demonstrate a clear need for learning (i.e., only learning methods were able to validate against all available experimental data) and show that local learning is superior to global learning, producing, in many cases, significantly higher-quality results than the other methods. CONCLUSIONS: We present an algorithm that uses local learning to select appropriate connection methods in the context of roadmap construction for protein folding. Our method removes the burden of deciding which method to use, leverages the strengths of the individual input methods, and is extensible to future connection methods.


Subject(s)
Computational Biology/methods , Machine Learning , Protein Folding , Proteins/chemistry , Models, Molecular , Movement , Protein Conformation , Proteins/metabolism , Thermodynamics
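
The local-learning idea in the abstract above (choose a connection method from how methods performed near the current attempt) can be sketched roughly as follows. This is not the authors' algorithm: the k-nearest-neighbor lookup, the Laplace smoothing, the weighted random choice, and all function names are assumptions made for illustration.

```python
import math
import random


def nearest_attempts(history, point, k):
    """Return the k past attempts closest to `point` (Euclidean distance)."""
    return sorted(history, key=lambda a: math.dist(a["point"], point))[:k]


def local_success_rates(history, point, methods, k=5):
    """Laplace-smoothed success rate of each method among the k nearest attempts."""
    wins = {m: 1 for m in methods}   # one pseudo-success per method
    tries = {m: 2 for m in methods}  # one pseudo-success plus one pseudo-failure
    for attempt in nearest_attempts(history, point, k):
        tries[attempt["method"]] += 1
        wins[attempt["method"]] += int(attempt["success"])
    return {m: wins[m] / tries[m] for m in methods}


def choose_method(history, point, methods, k=5, rng=random):
    """Pick a method with probability proportional to its local success rate."""
    rates = local_success_rates(history, point, methods, k)
    return rng.choices(methods, weights=[rates[m] for m in methods])[0]
```

Smoothing keeps untried methods selectable, so a method that has not yet been attempted in a region still gets a chance there.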
3.
J Comput Biol ; 22(9): 823-36, 2015 Sep.
Article in English | MEDLINE | ID: mdl-26258648

ABSTRACT

Predicting protein structures and simulating protein folding are two of the most important problems in computational biology today. Simulation methods rely on a scoring function to distinguish the native structure (the most energetically stable) from non-native structures. Decoy databases are collections of non-native structures used to test and verify these functions. We present a method to evaluate and improve the quality of decoy databases by adding novel structures and removing redundant structures. We test our approach on 20 different decoy databases of varying size and type and show significant improvement across a variety of metrics. We also test our improved databases on two popular modern scoring functions and show that for most cases they contain a greater or equal number of native-like structures than the original databases, thereby producing a more rigorous database for testing scoring functions.


Subject(s)
Computational Biology/methods , Databases, Protein , Protein Folding , Proteins/chemistry , Algorithms , Computer Simulation , Protein Conformation
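
As a rough illustration of "adding novel structures and removing redundant structures", the sketch below filters a decoy set with a distance threshold. The abstract does not specify the authors' criteria, so the greedy strategy, the plain coordinate RMSD without superposition, and the function names are all assumptions.

```python
import math


def rmsd(a, b):
    """Root-mean-square deviation between two equal-length coordinate lists.

    No structural superposition is performed; this is a simplification.
    """
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def remove_redundant(decoys, threshold):
    """Greedily keep a decoy only if it is at least `threshold` from every kept decoy."""
    kept = []
    for decoy in decoys:
        if all(rmsd(decoy, other) >= threshold for other in kept):
            kept.append(decoy)
    return kept


def add_if_novel(decoys, candidate, threshold):
    """Return a new list including `candidate` only if it is far from all existing decoys."""
    if all(rmsd(candidate, other) >= threshold for other in decoys):
        return decoys + [candidate]
    return decoys
```

The greedy pass depends on input order: the first member of each cluster of near-duplicates is the one that survives.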
4.
Clin Nurs Res ; 18(4): 291-306, 2009 Nov.
Article in English | MEDLINE | ID: mdl-19741240

ABSTRACT

Women frequently fail to recognize that coronary heart disease (CHD), not breast cancer, is the primary cause of female mortality. CHD mortality among U.S. mainland Puerto Rican (PR) women is second only to that among African American women. It is unknown what PR women understand about their risk, what factors they believe contribute to CHD, or whether they know the atypical symptoms often experienced by women. Most CHD studies exclude Hispanic women. Those that include them often aggregate their results, making subgroup variations invisible. This study explored awareness of CHD symptoms, risks, and help-seeking behaviors among 12 PR women. Focus group methodology revealed that participants were unaware of their risk and had misconceptions about CHD symptoms and contributing factors. Barriers to early recognition and treatment included lack of knowledge, gender role conflict (caregiver vs. care recipient), and fears of falsely alarming family members or the embarrassment of feeling "dismissed" by health care providers.


Subject(s)
Attitude to Health/ethnology , Coronary Disease , Health Education , Hispanic or Latino/psychology , Transcultural Nursing/methods , Adult , Coronary Disease/ethnology , Coronary Disease/nursing , Coronary Disease/psychology , Culture , Female , Hispanic or Latino/statistics & numerical data , Humans , Middle Aged , Risk Factors
5.
J Mol Biol ; 381(4): 1055-67, 2008 Sep 12.
Article in English | MEDLINE | ID: mdl-18639245

ABSTRACT

We present a general computational approach to simulate RNA folding kinetics that can be used to extract population kinetics, folding rates and the formation of particular substructures that might be intermediates in the folding process. Simulating RNA folding kinetics can provide unique insight into RNA whose functions are dictated by folding kinetics and not always by nucleotide sequence or the structure of the lowest free-energy state. The method first builds an approximate map (or model) of the folding energy landscape from which the population kinetics are analyzed by solving the master equation on the map. We present results obtained using an analysis technique, map-based Monte Carlo simulation, which stochastically extracts folding pathways from the map. Our method compares favorably with other computational methods that begin with a comprehensive free-energy landscape, illustrating that the smaller, approximate map captures the major features of the complete energy landscape. As a result, our method scales to larger RNAs. For example, here we validate kinetics of RNA of more than 200 nucleotides. Our method accurately computes the kinetics-based functional rates of wild-type and mutant ColE1 RNAII and MS2 phage RNAs showing excellent agreement with experiment.


Subject(s)
Computer Simulation , Nucleic Acid Conformation , RNA/chemistry , RNA/metabolism , Animals , Base Sequence , Kinetics , Molecular Sequence Data , RNA/genetics , RNA, Spliced Leader/chemistry , RNA, Spliced Leader/genetics , Reproducibility of Results , Thermodynamics , Time Factors , Trypanosomatina
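
Solving the master equation on a map, as described in the abstract above, amounts to integrating dp_i/dt = sum_j (k_ji p_j - k_ij p_i) over the map's nodes. The sketch below does this for a tiny three-node map. The Metropolis-style rate law, the explicit Euler integrator, and the function names are assumptions chosen for brevity, not the method used in the paper.

```python
import math


def transition_rates(energies, edges, kT=1.0):
    """Metropolis-style rates for both directions of each undirected map edge."""
    rates = {}
    for i, j in edges:
        rates[(i, j)] = min(1.0, math.exp(-(energies[j] - energies[i]) / kT))
        rates[(j, i)] = min(1.0, math.exp(-(energies[i] - energies[j]) / kT))
    return rates


def integrate_master_equation(p0, rates, dt=0.01, steps=5000):
    """Explicit Euler integration of dp_i/dt = sum_j (k_ji p_j - k_ij p_i)."""
    p = list(p0)
    for _ in range(steps):
        dp = [0.0] * len(p)
        for (i, j), k in rates.items():
            flow = k * p[i] * dt  # probability moving from node i to node j this step
            dp[i] -= flow
            dp[j] += flow
        p = [pi + d for pi, d in zip(p, dp)]
    return p
```

Because these rates satisfy detailed balance, the populations on a connected map relax toward the Boltzmann distribution, which gives a convenient sanity check on the integrator.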
6.
J Comput Biol ; 14(6): 839-55, 2007.
Article in English | MEDLINE | ID: mdl-17691897

ABSTRACT

Protein motions, ranging from molecular flexibility to large-scale conformational change, play an essential role in many biochemical processes. Despite the explosion in our knowledge of structural and functional data, our understanding of protein movement is still very limited. In previous work, we developed and validated a motion-planning-based method for mapping protein folding pathways from unstructured conformations to the native state. In this paper, we propose a novel method based on rigidity theory to sample conformation space more effectively, and we describe extensions of our framework to automate the process and to map transitions between specified conformations. Our results show that these additions both improve the accuracy of our maps and enable us to study a broader range of motions for larger proteins. For example, we show that rigidity-based sampling results in maps that capture subtle folding differences between protein G and its mutants, NuG1 and NuG2, and we illustrate how our technique can be used to study large-scale conformational changes in calmodulin, a 148-residue signaling protein known to undergo conformational changes when binding to Ca²⁺. Finally, we announce our web-based protein folding server, which includes a publicly available archive of protein motions (http://parasol.tamu.edu/foldingserver/).


Subject(s)
Calmodulin/chemistry , Computational Biology , GTP-Binding Proteins/chemistry , Calmodulin/metabolism , Computer Simulation , GTP-Binding Proteins/genetics , GTP-Binding Proteins/metabolism , Models, Molecular , Models, Statistical , Protein Conformation , Protein Folding , Protein Structure, Secondary , Thermodynamics
7.
Bioinformatics ; 23(13): i539-48, 2007 Jul 01.
Article in English | MEDLINE | ID: mdl-17646341

ABSTRACT

MOTIVATION: Protein motions play an essential role in many biochemical processes. Lab studies often quantify these motions in terms of their kinetics such as the speed at which a protein folds or the population of certain interesting states like the native state. Kinetic metrics give quantifiable measurements of the folding process that can be compared across a group of proteins such as a wild-type protein and its mutants. RESULTS: We present two new techniques, map-based master equation solution and map-based Monte Carlo simulation, to study protein kinetics through folding rates and population kinetics from approximate folding landscapes, models called maps. From these two new techniques, interesting metrics that describe the folding process, such as reaction coordinates, can also be studied. In this article we focus on two metrics, formation of helices and structure formation around tryptophan residues. These two metrics are often studied in the lab through circular dichroism (CD) spectra analysis and tryptophan fluorescence experiments, respectively. The approximated landscape models we use here are the maps of protein conformations and their associated transitions that we have presented and validated previously. In contrast to other methods such as the traditional master equation and Monte Carlo simulation, our techniques are both fast and can easily be computed for full-length detailed protein models. We validate our map-based kinetics techniques by comparing folding rates to known experimental results. We also look in depth at the population kinetics, helix formation and structure near tryptophan residues for a variety of proteins. AVAILABILITY: We invite the community to help us enrich our publicly available database of motions and kinetics analysis by submitting to our server: http://parasol.tamu.edu/foldingserver/.


Subject(s)
Algorithms , Models, Chemical , Protein Folding , Proteins/chemistry , Proteins/ultrastructure , Sequence Analysis, Protein/methods , Computer Simulation , Kinetics , Models, Molecular , Motion
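
Map-based Monte Carlo simulation, as the abstract above describes it, stochastically extracts folding pathways from a map. A minimal sketch of that idea is a weighted random walk over the map's edges until the native node is reached. The edge weights, the node labels, and the function name `sample_pathway` are hypothetical; the paper's actual transition probabilities are not given in the abstract.

```python
import random


def sample_pathway(neighbors, start, native, rng, max_steps=10_000):
    """Walk the map from `start` until `native` is reached.

    `neighbors[node]` maps each adjacent node to a positive weight; the next
    node is drawn with probability proportional to that weight. Returns the
    visited path, or None if `native` is not reached within `max_steps` moves.
    """
    path = [start]
    node = start
    while node != native and len(path) <= max_steps:
        candidates, weights = zip(*neighbors[node].items())
        node = rng.choices(candidates, weights=weights)[0]
        path.append(node)
    return path if node == native else None
```

Running many walks from the same start and recording path lengths or visited intermediates gives population-level statistics without ever enumerating the full landscape.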
9.
Phys Biol ; 2(4): S148-55, 2005 Nov 09.
Article in English | MEDLINE | ID: mdl-16280620

ABSTRACT

We investigate a novel approach for studying protein folding that has evolved from robotics motion planning techniques called probabilistic roadmap methods (PRMs). Our focus is to study issues related to the folding process, such as the formation of secondary and tertiary structures, assuming we know the native fold. A feature of our PRM-based framework is that the large sets of folding pathways in the roadmaps it produces, in just a few hours on a desktop PC, provide global information about the protein's energy landscape. This is an advantage over other simulation methods such as molecular dynamics or Monte Carlo methods which require more computation and produce only a single trajectory in each run. In our initial studies, we obtained encouraging results for several small proteins. In this paper, we investigate more sophisticated techniques for analyzing the folding pathways in our roadmaps. In addition to more formally revalidating our previous results, we present a case study showing that our technique captures known folding differences between the structurally similar proteins G and L.


Subject(s)
Biophysics/methods , Computational Biology/methods , Protein Folding , Animals , Computer Simulation , Humans , Models, Biological , Models, Theoretical , Monte Carlo Method , Motion , Probability , Protein Conformation , Protein Structure, Secondary , Software , Thermodynamics
10.
J Comput Biol ; 12(6): 862-81, 2005.
Article in English | MEDLINE | ID: mdl-16108722

ABSTRACT

We propose a novel, motion-planning-based approach to approximately map the energy landscape of an RNA molecule. A key feature of our method is that it provides a sparse map that captures the main features of the energy landscape, which can be analyzed to compute folding kinetics. Our method is based on probabilistic roadmap motion planners that we have previously successfully applied to protein folding. In this paper, we provide evidence that this approach is also well suited to RNA. We compute population kinetics and transition rates on our roadmaps using the master equation for a few moderately sized RNAs and show that our results compare favorably with results of other existing methods.


Subject(s)
Computational Biology , Models, Biological , Models, Chemical , Nucleic Acid Conformation , RNA/chemistry , RNA/metabolism , Thermodynamics , Kinetics
11.
IEEE Trans Syst Man Cybern B Cybern ; 34(2): 912-24, 2004 Apr.
Article in English | MEDLINE | ID: mdl-15376839

ABSTRACT

This paper presents a generalized framework for dynamic simulation realized in a prototype simulator called the Interactive Generalized Motion Simulator (I-GMS), which can simulate motions of multirigid-body systems with contact interaction in virtual environments. I-GMS is designed to meet two important goals: generality and interactivity. By generality, we mean a dynamic simulator which can easily support various systems of rigid bodies, ranging from a single free-flying rigid object to complex linkages such as those needed for robotic systems or human body simulation. To provide this generality, we have developed I-GMS in an object-oriented framework. The user interactivity is supported through a haptic interface for articulated bodies, introducing interactive dynamic simulation schemes. This user-interaction is achieved by performing push and pull operations via the PHANToM haptic device, which runs as an integrated part of I-GMS. Also, a hybrid scheme was used for simulating internal contacts (between bodies in the multirigid-body system) in the presence of friction, which could avoid the nonexistent solution problem often faced when solving contact problems with Coulomb friction. In our hybrid scheme, two impulse-based methods are exploited so that different methods are applied adaptively, depending on whether the current contact situation is characterized as "bouncing" or "steady." We demonstrate the user-interaction capability of I-GMS through on-line editing of trajectories of a 6-degree of freedom (dof) articulated structure.


Subject(s)
Algorithms , Computer Simulation , Joints/physiology , Models, Biological , Movement/physiology , Robotics/methods , User-Computer Interface , Biomechanical Phenomena/methods , Humans , Nonlinear Dynamics
12.
J Comput Biol ; 10(3-4): 239-55, 2003.
Article in English | MEDLINE | ID: mdl-12935327

ABSTRACT

We investigate a novel approach for studying the kinetics of protein folding. Our framework has evolved from robotics motion planning techniques called probabilistic roadmap methods (PRMs) that have been applied in many diverse fields with great success. In our previous work, we presented our PRM-based technique and obtained encouraging results studying protein folding pathways for several small proteins. In this paper, we describe how our motion planning framework can be used to study protein folding kinetics. In particular, we present a refined version of our PRM-based framework and describe how it can be used to produce potential energy landscapes, free energy landscapes, and many folding pathways all from a single roadmap which is computed in a few hours on a desktop PC. Results are presented for 14 proteins. Our ability to produce large sets of unrelated folding pathways may potentially provide crucial insight into some aspects of folding kinetics, such as proteins that exhibit both two-state and three-state kinetics that are not captured by other theoretical techniques.


Subject(s)
Computational Biology , Protein Folding , Kinetics
13.
Pac Symp Biocomput ; : 240-51, 2003.
Article in English | MEDLINE | ID: mdl-12603032

ABSTRACT

We investigate a novel approach for studying protein folding that has evolved from robotics motion planning techniques called probabilistic roadmap methods (PRMs). Our focus is to study issues related to the folding process, such as the formation of secondary and tertiary structure, assuming we know the native fold. A feature of our PRM-based framework is that the large sets of folding pathways in the roadmaps it produces, in a few hours on a desktop PC, provide global information about the protein's energy landscape. This is an advantage over other simulation methods such as molecular dynamics or Monte Carlo methods which require more computation and produce only a single trajectory in each run. In our initial studies, we obtained encouraging results for several small proteins. In this paper, we investigate more sophisticated techniques for analyzing the folding pathways in our roadmaps. In addition to more formally revalidating our previous results, we present a case study showing that our technique captures known folding differences between the structurally similar proteins G and L.


Subject(s)
Bacterial Proteins , Models, Molecular , Protein Folding , Computer Simulation , DNA-Binding Proteins/chemistry , Models, Statistical , Monte Carlo Method , Nerve Tissue Proteins/chemistry , Protein Structure, Secondary , Thermodynamics
14.
J Comput Biol ; 9(2): 149-68, 2002.
Article in English | MEDLINE | ID: mdl-12015875

ABSTRACT

We present a framework for studying protein folding pathways and potential landscapes which is based on techniques recently developed in the robotics motion planning community. Our focus in this work is to study the protein folding mechanism assuming we know the native fold. That is, instead of performing fold prediction, we aim to study issues related to the folding process, such as the formation of secondary and tertiary structure, and the dependence of the folding pathway on the initial denatured conformation. Our work uses probabilistic roadmap (PRM) motion planning techniques which have proven successful for problems involving high-dimensional configuration spaces. A strength of these methods is their efficiency in rapidly covering the planning space without becoming trapped in local minima. We have applied our PRM technique to several small proteins (~60 residues) and validated the pathways computed by comparing the secondary structure formation order on our paths to known hydrogen exchange experimental results. An advantage of the PRM framework over other simulation methods is that it enables one to easily and efficiently compute folding pathways from any denatured starting state to the (known) native fold. This aspect makes our approach ideal for studying global properties of the protein's potential landscape, most of which are difficult to simulate and study with other methods. For example, in the proteins we study, the folding pathways starting from different denatured states sometimes share common portions when they are close to the native fold, and moreover, the formation order of the secondary structure appears largely independent of the starting denatured conformation. Another feature of our technique is that the distribution of the sampled conformations is correlated with the formation of secondary structure and, in particular, appears to differentiate situations in which secondary structure clearly forms first and those in which the tertiary structure is obtained more directly. Overall, our results applying PRM techniques are very encouraging and indicate the promise of our approach for studying proteins for which experimental results are not available.


Subject(s)
Computational Biology , Protein Folding , Models, Molecular , Models, Statistical , Protein Conformation , Protein Structure, Secondary , Thermodynamics
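
The three PRM phases used throughout these papers (sampling, connection, path extraction) can be sketched in a toy setting. Here the "conformations" are random points in the unit square and edge weights are Euclidean distances; in the protein-folding work the nodes are conformations and the edge weights reflect energetic feasibility, so this is a generic illustration under stated assumptions, not the authors' implementation. The names `build_roadmap` and `shortest_path` are illustrative.

```python
import heapq
import math
import random


def build_roadmap(n_samples, k, rng, dim=2):
    """Sample random nodes and connect each to its k nearest neighbors."""
    nodes = [tuple(rng.random() for _ in range(dim)) for _ in range(n_samples)]
    edges = {i: {} for i in range(n_samples)}
    for i, point in enumerate(nodes):
        nearest = sorted((j for j in range(n_samples) if j != i),
                         key=lambda j: math.dist(point, nodes[j]))[:k]
        for j in nearest:
            weight = math.dist(point, nodes[j])
            edges[i][j] = weight  # store both directions: the roadmap is undirected
            edges[j][i] = weight
    return nodes, edges


def shortest_path(edges, start, goal):
    """Extract the lowest-weight path with Dijkstra's algorithm, or None if unreachable."""
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == goal:
            break
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in edges[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if goal not in dist:
        return None
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]
```

Once the roadmap is built, a path from any start node to the goal comes from a single graph query, which is the property the abstract highlights: pathways from many denatured states to the native fold share one roadmap.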