Results 1 - 9 of 9
1.
Methods Mol Biol ; 1851: 183-214, 2019.
Article in English | MEDLINE | ID: mdl-30298398

ABSTRACT

For highly divergent sequences, there is often insufficient information to reliably construct alignments and phylogenetic trees. Since protein structure may be strongly conserved despite large divergences in sequence, structural information can be used to help identify homology in such cases. While there exist well-studied models of sequence evolution, structurally informed alignment methods have typically made use of geometric measures of deviation that do not take into account the underlying mutational processes. In order to integrate structural information into sequence-based evolutionary models, we recently developed a stochastic model of structural evolution on a phylogenetic tree and implemented this as the StructAlign plugin for the StatAlign statistical alignment package. In this chapter, we will outline the types of analyses that can be carried out using StructAlign, illustrating how the inclusion of structural information can be used to inform joint estimation of alignments and trees. StructAlign can also be used to infer branch-specific rates of structural evolution, and analysis of an example globin dataset highlights strong variation in the inferred rate across the tree. While structure is more highly conserved within clades, the rate of structural divergence as a function of sequence variation is larger between functionally divergent proteins. Allowing for the rate of structural divergence to vary over the tree results in an improved fit to the empirically observed pairwise RMSD values.
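The pairwise RMSD values mentioned at the end of this abstract can be illustrated with a minimal sketch. It is not StructAlign's stochastic model, only the standard root-mean-square deviation between aligned coordinates. The function names `rmsd` and `pairwise_rmsd` are hypothetical, and the sketch assumes the structures are already superposed and that corresponding positions share an index.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of 3D points.

    Assumption: the two structures are already superposed, and point i of
    coords_a is aligned to point i of coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def pairwise_rmsd(structures):
    """Return {(name_a, name_b): rmsd} for every unordered pair of structures.

    `structures` maps a name to a list of (x, y, z) tuples.
    """
    names = sorted(structures)
    return {
        (a, b): rmsd(structures[a], structures[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }
```

A table of this kind is the sort of empirical quantity against which a model's predicted structural divergence could be compared.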


Subject(s)
Proteins/chemistry , Sequence Alignment/methods , Algorithms , Bayes Theorem , Computational Biology/methods , Evolution, Molecular , Models, Statistical , Phylogeny , Proteins/classification , Software
2.
Methods Mol Biol ; 1851: E1, 2019.
Article in English | MEDLINE | ID: mdl-30578527

ABSTRACT

The published version of this book included errors in code listings in Chapter 10. These code listings have been corrected and text has been updated.

3.
Nat Methods ; 13(3): 241-4, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26780092

ABSTRACT

The transcriptional state of a cell reflects a variety of biological factors, from cell-type-specific features to transient processes such as the cell cycle, all of which may be of interest. However, identifying such aspects from noisy single-cell RNA-seq data remains challenging. We developed pathway and gene set overdispersion analysis (PAGODA) to resolve multiple, potentially overlapping aspects of transcriptional heterogeneity by testing gene sets for coordinated variability among measured cells.


Subject(s)
Gene Expression Profiling/methods , Proteome/metabolism , Sequence Analysis, RNA/methods , Signal Transduction/physiology , Transcription, Genetic/physiology , Transcriptome/physiology , Animals , Cells, Cultured , Computer Simulation , Mice , Models, Biological , Models, Statistical , Neurons/physiology , Proteome/chemistry
4.
PLoS One ; 10(8): e0129668, 2015.
Article in English | MEDLINE | ID: mdl-26247465

ABSTRACT

BACKGROUND: Recent genomic information has revealed that neuroglobin and cytoglobin are the two principal lineages of vertebrate hemoglobins, with the latter encompassing the familiar myoglobin and α-globin/β-globin tetramer hemoglobin, and several minor groups. In contrast, very little is known about hemoglobins in echinoderms, a phylum of exclusively marine organisms closely related to vertebrates, beyond the presence of coelomic hemoglobins in sea cucumbers and brittle stars. We identified about 50 hemoglobins in sea urchin, starfish and sea cucumber genomes and transcriptomes, and used Bayesian inference to carry out a molecular phylogenetic analysis of their relationship to vertebrate sequences, specifically, to assess the hypothesis that the neuroglobin and cytoglobin lineages are also present in echinoderms. RESULTS: The genome of the sea urchin Strongylocentrotus purpuratus encodes several hemoglobins, including a unique chimeric 14-domain globin, 2 androglobin isoforms and a unique single androglobin domain protein. Other strongylocentrotid genomes appear to have similar repertoires of globin genes. We carried out molecular phylogenetic analyses of 52 hemoglobins identified in sea urchin, brittle star and sea cucumber genomes and transcriptomes, using different multiple sequence alignment methods coupled with Bayesian and maximum likelihood approaches. The results demonstrate that there are two major globin lineages in echinoderms, which are related to the vertebrate neuroglobin and cytoglobin lineages. Furthermore, the brittle star and sea cucumber coelomic hemoglobins appear to have evolved independently from the cytoglobin lineage, similar to the evolution of erythroid oxygen binding globins in cyclostomes and vertebrates.
CONCLUSION: The presence of echinoderm globins related to the vertebrate neuroglobin and cytoglobin lineages suggests that the split between neuroglobins and cytoglobins occurred in the deuterostome ancestor shared by echinoderms and vertebrates.


Subject(s)
Echinodermata/genetics , Globins/genetics , Nerve Tissue Proteins/genetics , Animals , Bayes Theorem , Cytoglobin , Echinodermata/chemistry , Globins/chemistry , Likelihood Functions , Models, Molecular , Nerve Tissue Proteins/chemistry , Neuroglobin , Phylogeny , Protein Conformation
5.
BMC Bioinformatics ; 16: 108, 2015 Apr 01.
Article in English | MEDLINE | ID: mdl-25888064

ABSTRACT

BACKGROUND: A standard procedure in many areas of bioinformatics is to use a single multiple sequence alignment (MSA) as the basis for various types of analysis. However, downstream results may be highly sensitive to the alignment used, and neglecting the uncertainty in the alignment can lead to significant bias in the resulting inference. In recent years, a number of approaches have been developed for probabilistic sampling of alignments, rather than simply generating a single optimum. However, this type of probabilistic information is currently not widely used in the context of downstream inference, since most existing algorithms are set up to make use of a single alignment. RESULTS: In this work we present a framework for representing a set of sampled alignments as a directed acyclic graph (DAG) whose nodes are alignment columns; each path through this DAG then represents a valid alignment. Since the probabilities of individual columns can be estimated from empirical frequencies, this approach enables sample-based estimation of posterior alignment probabilities. Moreover, due to conditional independencies between columns, the graph structure encodes a much larger set of alignments than the original set of sampled MSAs, such that the effective sample size is greatly increased. CONCLUSIONS: The alignment DAG provides a natural way to represent a distribution in the space of MSAs, and allows for existing algorithms to be efficiently scaled up to operate on large sets of alignments. As an example, we show how this can be used to compute marginal probabilities for tree topologies, averaging over a very large number of MSAs. This framework can also be used to generate a statistically meaningful summary alignment; example applications show that this summary alignment is consistently more accurate than the majority of the alignment samples, leading to improvements in downstream tree inference. 
Implementations of the methods described in this article are available at http://statalign.github.io/WeaveAlign.
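The alignment DAG described in this abstract can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the WeaveAlign implementation: the column encoding, the `START`/`END` sentinels and the function names `columns`, `build_dag` and `count_paths` are all hypothetical. Each column is keyed by how many residues every sequence has consumed so far, so identical columns from different sampled alignments collapse into one node, and column probabilities are estimated as empirical frequencies across the samples.

```python
from collections import Counter

START, END = "START", "END"


def columns(alignment):
    """Encode the columns of one alignment (a list of equal-length gapped strings).

    Each column becomes a tuple holding one (residues_consumed, is_residue)
    pair per sequence. Assumption: the alignment has no all-gap columns, so
    every column in a path is distinct and the resulting graph is acyclic.
    """
    consumed = [0] * len(alignment)
    encoded = []
    for chars in zip(*alignment):
        col = []
        for row, ch in enumerate(chars):
            is_residue = ch != "-"
            consumed[row] += is_residue
            col.append((consumed[row], is_residue))
        encoded.append(tuple(col))
    return encoded


def build_dag(samples):
    """Return (column frequencies, successor sets) for a list of sampled alignments."""
    node_counts = Counter()
    edges = {}
    for alignment in samples:
        path = [START] + columns(alignment) + [END]
        for col in path[1:-1]:
            node_counts[col] += 1
        for src, dst in zip(path, path[1:]):
            edges.setdefault(src, set()).add(dst)
    probs = {col: n / len(samples) for col, n in node_counts.items()}
    return probs, edges


def count_paths(edges, node=START, memo=None):
    """Count START-to-END paths, i.e. the alignments the DAG encodes."""
    memo = {} if memo is None else memo
    if node == END:
        return 1
    if node not in memo:
        memo[node] = sum(count_paths(edges, nxt, memo) for nxt in edges.get(node, ()))
    return memo[node]


# Two sampled alignments of the same pair of sequences that share a middle column.
samples = [
    ["AAGCC", "A-GC-"],
    ["AAGCC", "-AG-C"],
]
probs, edges = build_dag(samples)
n_encoded = count_paths(edges)  # left and right variants recombine through the shared column
```

In this toy case the two samples differ on both sides of a shared column, so the graph contains more start-to-end paths than there were samples, which is the effect the abstract describes as an increased effective sample size.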


Subject(s)
Algorithms , Computational Biology/methods , Computer Graphics , Models, Statistical , Sequence Alignment/methods , Software , Computer Simulation , Humans , Uncertainty
6.
Mol Biol Evol ; 31(9): 2251-66, 2014 Sep.
Article in English | MEDLINE | ID: mdl-24899668

ABSTRACT

For sequences that are highly divergent, there is often insufficient information to infer accurate alignments, and phylogenetic uncertainty may be high. One way to address this issue is to make use of protein structural information, since structures generally diverge more slowly than sequences. In this work, we extend a recently developed stochastic model of pairwise structural evolution to multiple structures on a tree, analytically integrating over ancestral structures to permit efficient likelihood computations under the resulting joint sequence-structure model. We observe that the inclusion of structural information significantly reduces alignment and topology uncertainty, and reduces the number of topology and alignment errors in cases where the true trees and alignments are known. In some cases, the inclusion of structure results in changes to the consensus topology, indicating that structure may contain additional information beyond that which can be obtained from sequences. We use the model to investigate the order of divergence of cytoglobins, myoglobins, and hemoglobins and observe a stabilization of phylogenetic inference: although a sequence-based inference assigns significant posterior probability to several different topologies, the structural model strongly favors one of these over the others and is more robust to the choice of data set.


Subject(s)
Bayes Theorem , Computational Biology/methods , Globins/chemistry , Hemoglobins/chemistry , Myoglobin/chemistry , Animals , Cytoglobin , Globins/genetics , Hemoglobins/genetics , Humans , Markov Chains , Models, Molecular , Mutation , Myoglobin/genetics , Phylogeny , Protein Conformation , Sequence Alignment , Sequence Analysis, Protein
7.
Article in English | MEDLINE | ID: mdl-23122395

ABSTRACT

Interest in early detection strategies for lysosomal storage disorders (LSDs) in newborns and high-risk populations has increased in recent years due to the availability of novel treatment strategies coupled with the development of diagnostic techniques. We report the development of a short-incubation mass spectrometry-based protocol that allows the detection of Gaucher, Niemann-Pick A/B, Pompe, Fabry and mucopolysaccharidosis type I disease within 4 h, including sample preparation from dried blood spots. Optimized sample handling without the need for time-consuming offline preparations, such as liquid-liquid and solid-phase extraction, allows the simultaneous quantification of five lysosomal enzyme activities using a cassette of substrates and deuterated internal standards. Incubation times of 3 h resulted in intra-day CV% values ranging from 4% to 11% across all five enzyme activities. In a first clinical evaluation, we tested 825 unaffected newborns and 16 patients with LSDs using a multiplexed, turbulent flow chromatography-ultra high performance liquid chromatography-tandem mass spectrometer assay. All affected patients were identified accurately and could be differentiated from non-affected newborns. In comparison to previously published two-day assays, which included an overnight incubation, this protocol enabled the detection of lysosomal enzyme activities from sample to first result within half a day.
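The intra-day CV% figures quoted in this abstract are coefficients of variation. A minimal sketch of that calculation follows. The function name `cv_percent` is hypothetical, and the use of the sample standard deviation is an assumption, since the abstract does not state which estimator was used.

```python
import statistics


def cv_percent(measurements):
    """Coefficient of variation in percent: 100 * standard deviation / mean.

    Assumption: the sample (n - 1) standard deviation is used. Requires at
    least two replicate measurements and a non-zero mean.
    """
    mean = statistics.mean(measurements)
    if mean == 0:
        raise ValueError("CV% is undefined for a mean of zero")
    return 100.0 * statistics.stdev(measurements) / mean
```

Applied to replicate activity measurements of one enzyme taken on the same day, this gives a figure of the kind the abstract reports per enzyme.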


Subject(s)
Lysosomal Storage Diseases/diagnosis , Neonatal Screening/methods , Tandem Mass Spectrometry/methods , Chromatography, High Pressure Liquid/methods , Dried Blood Spot Testing/methods , Drug Stability , Enzyme Assays/methods , High-Throughput Screening Assays/methods , Humans , Infant, Newborn , Liquid-Liquid Extraction , Lysosomal Storage Diseases/blood , Lysosomal Storage Diseases/enzymology , Reproducibility of Results
8.
Clin Chem ; 57(9): 1286-94, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21771947

ABSTRACT

BACKGROUND: Interest in lysosomal storage disorders, a collection of more than 40 inherited metabolic disorders, has increased because of new therapy options such as enzyme replacement, stem cell transplantation, and substrate reduction therapy. We developed a high-throughput protocol that simplifies analytical challenges such as complex sample preparation and potential interference from excess residual substrate associated with previously reported assays. METHODS: After overnight incubation (16-20 h) of dried blood spots with a cassette of substrates and deuterated internal standards, we used a TLX-2 system to quantify 6 lysosomal enzyme activities for Fabry, Gaucher, Niemann-Pick A/B, Pompe, Krabbe, and mucopolysaccharidosis I disease. This multiplexed, multidimensional ultra-HPLC-tandem mass spectrometry assay included Cyclone P Turbo Flow and Hypersil Gold C8 columns. The method did not require offline sample preparation such as liquid-liquid and solid-phase extraction, or hazardous reagents such as ethyl acetate. RESULTS: Obviating the offline sample preparation steps led to substantial savings in analytical time (approximately 70%) and reagent costs (approximately 50%). In a pilot study, lysosomal enzyme activities of 8586 newborns were measured, including 51 positive controls, and the results demonstrated 100% diagnostic sensitivity and high specificity. The results for Krabbe disease were validated with parallel measurements by the New York State Screening Laboratory. CONCLUSIONS: Turboflow online sample cleanup and the use of an additional analytical column enabled the implementation of lysosomal storage disorder testing in a nationwide screening program while keeping the total analysis time to <2 min per sample.


Subject(s)
Clinical Protocols , Lysosomal Storage Diseases/diagnosis , Neonatal Screening/methods , Chromatography, High Pressure Liquid/methods , Fabry Disease/diagnosis , Gaucher Disease/diagnosis , Glycogen Storage Disease Type II/diagnosis , Humans , Infant, Newborn , Leukodystrophy, Globoid Cell/diagnosis , Mass Spectrometry , Mucopolysaccharidosis I/diagnosis , Niemann-Pick Disease, Type A/diagnosis , Niemann-Pick Disease, Type B/diagnosis , Pilot Projects , Sensitivity and Specificity
9.
Bioorg Med Chem Lett ; 17(6): 1793-8, 2007 Mar 15.
Article in English | MEDLINE | ID: mdl-17239587

ABSTRACT

KDR kinase inhibition is considered to play an important role in regulating angiogenesis, which is vital for the survival and proliferation of tumor cells. Recently we disclosed a structure-based kinase inhibitor design strategy which led to the identification of a new class of VEGFR-2/KDR kinase inhibitors bearing heterocyclic substituted pyrazolones as the core template. Instability in a rat S9 preparation and poor iv PK profiles for most of these inhibitors necessitated exploration of new pyrazolones to identify new analogs with improved metabolic stability. Optimization of the heterocyclic moiety led to the identification of the thiadiazole series of pyrazolones (D) as potent VEGFR-2/KDR kinase inhibitors. SAR modifications, kinase selectivity profiling, and structural elements for improved PK properties were explored. Oral bioavailability up to 29% was achieved in the rat. Modeling results based on the Glide XP docking approach supported our postulation regarding the interaction of the lactam segment of the pyrazolones with the hinge region of the KDR kinase.


Subject(s)
Enzyme Inhibitors/chemical synthesis , Enzyme Inhibitors/pharmacology , Pyrazolones/chemical synthesis , Pyrazolones/pharmacology , Vascular Endothelial Growth Factor Receptor-2/antagonists & inhibitors , Animals , Autoradiography , Biological Availability , Blotting, Western , Drug Design , Enzyme Inhibitors/pharmacokinetics , Humans , In Vitro Techniques , Models, Molecular , Protease Inhibitors/chemical synthesis , Protease Inhibitors/pharmacology , Pyrazolones/pharmacokinetics , Rats , Recombinant Proteins , Spectrometry, Fluorescence , Structure-Activity Relationship