Search | BVS Regional Portal (test)

Computational Biology in the 21st Century: Scaling with Compressive Algorithms.

Berger, Bonnie; Daniels, Noah M; Yu, Y William.

Commun ACM ; 59(8): 72-80, 2016 Aug.

Article in English | MEDLINE | ID: mdl-28966343

Entropy-scaling search of massive biological data.

Yu, Y William; Daniels, Noah M; Danko, David Christian; Berger, Bonnie.

Cell Syst ; 1(2): 130-140, 2015 Aug 26.

Article in English | MEDLINE | ID: mdl-26436140

ABSTRACT

Many data sets exhibit well-defined structure that can be exploited to design faster search tools, but it is not always clear when such acceleration is possible. Here we introduce a framework for similarity search based on characterizing a data set's entropy and fractal dimension. We prove that searching scales in time with metric entropy (number of covering hyperspheres), if the fractal dimension of the data set is low, and scales in space with the sum of metric entropy and information-theoretic entropy (randomness of the data). Using these ideas, we present accelerated versions of standard tools, with no loss in specificity and little loss in sensitivity, for use in three domains: high-throughput drug screening (Ammolite, 150x speedup), metagenomics (MICA, 3.5x speedup of DIAMOND (3700x BLASTX)), and protein structure search (esFragBag, 10x speedup of FragBag). Our framework can be used to achieve 'compressive omics,' and the general theory can be readily applied to data science problems outside of biology. Source code: http://gems.csail.mit.edu.
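The idea of covering a data set with hyperspheres and searching the cover first can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`build_cover`, `search`), the greedy covering strategy, and the use of Euclidean distance are all assumptions made for illustration. The sketch relies only on the triangle inequality, so it assumes a true metric.

```python
from math import dist  # Euclidean distance, Python 3.8+


def build_cover(points, r_c):
    """Greedily cover the points with balls of radius r_c (hypothetical helper).

    Returns a list of (center, members) pairs; every point lies within
    r_c of the center of the cluster it is assigned to.
    """
    clusters = []
    for p in points:
        for center, members in clusters:
            if dist(p, center) <= r_c:
                members.append(p)
                break
        else:
            # No existing ball contains p, so p becomes a new center.
            clusters.append((p, [p]))
    return clusters


def search(clusters, query, r, r_c):
    """Return all points within distance r of the query.

    Coarse step: by the triangle inequality, a point within r of the query
    can only belong to a cluster whose center is within r + r_c of the query.
    Fine step: scan only the members of those clusters.
    """
    hits = []
    for center, members in clusters:
        if dist(query, center) <= r + r_c:
            hits.extend(m for m in members if dist(query, m) <= r)
    return hits


points = [(0.0, 0.0), (0.5, 0.0), (10.0, 10.0), (10.4, 10.0), (5.0, 5.0)]
clusters = build_cover(points, r_c=1.0)
print(search(clusters, (0.2, 0.0), r=0.4, r_c=1.0))
```

In this sketch the coarse step costs one distance computation per cluster center, which is where the dependence on the number of covering hyperspheres enters; the fine step touches only the clusters that survive the coarse filter.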

Quality score compression improves genotyping accuracy.

Yu, Y William; Yorukoglu, Deniz; Peng, Jian; Berger, Bonnie.

Nat Biotechnol ; 33(3): 240-3, 2015 Mar.

Article in English | MEDLINE | ID: mdl-25748910

Subjects

Data Compression; Genotyping Techniques/standards; Algorithms; High-Throughput Nucleotide Sequencing; Humans; ROC Curve

Traversing the k-mer Landscape of NGS Read Datasets for Quality Score Sparsification.

Yu, Y William; Yorukoglu, Deniz; Berger, Bonnie.

Res Comput Mol Biol ; 8394: 385-399, 2014 Apr.

Article in English | MEDLINE | ID: mdl-28825060

ABSTRACT

It is becoming increasingly impractical to indefinitely store raw sequencing data for later processing in an uncompressed state. In this paper, we describe a scalable compressive framework, Read-Quality-Sparsifier (RQS), which substantially outperforms the compression ratio and speed of other de novo quality score compression methods while maintaining SNP-calling accuracy. Surprisingly, RQS also improves the SNP-calling accuracy on a gold-standard, real-life sequencing dataset (NA12878) using a k-mer density profile constructed from 77 other individuals from the 1000 Genomes Project. This improvement in downstream accuracy emerges from the observation that quality score values within NGS datasets are inherently encoded in the k-mer landscape of the genomic sequences. To our knowledge, RQS is the first scalable sequence-based quality compression method that can efficiently compress quality scores of terabyte-sized and larger sequencing datasets. AVAILABILITY: An implementation of our method, RQS, is available for download at: http://rqs.csail.mit.edu/.
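The general notion of using a k-mer profile to decide which quality scores can be discarded might be sketched as follows. This is a deliberately simplified illustration and not the RQS algorithm itself: the function names, the exact-match rule, the `min_count` threshold, and the fill character are all assumptions. Positions covered by a k-mer that is frequent in a reference corpus have their quality value replaced by a constant, which makes the quality string far more compressible; other positions keep their original value.

```python
from collections import Counter


def frequent_kmers(reads, k, min_count):
    """Collect k-mers seen at least min_count times in a corpus of reads
    (a stand-in for a k-mer density profile; the threshold is an assumption)."""
    counts = Counter(
        read[i:i + k] for read in reads for i in range(len(read) - k + 1)
    )
    return {kmer for kmer, c in counts.items() if c >= min_count}


def sparsify_quality(seq, qual, trusted, k, fill="I"):
    """Replace the quality character at every position covered by a trusted
    k-mer with a constant fill character; keep the rest unchanged."""
    covered = [False] * len(seq)
    for i in range(len(seq) - k + 1):
        if seq[i:i + k] in trusted:
            for j in range(i, i + k):
                covered[j] = True
    return "".join(fill if c else q for c, q in zip(covered, qual))


corpus = ["ACGTACGT", "ACGTACGA", "ACGTTTTT"]
trusted = frequent_kmers(corpus, k=4, min_count=2)
print(sparsify_quality("ACGTGG", "5678!!", trusted, k=4))
```

Here the first four bases form a k-mer that is common in the corpus, so their quality values collapse to the fill character, while the last two bases, which are not covered by any trusted k-mer, retain their original scores.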

Identification of endogenous acyl amino acids based on a targeted lipidomics approach.

Tan, Bo; O'Dell, David K; Yu, Y William; Monn, M Francesca; Hughes, H Velocity; Burstein, Sumner; Walker, J Michael.

J Lipid Res ; 51(1): 112-9, 2010 Jan.

Article in English | MEDLINE | ID: mdl-19584404

ABSTRACT

Using a partially purified bovine brain extract, our lab identified three novel endogenous acyl amino acids in mammalian tissues. The presence of numerous amino acids in the body and their ability to form amides with several saturated and unsaturated fatty acids indicated the potential existence of a large number of heretofore unidentified acyl amino acids. Reports of several additional acyl amino acids that activate G-protein-coupled receptors (e.g., N-arachidonoyl glycine, N-arachidonoyl serine) and transient receptor potential channels (e.g., N-arachidonoyl dopamine, N-acyl taurines) suggested that some or many novel acyl amino acids could serve as signaling molecules. Here, we used a targeted lipidomics approach including specific enrichment steps, nano-LC/MS/MS, high-throughput screening of the datasets with a potent search algorithm based on fragment ion analysis, and quantification using the multiple reaction monitoring mode in Analyst software to measure the biological levels of acyl amino acids in rat brain. We successfully identified 50 novel endogenous acyl amino acids present at 0.2 to 69 pmol g⁻¹ wet rat brain.

Subjects

Amino Acids/analysis; Brain/metabolism; Chromatography, High Pressure Liquid/methods; Tandem Mass Spectrometry/methods; Animals; Cattle; Lipid Metabolism; Male; Peroxisome Proliferator-Activated Receptors/metabolism; Rats; Rats, Sprague-Dawley; Receptors, G-Protein-Coupled/agonists; Solid Phase Extraction; gamma-Aminobutyric Acid/metabolism

Assuntos

Aminoácidos/análise , Encéfalo/metabolismo , Cromatografia Líquida de Alta Pressão/métodos , Espectrometria de Massas em Tandem/métodos , Animais , Bovinos , Metabolismo dos Lipídeos , Masculino , Receptores Ativados por Proliferador de Peroxissomo/metabolismo , Ratos , Ratos Sprague-Dawley , Receptores Acoplados a Proteínas G/agonistas , Extração em Fase Sólida , Ácido gama-Aminobutírico/metabolismo

Targeted lipidomics approach for endogenous N-acyl amino acids in rat brain tissue.

Tan, Bo; Yu, Y William; Monn, M Francesca; Hughes, H Velocity; O'Dell, David K; Walker, J Michael.

J Chromatogr B Analyt Technol Biomed Life Sci ; 877(26): 2890-4, 2009 Sep 15.

Article in English | MEDLINE | ID: mdl-19168403

ABSTRACT

Great effort has been devoted to characterizing signaling lipids in the central nervous system. This has led to a search for novel strategies to characterize hitherto unknown lipid compositions. Here we developed two methods, one for identification and one for quantification, for N-acyl amino acids, a novel lipid family. The identification method consists of a series of purification steps followed by nano-LC/MS/MS and high-throughput screening of the datasets with a potent search algorithm based on fragment ion analysis. Good-quality MS/MS spectra can be obtained with 150 fmol of targeted lipids on column with our nano-LC/MS/MS. More than one thousand mass spectra generated using the information-dependent acquisition mode of Analyst QS software can be analyzed in 1 min using our home-built software. The quantification method used the multiple reaction monitoring mode in Analyst software to measure the endogenous levels of N-acyl amino acids in rat brain. Using these two methods, we were able to identify and quantify 11 previously reported N-acyl amino acids with endogenous levels ranging from 0.26 to 333 pmol g⁻¹ wet rat brain.

Subjects

Amino Acids/chemistry; Brain Chemistry; Chromatography, Liquid/methods; Lipids/chemistry; Tandem Mass Spectrometry/methods; Animals; Male; Rats; Rats, Sprague-Dawley; Software

Targeted lipidomics: discovery of new fatty acyl amides.

Tan, Bo; Bradshaw, Heather B; Rimmerman, Neta; Srinivasan, Harini; Yu, Y William; Krey, Jocelyn F; Monn, M Francesca; Chen, Jay Shih-Chieh; Hu, Sherry Shu-Jung; Pickens, Sarah R; Walker, J Michael.

AAPS J ; 8(3): E461-5, 2006 Jul 14.

Article in English | MEDLINE | ID: mdl-17025263

ABSTRACT

The discovery of endogenous fatty acyl amides such as N-arachidonoyl ethanolamide (anandamide), N-oleoyl ethanolamide (OEA), and N-arachidonoyl dopamine (NADA) as important signaling molecules in the central and peripheral nervous system has led us to pursue other unidentified signaling molecules. Until recently, technical challenges, particularly those associated with lipid purification and chemical analysis, have hindered the identification of low-abundance signaling lipids. Improvements in chromatography and mass spectrometry (MS), such as miniaturization of high-performance liquid chromatography components, hybridization of multistage mass spectrometers and time-of-flight technology, and the development of electrospray ionization (ESI) and of information-dependent acquisition, now permit rapid identification of novel, low-abundance signaling lipids.

Subjects

Amides/analysis; Fatty Acids/analysis; Amides/chemistry; Arachidonic Acids/analysis; Chromatography, High Pressure Liquid; Endocannabinoids; Lipid Metabolism; Lipids/chemistry; Polyunsaturated Alkamides; Signal Transduction/physiology
