Results 1 - 10 of 10
1.
BMC Bioinformatics ; 24(1): 117, 2023 Mar 26.
Article in English | MEDLINE | ID: mdl-36967390

ABSTRACT

BACKGROUND: Biomedical researchers use alignments produced by BLAST (Basic Local Alignment Search Tool) to categorize their query sequences. Producing such alignments is an essential bioinformatics task that is well suited for the cloud. The cloud can perform many calculations quickly as well as store and access large volumes of data. Bioinformaticians can also use it to collaborate with other researchers, sharing their results, datasets and even their pipelines on a common platform. RESULTS: We present ElasticBLAST, a cloud native application to perform BLAST alignments in the cloud. ElasticBLAST can handle anywhere from a few to many thousands of queries and run the searches on thousands of virtual CPUs (if desired), deleting resources when it is done. It uses cloud native tools for orchestration and can request discounted instances, lowering cloud costs for users. It is supported on Amazon Web Services and Google Cloud Platform. It can search BLAST databases that are user provided or from the National Center for Biotechnology Information. CONCLUSION: We show that ElasticBLAST is a useful application that can efficiently perform BLAST searches for the user in the cloud, demonstrating that with two examples. At the same time, it hides much of the complexity of working in the cloud, lowering the threshold to move work to the cloud.


Subjects
Cloud Computing , Software , Computational Biology/methods , Databases, Factual , Costs and Cost Analysis
2.
bioRxiv ; 2023 Jan 04.
Article in English | MEDLINE | ID: mdl-36789435

ABSTRACT

Background: Biomedical researchers use alignments produced by BLAST (Basic Local Alignment Search Tool) to categorize their query sequences. Producing such alignments is an essential bioinformatics task that is well suited for the cloud. The cloud can perform many calculations quickly as well as store and access large volumes of data. Bioinformaticians can also use it to collaborate with other researchers, sharing their results, datasets and even their pipelines on a common platform. Results: We present ElasticBLAST, a cloud native application to perform BLAST alignments in the cloud. ElasticBLAST can handle anywhere from a few to many thousands of queries and run the searches on thousands of virtual CPUs (if desired), deleting resources when it is done. It uses cloud native tools for orchestration and can request discounted instances, lowering cloud costs for users. It is supported on Amazon Web Services and Google Cloud Platform. It can search BLAST databases that are user provided or from the National Center for Biotechnology Information. Conclusion: We show that ElasticBLAST is a useful application that can efficiently perform BLAST searches for the user in the cloud, demonstrating that with two examples. At the same time, it hides much of the complexity of working in the cloud, lowering the threshold to move work to the cloud.

3.
BMC Bioinformatics ; 20(1): 405, 2019 Jul 25.
Article in English | MEDLINE | ID: mdl-31345161

ABSTRACT

BACKGROUND: Next-generation sequencing technologies can produce tens of millions of reads, often paired-end, from transcripts or genomes. But few programs can align RNA on the genome and accurately discover introns, especially with long reads. We introduce Magic-BLAST, a new aligner based on ideas from the Magic pipeline. RESULTS: Magic-BLAST uses innovative techniques that include the optimization of a spliced alignment score and selective masking during seed selection. We evaluate the performance of Magic-BLAST to accurately map short or long sequences and its ability to discover introns on real RNA-seq data sets from PacBio, Roche and Illumina runs, and on six benchmarks, and compare it to other popular aligners. Additionally, we look at alignments of human idealized RefSeq mRNA sequences perfectly matching the genome. CONCLUSIONS: We show that Magic-BLAST is the best at intron discovery over a wide range of conditions and the best at mapping reads longer than 250 bases, from any platform. It is versatile and robust to high levels of mismatches or extreme base composition, and reasonably fast. It can align reads to a BLAST database or a FASTA file. It can accept a FASTQ file as input or automatically retrieve an accession from the SRA repository at the NCBI.


Subjects
RNA/genetics , Sequence Alignment , Sequence Analysis, RNA/methods , Software , Algorithms , Base Sequence , Databases, Nucleic Acid , Humans , Introns/genetics , ROC Curve , Time Factors
4.
Nucleic Acids Res ; 41(Web Server issue): W29-33, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23609542

ABSTRACT

The Basic Local Alignment Search Tool (BLAST) website at the National Center for Biotechnology Information (NCBI) is an important resource for searching and aligning sequences. A new BLAST report allows faster loading of alignments, adds navigation aids, allows easy downloading of subject sequences and reports and has improved usability. Here, we describe these improvements to the BLAST report, discuss design decisions, describe other improvements to the search page and database documentation and outline plans for future development. The NCBI BLAST URL is http://blast.ncbi.nlm.nih.gov.


Subjects
Sequence Alignment/methods , Software , Animals , Genomics , Internet , L-Gulonolactone Oxidase/genetics , Rats
5.
Biol Direct ; 7: 12, 2012 Apr 17.
Article in English | MEDLINE | ID: mdl-22510480

ABSTRACT

BACKGROUND: BLAST is a commonly-used software package for comparing a query sequence to a database of known sequences; in this study, we focus on protein sequences. Position-specific-iterated BLAST (PSI-BLAST) iteratively searches a protein sequence database, using the matches in round i to construct a position-specific score matrix (PSSM) for searching the database in round i + 1. Biegert and Söding developed Context-sensitive BLAST (CS-BLAST), which combines information from searching the sequence database with information derived from a library of short protein profiles to achieve better homology detection than PSI-BLAST, which builds its PSSMs from scratch. RESULTS: We describe a new method, called domain enhanced lookup time accelerated BLAST (DELTA-BLAST), which searches a database of pre-constructed PSSMs before searching a protein-sequence database, to yield better homology detection. For its PSSMs, DELTA-BLAST employs a subset of NCBI's Conserved Domain Database (CDD). On a test set derived from ASTRAL, with one round of searching, DELTA-BLAST achieves a ROC5000 of 0.270 vs. 0.116 for CS-BLAST. The performance advantage diminishes in iterated searches, but DELTA-BLAST continues to achieve better ROC scores than CS-BLAST. CONCLUSIONS: DELTA-BLAST is a useful program for the detection of remote protein homologs. It is available under the "Protein BLAST" link at http://blast.ncbi.nlm.nih.gov.
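
The position-specific score matrix (PSSM) idea described in this abstract can be illustrated with a minimal sketch. It assumes a tiny ungapped alignment, a uniform background frequency over the 20 amino acids, and a fixed pseudocount. The names `build_pssm` and `score` are hypothetical, and this is not the procedure used by PSI-BLAST or DELTA-BLAST, which rely on sequence weighting and data-dependent pseudocounts.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_pssm(alignment, pseudocount=1.0):
    """Return one {residue: log-odds score} dict per alignment column.

    Assumes ungapped, equal-length sequences and a uniform background
    frequency (1/20); both are simplifications made for illustration.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for col in range(length):
        # Start every residue at the pseudocount so unseen residues
        # receive a finite (negative) score instead of -infinity.
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for seq in alignment:
            counts[seq[col]] += 1.0
        total = sum(counts.values())
        pssm.append({aa: math.log2((counts[aa] / total) / background)
                     for aa in AMINO_ACIDS})
    return pssm

def score(pssm, segment):
    """Sum the position-specific scores of a segment as long as the PSSM."""
    return sum(column[aa] for column, aa in zip(pssm, segment))

# Hypothetical matches from one search round.
hits = ["MKVL", "MKIL", "MRVL"]
pssm = build_pssm(hits)
```

In an iterated search, the matrix built from the matches of round i would be used to score database segments in round i + 1; here `score(pssm, "MKVL")` is higher than `score(pssm, "GGGG")` because the first segment agrees with the aligned columns.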


Subjects
Databases, Protein , Protein Structure, Tertiary , Search Engine/methods , Software , Algorithms , Computational Biology/methods , Internet , ROC Curve , Reproducibility of Results , Sensitivity and Specificity , Sequence Alignment/methods , Sequence Analysis, Protein/methods , Sequence Homology, Amino Acid , Time Factors
6.
J Am Soc Nephrol ; 20(9): 2065-74, 2009 Sep.
Article in English | MEDLINE | ID: mdl-19643930

ABSTRACT

One third of patients with type 1 diabetes and microalbuminuria experience an early, progressive decline in renal function that leads to advanced stages of chronic kidney disease and ESRD. We hypothesized that the urinary proteome may distinguish between stable renal function and early renal function decline among patients with type 1 diabetes and microalbuminuria. We followed patients with normal renal function and microalbuminuria for 10 to 12 yr and classified them into case patients (n = 21) with progressive early renal function decline and control subjects (n = 40) with stable renal function. Using liquid chromatography matrix-assisted laser desorption/ionization time-of-flight mass spectrometry, we identified three peptides that decreased in the urine of patients with early renal function decline [fragments of alpha1(IV) and alpha1(V) collagens and tenascin-X] and three peptides that increased (fragments of inositol pentakisphosphate 2-kinase, zona occludens 3, and FAT tumor suppressor 2). In renal biopsies from patients with early nephropathy from type 1 diabetes, we observed increased expression of inositol pentakisphosphate 2-kinase, which was present in granule-like cytoplasmic structures, and zona occludens 3. These results indicate that urinary peptide fragments reflect changes in expression of intact protein in the kidney, suggesting new potential mediators of diabetic nephropathy and candidate biomarkers for progressive renal function decline.


Subjects
Albuminuria/urine , Diabetes Mellitus, Type 1/urine , Diabetic Nephropathies/urine , Peptides/urine , Adult , Albuminuria/pathology , Albuminuria/physiopathology , Biomarkers/urine , Biopsy , Cadherins/urine , Carrier Proteins/urine , Diabetes Mellitus, Type 1/physiopathology , Diabetic Nephropathies/pathology , Diabetic Nephropathies/physiopathology , Disease Progression , Humans , Kidney/metabolism , Kidney/pathology , Membrane Proteins/urine , Phosphotransferases (Alcohol Group Acceptor)/urine , Poly(A)-Binding Proteins/metabolism , Predictive Value of Tests , Signal Transduction/physiology , T-Cell Intracellular Antigen-1 , Young Adult , Zonula Occludens Proteins
7.
Bioinformation ; 1(10): 396-405, 2007 Apr 10.
Article in English | MEDLINE | ID: mdl-17597929

ABSTRACT

In this paper we propose a data-based algorithm to marry existing biological knowledge (e.g., functional annotations of genes) with experimental data (gene expression profiles) in creating an overall dissimilarity that can be used with any clustering algorithm that uses a general dissimilarity matrix. We explore this idea with two publicly available gene expression data sets and functional annotations, where the results are compared with the clustering results that use only the experimental data. Although more elaborate evaluations might be called for, the present paper makes a strong case for utilizing existing biological information in the clustering process. AVAILABILITY: Supplement is available at www.somnathdatta.org/Supp/Bioinformation/appendix.pdf.
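
One simple way to form an overall dissimilarity from both sources is a weighted average of an annotation-based distance and an expression-based distance. The sketch below is an assumption made for illustration and is not necessarily the authors' algorithm: the Jaccard distance on annotation sets, the Euclidean distance scaled to [0, 1], the `weight` parameter, and all function names are hypothetical choices.

```python
import math

def expression_distance(x, y):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def annotation_distance(terms_a, terms_b):
    """Jaccard distance between two sets of annotation terms."""
    union = terms_a | terms_b
    if not union:
        return 0.0
    return 1.0 - len(terms_a & terms_b) / len(union)

def combined_dissimilarity(profiles, annotations, weight=0.5):
    """Return (gene order, matrix) blending both distances.

    weight = 1.0 uses annotations only; weight = 0.0 uses expression only.
    """
    genes = sorted(profiles)
    n = len(genes)
    expr = [[expression_distance(profiles[g], profiles[h]) for h in genes]
            for g in genes]
    # Scale expression distances to [0, 1] so both terms are comparable.
    max_expr = max(max(row) for row in expr) or 1.0
    matrix = [[0.0] * n for _ in range(n)]
    for i, g in enumerate(genes):
        for j, h in enumerate(genes):
            d_expr = expr[i][j] / max_expr
            d_ann = annotation_distance(annotations[g], annotations[h])
            matrix[i][j] = weight * d_ann + (1.0 - weight) * d_expr
    return genes, matrix

# Hypothetical toy data: g1 and g2 are co-expressed, g1 and g3 share a term.
profiles = {"g1": [1.0, 2.0, 3.0], "g2": [1.1, 2.1, 2.9], "g3": [3.0, 1.0, 0.5]}
annotations = {"g1": {"kinase"}, "g2": {"transport"}, "g3": {"kinase"}}
genes, D = combined_dissimilarity(profiles, annotations, weight=0.5)
```

The resulting matrix `D` is symmetric with a zero diagonal, so it can be passed to any clustering routine that accepts a general dissimilarity matrix. In this toy example the shared annotation pulls g1 and g3 closer together than g1 and g2, even though g1 and g2 have nearly identical expression profiles.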

8.
BMC Bioinformatics ; 7 Suppl 2: S8, 2006 Sep 06.
Article in English | MEDLINE | ID: mdl-17118151

ABSTRACT

BACKGROUND: Independent Component Analysis (ICA) proves to be useful in the analysis of neural activity, as it allows for identification of distinct sources of activity. Applied to measurements registered in a controlled setting and under exposure to an external stimulus, it can facilitate analysis of the impact of the stimulus on those sources. The link between the stimulus and a given source can be verified by a classifier that is able to "predict" the condition a given signal was registered under, solely based on the components. However, the ICA's assumption about statistical independence of sources is often unrealistic and turns out to be insufficient to build an accurate classifier. Therefore, we propose to utilize a novel method, based on hybridization of ICA, multi-objective evolutionary algorithms (MOEA), and rough sets (RS), that attempts to improve the effectiveness of signal decomposition techniques by providing them with "classification-awareness." RESULTS: The preliminary results described here are very promising and further investigation of other MOEAs and/or RS-based classification accuracy measures should be pursued. Even a quick visual analysis of those results can provide an interesting insight into the problem of neural activity analysis. CONCLUSION: We present a methodology of classificatory decomposition of signals. One of the main advantages of our approach is the fact that rather than solely relying on often unrealistic assumptions about statistical independence of sources, components are generated in the light of an underlying classification problem itself.


Subjects
Brain/physiology , Computational Biology/methods , Evoked Potentials , Algorithms , Animals , Rats
9.
Conf Proc IEEE Eng Med Biol Soc ; 2006: 5515-8, 2006.
Article in English | MEDLINE | ID: mdl-17947147

ABSTRACT

Cluster analysis has become a standard part of gene expression analysis. In this paper, we propose a novel semi-supervised approach that offers the same flexibility as that of a hierarchical clustering. Yet it utilizes, along with the experimental gene expression data, common biological information about different genes that is being compiled at various public, Web-accessible databases. We argue that such an approach is inherently superior to the standard unsupervised approach of grouping genes based on expression data alone. It is shown that our biologically supervised methods produce better clustering results than the corresponding unsupervised methods as judged by the distance from the model temporal profiles. R-codes of the clustering algorithm are available from the authors upon request.


Subjects
Cluster Analysis , Computational Biology/methods , Gene Expression Profiling/instrumentation , Gene Expression Profiling/methods , Algorithms , Artificial Intelligence , Computer Simulation , Gene Expression , Gene Expression Regulation , Humans , Models, Statistical , Models, Theoretical , Oligonucleotide Array Sequence Analysis , Pattern Recognition, Automated , Sequence Analysis, DNA
10.
Conf Proc IEEE Eng Med Biol Soc ; 2006: 5798-801, 2006.
Article in English | MEDLINE | ID: mdl-17947169

ABSTRACT

Interpretation and classification of mass spectra is usually performed using a list of peaks as their mathematical representation. This fact makes peak detection a bottleneck of mass spectra analysis, since the quality and biological relevance of any results strongly depend on the accuracy of the peak detection process. Many algorithms utilize intensity to differentiate between peaks and noise and thus bias the detection process toward the high-abundance peaks. However, important information may also be contained in the lower-intensity peaks that are more difficult to discover. We present an algorithm specifically designed for detection of low-abundance peaks.
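
The bias described in this abstract can be seen in a small sketch that scores each local maximum against its own neighbourhood instead of against a global intensity cutoff. This is not the authors' algorithm, which the abstract does not describe; the local median baseline, the median absolute deviation as a noise estimate, and the names `local_snr_peaks`, `window`, and `min_snr` are all assumptions made for illustration.

```python
from statistics import median

def local_snr_peaks(intensities, window=5, min_snr=3.0):
    """Return indices of local maxima whose height above the local
    baseline is at least min_snr times the local noise estimate."""
    peaks = []
    n = len(intensities)
    for i in range(1, n - 1):
        y = intensities[i]
        # Keep only local maxima.
        if not (y > intensities[i - 1] and y >= intensities[i + 1]):
            continue
        lo, hi = max(0, i - window), min(n, i + window + 1)
        neighbours = intensities[lo:i] + intensities[i + 1:hi]
        baseline = median(neighbours)
        # Median absolute deviation of the neighbours as a noise estimate.
        noise = median([abs(v - baseline) for v in neighbours]) or 1e-9
        if (y - baseline) / noise >= min_snr:
            peaks.append(i)
    return peaks

# Hypothetical spectrum: a small peak (index 4) in a low-intensity region
# and a large peak (index 14) in a high-intensity region.
spectrum = [1.0, 1.2, 0.9, 1.1, 4.0, 1.0, 1.1, 0.9, 1.2, 1.0,
            100.0, 102.0, 98.0, 101.0, 160.0, 99.0, 101.0, 100.0, 102.0, 98.0]
found = local_snr_peaks(spectrum, window=3)  # -> [4, 14]
```

A single global cutoff, for example an intensity of 50, would flag every point in the high-intensity region and miss the small peak at index 4, whereas the local score finds both peaks.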


Subjects
Mass Spectrometry/methods , Pattern Recognition, Automated , Signal Processing, Computer-Assisted , Algorithms , Calibration , Humans , Models, Statistical , Models, Theoretical , Peptide Mapping , Protein Array Analysis , Proteins , Proteomics , Reproducibility of Results , Sensitivity and Specificity , Software