Results 1 - 10 of 10
1.
IEEE/ACM Trans Comput Biol Bioinform ; 17(6): 2086-2097, 2020.
Article in English | MEDLINE | ID: mdl-31056513

ABSTRACT

In this paper, we consider the pairwise semiglobal sequence alignment problem with gaps, which is motivated by the re-sequencing problem of assembling short read sequences into a genome sequence with the help of a reference sequence. The problem has been studied before for a single gap and for a bounded number of gaps. For a single gap, a GPU-based algorithm has been proposed (Barton et al., 2015). In our work, we propose a GPU-based algorithm for the bounded-number-of-gaps case, called GPUGapsMis. We implement the algorithm and compare its performance with that of the CPU-based algorithm, called CPUGapsMis. The algorithm has two distinct stages: the alignment phase and the backtrack phase. We investigate several approaches to determine the most favorable for this problem: a hybrid model versus a wholly GPU-based model, and the alignment of a single text sequence versus multiple text sequences on the GPU at a time. We show that the alignment phase of the algorithm is a good candidate for parallelization, with a peak speedup of 11 times. We show that although the backtracking phase is sequential, it is more beneficial to perform it on the GPU than to return to the CPU and perform it there. When both phases are performed on the GPU, GPUGapsMis achieves a peak speedup of 10.4 times against CPUGapsMis. Our data-parallel GPU algorithm improves on the results of an existing GPU data-parallel implementation (Ojiaku, 2014).


Subjects
Genomics/methods; Image Processing, Computer-Assisted/methods; Sequence Alignment/methods; Sequence Analysis, DNA/methods; Algorithms; Computer Graphics; Databases, Genetic
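
As a rough illustration of the semiglobal alignment idea in the abstract above (a short read must be aligned in full, while leading and trailing parts of the reference are free), the following minimal Python sketch fills the dynamic-programming table row by row. It uses plain unit edit costs as a simplifying assumption and is not the GapsMis bounded-gaps scoring or the GPU algorithm of the paper; the function name `semiglobal_distance` is hypothetical.

```python
from typing import Tuple


def semiglobal_distance(read: str, reference: str) -> Tuple[int, int]:
    """Return (best_cost, end_position) for aligning the whole read
    against any substring of the reference, using unit edit costs."""
    m, n = len(read), len(reference)
    # Row 0 is all zeros: the alignment may start anywhere in the reference.
    prev = [0] * (n + 1)
    for i in range(1, m + 1):
        # Column 0: the first i read characters aligned against nothing.
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            mismatch = 0 if read[i - 1] == reference[j - 1] else 1
            curr[j] = min(
                prev[j - 1] + mismatch,  # match or substitution
                prev[j] + 1,             # read character against a gap
                curr[j - 1] + 1,         # reference character against a gap
            )
        prev = curr
    # The alignment may end anywhere: take the cheapest cell of the last row.
    best_end = min(range(n + 1), key=lambda j: prev[j])
    return prev[best_end], best_end
```

Each cell depends only on its left, upper, and upper-left neighbours, so cells on the same anti-diagonal are independent of one another; this is the usual reason table-filling steps of this kind are considered for parallel hardware, whereas tracing back through the table is inherently sequential.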
2.
Article in English | MEDLINE | ID: mdl-29364851

ABSTRACT

Affected by regular tides, bidirectional water flows play a crucial role in surface river systems. Using optimization theory to design a water quality monitoring network can eliminate redundant monitoring nodes and reduce the costs of building and running the network. A novel algorithm is proposed to design an optimum water quality monitoring network for tidal rivers with bidirectional water flows. Our optimization algorithm uses two objectives: minimum pollution detection time and maximum pollution detection probability. We modify the Multi-Objective Particle Swarm Optimization (MOPSO) algorithm and develop new fitness functions to calculate pollution detection time and pollution detection probability in a discrete manner. In addition, the Storm Water Management Model (SWMM) is used to simulate hydraulic characteristics and pollution events based on a hypothetical river system studied in the literature. Experimental results show that our algorithm can obtain a better Pareto frontier. The influence of bidirectional water flows on the network design, which has not been studied in the literature, is also identified. We also find that the probability of bidirectional water flows has no effect on the optimum monitoring network design but slightly changes the mean pollution detection time.


Subjects
Algorithms; Environmental Monitoring/methods; Rivers; Water Pollution/analysis; Water Quality
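
The two objectives named in the abstract above (minimise detection time, maximise detection probability) define a Pareto frontier over candidate network designs. The Python sketch below only shows how non-dominated candidates can be filtered out of a list; it is not the modified MOPSO algorithm of the paper, and the function name and the sample values are hypothetical.

```python
def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates.

    Each candidate is a (detection_time, detection_probability) pair;
    lower time is better and higher probability is better.
    """
    def dominates(a, b):
        # a dominates b if it is at least as good in both objectives
        # and strictly better in at least one.
        no_worse = a[0] <= b[0] and a[1] >= b[1]
        strictly_better = a[0] < b[0] or a[1] > b[1]
        return no_worse and strictly_better

    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical designs: (mean detection time in minutes, detection probability)
designs = [(10, 0.90), (12, 0.95), (15, 0.90), (8, 0.50)]
front = pareto_front(designs)
```

Here `(15, 0.90)` is dropped because `(10, 0.90)` detects equally often but faster; the remaining three designs trade time against probability and none is strictly preferable to another.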
3.
ScientificWorldJournal ; 2013: 230471, 2013.
Article in English | MEDLINE | ID: mdl-24298205

ABSTRACT

Precise photovoltaic (PV) behavior models are normally described by nonlinear analytical equations. To solve such equations, it is necessary to use iterative procedures. Aiming to make the computation easier, this paper proposes an approximate single-diode PV model that enables high-speed predictions for the electrical characteristics of commercial PV modules. Based on the experimental data, statistical analysis is conducted to validate the approximate model. Simulation results show that the calculated current-voltage (I-V) characteristics fit the measured data with high accuracy. Furthermore, compared with the existing modeling methods, the proposed model reduces the simulation time by approximately 30% in this work.


Subjects
Electric Power Supplies; Models, Theoretical; Computer Simulation
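
To illustrate why the abstract above says that precise PV models need iterative procedures: in the standard single-diode model, I = Iph - I0*(exp((V + I*Rs)/a) - 1) - (V + I*Rs)/Rsh, the current I appears on both sides, so it cannot be isolated in closed form with elementary functions. The Python sketch below solves it by Newton's method. It shows the conventional implicit model only, not the approximate model proposed in the paper, and all parameter values are hypothetical.

```python
import math


def residual(i, v, iph, i0, rs, rsh, a):
    """Single-diode equation rearranged as f(I) = 0."""
    return iph - i0 * (math.exp((v + i * rs) / a) - 1.0) - (v + i * rs) / rsh - i


def solve_current(v, iph, i0, rs, rsh, a, tol=1e-12, max_iter=100):
    """Solve the implicit single-diode equation for I at voltage v."""
    i = iph  # start from the photocurrent as an initial guess
    for _ in range(max_iter):
        e = math.exp((v + i * rs) / a)
        f = iph - i0 * (e - 1.0) - (v + i * rs) / rsh - i
        df = -i0 * rs / a * e - rs / rsh - 1.0  # derivative of f w.r.t. I
        step = f / df
        i -= step
        if abs(step) < tol:
            break
    return i


# Hypothetical module parameters (not taken from the paper):
# photocurrent [A], saturation current [A], series and shunt resistance [ohm],
# and a = ideality factor * number of cells * thermal voltage [V].
PARAMS = dict(iph=8.0, i0=1e-9, rs=0.2, rsh=300.0, a=1.3 * 60 * 0.02585)

iv_curve = [(v, solve_current(v, **PARAMS)) for v in (0.0, 10.0, 20.0, 30.0, 35.0)]
```

Every point on the I-V curve costs several Newton iterations, which is the per-point overhead an explicit approximate model avoids.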
4.
BMC Genomics ; 14: 847, 2013 Dec 03.
Article in English | MEDLINE | ID: mdl-24299161

ABSTRACT

BACKGROUND: The filamentous fungus Aspergillus nidulans has been a tractable model organism for cell biology and genetics for over 60 years. It is among a large number of Aspergilli whose genomes have been sequenced since 2005, including medically and industrially important species. In order to advance our knowledge of its biology and increase its utility as a genetic model by improving gene annotation we sequenced the transcriptome of A. nidulans with a focus on 5' end analysis. RESULTS: Strand-specific whole transcriptome sequencing showed that 80-95% of annotated genes appear to be expressed across the conditions tested. We estimate that the total gene number should be increased by approximately 1000, to 11,800. With respect to splicing 8.3% of genes had multiple alternative transcripts, but alternative splicing by exon-skipping was very rare. 75% of annotated genes showed some level of antisense transcription and for one gene, meaB, we demonstrated the antisense transcript has a regulatory role. Specific sequencing of the 5' ends of transcripts was used for genome wide mapping of transcription start sites, allowing us to interrogate over 7000 promoters and 5' untranslated regions. CONCLUSIONS: Our data has revealed the complexity of the A. nidulans transcriptome and contributed to improved genome annotation. The data can be viewed on the AspGD genome browser.


Subjects
Aspergillus nidulans/genetics; Gene Expression Profiling; Gene Expression Regulation, Fungal; Promoter Regions, Genetic; Transcriptome; 5' Untranslated Regions; Alternative Splicing; Base Sequence; High-Throughput Nucleotide Sequencing; Introns; Molecular Sequence Annotation; Molecular Sequence Data; Nucleotide Motifs; Open Reading Frames; Position-Specific Scoring Matrices; RNA, Antisense; RNA, Untranslated/genetics; Sequence Alignment; Transcription Initiation Site; Transcription, Genetic
5.
PLoS One ; 6(6): e21507, 2011.
Article in English | MEDLINE | ID: mdl-21720552

ABSTRACT

Binding of calcium ions (Ca²⁺) to proteins can have profound effects on their structure and function. Common roles of calcium binding include structure stabilization and regulation of activity. It is known that diverse families, EF-hands being one of at least twelve, use a Dx[DN]xDG linear motif to bind calcium in near-identical fashion. Here, four novel structural contexts for the motif are described. Existing experimental data for one of them, a thermophilic archaeal subtilisin, demonstrate for the first time a role for Dx[DN]xDG-bound calcium in protein folding. An integrin-like embedding of the motif in the blade of a β-propeller fold, here named the calcium blade, is discovered in structures of bacterial and fungal proteins. Furthermore, sensitive database searches suggest a common origin for the calcium blade in β-propeller structures of different sizes and a pan-kingdom distribution of these proteins. Factors favouring the multiple convergent evolution of the motif appear to include its general Asp-richness, the regular spacing of the Asp residues and the fact that change of Asp into Gly and vice versa can occur through a single nucleotide change. Among the known structural contexts for the Dx[DN]xDG motif, only the calcium blade and the EF-hand are currently found intracellularly in large numbers, perhaps because the higher extracellular concentration of Ca²⁺ allows for easier fixing of newly evolved motifs that have acquired useful functions. The analysis presented here will inform ongoing efforts toward prediction of similar calcium-binding motifs from sequence information alone.


Subjects
Calcium-Binding Proteins/chemistry; Calcium-Binding Proteins/metabolism; Evolution, Molecular; Amino Acid Motifs; Amino Acid Sequence; Calcium/metabolism; Calmodulin/chemistry; Calmodulin/metabolism; Databases, Protein; Humans; Models, Molecular; Molecular Sequence Data; Structure-Activity Relationship
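
The Dx[DN]xDG motif in the abstract above reads as: Asp, any residue, Asp or Asn, any residue, Asp, Gly. As a minimal sketch of how such a linear motif can be located in a one-letter protein sequence, the Python snippet below translates it into a regular expression. The function name and the toy sequence are hypothetical, and a sequence hit alone does not establish calcium binding.

```python
import re

# Dx[DN]xDG as a regular expression; "x" (any residue) becomes ".".
# The lookahead wrapper lets overlapping occurrences be reported as well.
MOTIF = re.compile(r"(?=(D.[DN].DG))")


def find_motif(sequence):
    """Return (1-based start position, matched residues) for every hit."""
    return [(m.start() + 1, m.group(1))
            for m in MOTIF.finditer(sequence.upper())]


# Toy sequence made up for illustration, not a real protein.
hits = find_motif("MKDKDGDGYISAAE")
```

For the toy sequence, `hits` contains a single occurrence, `DKDGDG`, starting at residue 3.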
6.
J Theor Biol ; 248(3): 512-21, 2007 Oct 07.
Article in English | MEDLINE | ID: mdl-17628606

ABSTRACT

DNA microarray technology, originally developed to measure the level of gene expression, has become one of the most widely used tools in genomic study. The crux of microarray design lies in how to select a unique probe that distinguishes a given genomic sequence from other sequences. Because of its significance, probe selection has attracted a lot of attention, and various probe selection algorithms have been developed in recent years. Good probe selection algorithms should produce a small number of candidate probes. Efficiency is also crucial because the data involved are usually huge. Most existing algorithms are not sufficiently selective and return quite a large number of probes. We propose a new direction to tackle the problem and give an efficient algorithm based on randomization to select a small set of probes, and we demonstrate that such a small set of probes is sufficient to distinguish each sequence from all the other sequences. Based on the algorithm, we have developed the probe selection software RandPS, which runs efficiently in practice. The software is available on our website (http://www.csc.liv.ac.uk/~cindy/RandPS/RandPS.htm). We test our algorithm via experiments on different genomes (Escherichia coli, Saccharomyces cerevisiae, etc.), and our algorithm is able to output unique probes for most of the genes efficiently. The other genes can be identified by a combination of at most two probes.


Subjects
Algorithms; DNA Probes/genetics; Oligonucleotide Array Sequence Analysis/methods; Animals; Arabidopsis/genetics; Base Sequence; Chromosomes, Human, Pair 1/genetics; Chromosomes, Mammalian/genetics; Escherichia coli/genetics; Genes, Bacterial/genetics; Genes, Fungal/genetics; Humans; Mice; Neurospora crassa/genetics; Random Allocation; Saccharomyces cerevisiae/genetics; Schizosaccharomyces/genetics; Software; Time Factors
7.
Bioinformatics ; 21(10): 2271-8, 2005 May 15.
Article in English | MEDLINE | ID: mdl-15746277

ABSTRACT

MOTIVATION: For the purpose of locating conserved genes on a whole-genome scale, this paper proposes a new structural optimization problem called the Mutated Subsequence Problem, which takes into account possible mutations between two species (in the form of reversals and transpositions) when comparing the genomes. RESULTS: A practical algorithm called the mutated subsequence algorithm (MSS) is devised to solve this optimization problem, and it has been evaluated using different pairs of human and mouse chromosomes, and different pairs of virus genomes of Baculoviridae. MSS is found to be effective and efficient; in particular, MSS can reveal >90% of the conserved genes of human and mouse that have been reported in the literature. When compared with the existing software tools MUMmer and MaxMinCluster, MSS uncovers 14% and 7% more genes on average, respectively. Furthermore, this paper shows a hybrid approach to integrate MUMmer or MaxMinCluster with MSS, which has better performance and reliability.


Subjects
Algorithms; Chromosome Mapping/methods; Conserved Sequence/genetics; DNA Mutational Analysis/methods; Evolution, Molecular; Sequence Alignment/methods; Sequence Analysis, DNA/methods; Animals; Humans; Mice; Mutation; Sequence Homology, Nucleic Acid
8.
J Bioinform Comput Biol ; 3(1): 1-18, 2005 Feb.
Article in English | MEDLINE | ID: mdl-15751109

ABSTRACT

The constrained multiple sequence alignment problem is to align a set of sequences of maximum length n subject to a given constrained sequence, which arises from some knowledge of the structure of the sequences. This paper presents new algorithms for this problem, which are more efficient in terms of time and space (memory) than the previous algorithms, and which come with a worst-case guarantee on the quality of the alignment. Reducing the space requirement by a quadratic factor is particularly significant, as the previous O(n^4)-space algorithm has limited application due to its huge memory requirement. Experiments on real data sets confirm that our new algorithms show improvements in both alignment quality and resource requirements.


Subjects
Algorithms; Ribonucleases/analysis; Ribonucleases/chemistry; Sequence Alignment/methods; Sequence Analysis/methods; Conserved Sequence; Sequence Homology
9.
Bioinformatics ; 21(2): 144-51, 2005 Jan 15.
Article in English | MEDLINE | ID: mdl-15333460

ABSTRACT

MOTIVATION: Short interfering RNAs (siRNAs) can be used to suppress gene expression and possess many potential applications in therapy, but how to design an effective siRNA is still not clear. Based on the MPI (Max-Planck-Institute) basic principles, a number of siRNA design tools have been developed recently. The set of candidates reported by these tools is usually large and often contains ineffective siRNAs. In view of this, we initiate the study of filtering ineffective siRNAs. RESULTS: The contribution of this paper is twofold. First, we propose a fair scheme to compare existing design tools based on real data in the literature. Second, we attempt to improve the MPI principles and existing tools by an algorithm that can filter ineffective siRNAs. The algorithm is based on some new observations on the secondary structure, which we have verified by AI techniques (decision trees and support vector machines). We have tested our algorithm together with the MPI principles and the existing tools. The results show that our filtering algorithm is effective. AVAILABILITY: The siRNA design software tool can be found at the website http://www.cs.hku.hk/~sirna/ CONTACT: smyiu@cs.hku.hk


Subjects
Algorithms; Artificial Intelligence; Computer-Aided Design; Models, Molecular; RNA, Small Interfering/chemistry; Sequence Alignment/methods; Sequence Analysis, RNA; Base Sequence; Benchmarking/methods; Genetic Engineering/methods; Models, Chemical; Molecular Sequence Data; RNA, Small Interfering/classification; RNA, Small Interfering/genetics; Software
10.
Bioinformatics ; 20(16): 2676-84, 2004 Nov 01.
Article in English | MEDLINE | ID: mdl-15145812

ABSTRACT

MOTIVATION: This paper is concerned with algorithms for aligning two whole genomes so as to identify regions that possibly contain conserved genes. Motivated by existing heuristic-based software tools, we initiate the study of an optimization problem that attempts to uncover conserved genes with a global concern. Another interesting feature in our formulation is the tolerance of noise, which also complicates the optimization problem. A brute-force approach takes time exponential in the noise level. RESULTS: We show how an insight into the optimization structure can lead to a drastic improvement in the time and space requirement [precisely, to O(k^2 n^2) and O(k^2 n), respectively, where n is the size of the input and k is the noise level]. The reduced space requirement allows us to implement the new algorithm, called MaxMinCluster, on a PC. When tested with different real data sets, MaxMinCluster consistently uncovers a high percentage of conserved genes that have been published by GenBank. Its performance compares favorably with that of MUMmer (perhaps the most popular software tool for uncovering conserved genes on a whole-genome scale). AVAILABILITY: The source code is available from the website http://www.csis.hku.hk/~colly/maxmincluster/ and detailed proof of the propositions can also be found there.


Subjects
Algorithms; Chromosome Mapping/methods; Sequence Alignment/methods; Sequence Analysis, DNA/methods; Sequence Analysis, Protein/methods; Cluster Analysis; Conserved Sequence/genetics; Pattern Recognition, Automated/methods; Sequence Homology, Amino Acid; Sequence Homology, Nucleic Acid; Stochastic Processes