
Massively Multithreaded Maxflow for Image Segmentation on the Cray XMT-2.

Bokhari, Shahid H; Çatalyürek, Ümit V; Gurcan, Metin N.

Concurr Comput ; 26(18): 2836-2855, 2014 Dec 01.

Article in English | MEDLINE | ID: mdl-25598745

ABSTRACT

Image segmentation is a very important step in the computerized analysis of digital images. The maxflow mincut approach has been successfully used to obtain minimum energy segmentations of images in many fields. Classical algorithms for maxflow in networks do not directly lend themselves to efficient parallel implementations on contemporary parallel processors. We present the results of an implementation of the Goldberg-Tarjan preflow-push algorithm on the Cray XMT-2 massively multithreaded supercomputer. This machine has hardware support for 128 threads in each physical processor, a uniformly accessible shared memory of up to 4 TB and hardware synchronization for each 64-bit word. It is thus well-suited to the parallelization of graph-theoretic algorithms, such as preflow-push. We describe the implementation of the preflow-push code on the XMT-2 and present the results of timing experiments on a series of synthetically generated as well as real images. Our results indicate very good performance on large images and pave the way for practical applications of this machine architecture for image analysis in a production setting. The largest images we have run are 32000² pixels in size, which are well beyond the largest previously reported in the literature.
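For readers unfamiliar with preflow-push, the following is a minimal sequential sketch in Python of the general technique (FIFO selection of active nodes). It is not the authors' XMT-2 code and makes no attempt at multithreading; the function name `preflow_push_maxflow` and the edge-list input format are assumptions chosen for illustration.

```python
from collections import deque

def preflow_push_maxflow(n, edges, source, sink):
    """Return the maximum flow value from source to sink.

    n: number of nodes, labelled 0..n-1.
    edges: list of (u, v, capacity) directed arcs (hypothetical input format).
    """
    # Residual capacities; every arc gets a reverse arc starting at 0.
    residual = [dict() for _ in range(n)]
    for u, v, cap in edges:
        residual[u][v] = residual[u].get(v, 0) + cap
        residual[v].setdefault(u, 0)

    height = [0] * n
    excess = [0] * n
    height[source] = n
    active = deque()

    # Initial preflow: saturate every arc leaving the source.
    for v, cap in list(residual[source].items()):
        if cap > 0 and v != source:
            residual[source][v] = 0
            residual[v][source] += cap
            excess[v] += cap
            if v != sink:
                active.append(v)

    while active:
        u = active.popleft()
        # Discharge u: push excess downhill, relabel when no push is possible.
        while excess[u] > 0:
            pushed = False
            for v, cap in residual[u].items():
                if cap > 0 and height[u] == height[v] + 1:
                    delta = min(excess[u], cap)
                    residual[u][v] -= delta
                    residual[v][u] += delta
                    excess[u] -= delta
                    excess[v] += delta
                    # v becomes active only if it had no excess before.
                    if v not in (source, sink) and excess[v] == delta:
                        active.append(v)
                    pushed = True
                    if excess[u] == 0:
                        break
            if not pushed:
                # Relabel: lift u just above its lowest residual neighbour.
                height[u] = 1 + min(height[v]
                                    for v, cap in residual[u].items() if cap > 0)
    return excess[sink]
```

In a segmentation setting, pixels would typically be the interior nodes and the two terminals would stand for the two labels, so that a minimum cut separates the image into two regions; that graph construction is not shown here.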

Reassortment Networks and the evolution of pandemic H1N1 swine-origin influenza.

Bokhari, Shahid H; Pomeroy, Laura W; Janies, Daniel A.

IEEE/ACM Trans Comput Biol Bioinform ; 9(1): 214-27, 2012.

Article in English | MEDLINE | ID: mdl-22076498

ABSTRACT

Prior research developed Reassortment Networks to reconstruct the evolution of segmented viruses under both reassortment and mutation. We report their application to the swine-origin pandemic H1N1 virus (S-OIV). A database of all influenza A viruses, for which complete genome sequences were available in Genbank by October 2009, was created and dynamic programming was used to compute distances between all corresponding segments. A reassortment network was created to obtain the minimum cost evolutionary paths from all viruses to the exemplar S-OIV A/California/04/2009. This analysis took 35 hours on the Cray Extreme Multithreading (XMT) supercomputer, which has special hardware to permit efficient parallelization. Six specific H1N1/H1N2 bottleneck viruses were identified that almost always lie on minimum cost paths to S-OIV. We conjecture that these viruses are crucial to S-OIV evolution and worthy of careful study from a molecular biology viewpoint. In phylogenetics, ancestors are typically medians that have no functional constraints. In our method, ancestors are not inferred, but rather chosen from previously observed viruses along a path of mutation and reassortment leading to the target virus. This specificity and functional constraint render our results actionable for further experiments in vitro and in vivo.

Subject(s)

Computer Communication Networks , Evolution, Molecular , Genome, Viral , Influenza A Virus, H1N1 Subtype/genetics , Influenza, Human/virology , Reassortant Viruses/genetics , Animals , Humans , Influenza, Human/epidemiology , Mutation , Pandemics , Sus scrofa , Swine

Reassortment networks for investigating the evolution of segmented viruses.

Bokhari, Shahid H; Janies, Daniel A.

IEEE/ACM Trans Comput Biol Bioinform ; 7(2): 288-98, 2010.

Article in English | MEDLINE | ID: mdl-20431148

ABSTRACT

Many viruses of interest, such as influenza A, have distinct segments in their genome. The evolution of these viruses involves mutation and reassortment, where segments are interchanged between viruses that coinfect a host. Phylogenetic trees can be constructed to investigate the mutation-driven evolution of individual viral segments. However, reassortment events among viral genomes are not well depicted in such bifurcating trees. We propose the concept of reassortment networks to analyze the evolution of segmented viruses. These are layered graphs in which the layers represent evolutionary stages such as a temporal series of seasons in which influenza viruses are isolated. Nodes represent viral isolates and reassortment events between pairs of isolates. Edges represent evolutionary steps, while weights on edges represent edit costs of reassortment and mutation events. Paths represent possible transformation series among viruses. The length of each path is the sum edit cost of the events required to transform one virus into another. In order to analyze τ stages of evolution of n viruses with segments of maximum length m, we first compute the pairwise distances between all corresponding segments of all viruses in O(m²n²) time using dynamic programming. The reassortment network, with O(τn²) nodes, is then constructed using these distances. The ancestors and descendents of a specific virus can be traced via shortest paths in this network, which can be found in O(τn³) time.
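The two computational steps named in this abstract, dynamic-programming segment distances and shortest paths through a layered graph, can be illustrated with a simplified Python sketch. It is a sketch under stated assumptions rather than the authors' construction: viruses are modelled as tuples of segment strings, the distance is plain Levenshtein edit distance, and the reassortment nodes for pairs of isolates are omitted, so only mutation steps between consecutive layers are considered. All function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (O(len(a) * len(b)))."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # delete ca
                            curr[j - 1] + 1,             # insert cb
                            prev[j - 1] + (ca != cb)))   # match or substitute
        prev = curr
    return prev[-1]

def virus_distance(u, v):
    """Sum of edit distances between corresponding segments of two viruses."""
    return sum(edit_distance(s, t) for s, t in zip(u, v))

def cheapest_path(layers, target):
    """Minimum-cost path from any isolate in layers[0] to target.

    layers: list of lists of viruses (tuples of segment strings), one list
    per evolutionary stage. The path visits one isolate per layer.
    Returns (cost, path). Reassortment nodes are NOT modelled here.
    """
    best = {v: (0, [v]) for v in layers[0]}
    for layer in layers[1:] + [[target]]:
        nxt = {}
        for v in layer:
            # Cheapest way to reach v from any isolate in the previous layer.
            cost, path = min(((c + virus_distance(u, v), p)
                              for u, (c, p) in best.items()),
                             key=lambda item: item[0])
            nxt[v] = (cost, path + [v])
        best = nxt
    return best[target]
```

This sketch recomputes distances on demand for brevity; precomputing all pairwise segment distances once, as the abstract describes, avoids the repeated work.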

Subject(s)

Evolution, Molecular , Influenza A virus/genetics , Models, Genetic , Reassortant Viruses/genetics , Algorithms , Animals , Birds , Gene Rearrangement , Humans , Phylogeny

A parallel graph decomposition algorithm for DNA sequencing with nanopores.

Bokhari, Shahid H; Sauer, Jon R.

Bioinformatics ; 21(7): 889-96, 2005 Apr 01.

Article in English | MEDLINE | ID: mdl-15539451

ABSTRACT

MOTIVATION: With the potential availability of nanopore devices that can sense the bases of translocating single-stranded DNA (ssDNA), it is likely that 'reads' of length approximately 10⁵ will be available in large numbers and at high speed. We address the problem of complete DNA sequencing using such reads. We assume that approximately 10² copies of a DNA sequence are split into single strands that break into randomly sized pieces as they translocate the nanopore in arbitrary orientations. The nanopore senses and reports each individual base that passes through, but all information about orientation and complementarity of the ssDNA subsequences is lost. Random errors (both biological and transduction) in the reads create further complications. RESULTS: We have developed an algorithm that addresses these issues. It can be considered an extreme variation of the well-known Eulerian path approach. It searches over a space of de Bruijn graphs until it finds one in which (a) the impact of errors is eliminated and (b) both possible orientations of the two ssDNA sequences can be identified separately and unambiguously. Our algorithm is able to correctly reconstruct real DNA sequences of the order of 10⁶ bases (e.g. the bacterium Mycoplasma pneumoniae) from simulated erroneous reads on a modest workstation in about 1 h. We describe, and give measured timings of, a parallel implementation of this algorithm on the Cray Multithreaded Architecture (MTA-2) supercomputer, whose architecture is ideally suited to this 'unstructured' problem. Our parallel implementation is crucial to the problem of rapidly sequencing long DNA sequences and also to the situation where multiple nanopores are used to obtain a high-bandwidth stream of reads.
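As background for the Eulerian path approach that this abstract builds on, the sketch below shows the textbook version in Python: build a de Bruijn graph from the k-mers of the reads and spell the sequence along an Eulerian path found with Hierholzer's algorithm. It assumes error-free reads that all come from the same strand, which are exactly the complications the authors' algorithm is designed to handle, so it should be read as an illustration of the baseline technique only. The function names and the choice of k are hypothetical.

```python
from collections import defaultdict

def de_bruijn_kmers(reads, k):
    """Collect the distinct k-mers in the reads; each one is an edge
    from its (k-1)-prefix to its (k-1)-suffix in the de Bruijn graph."""
    kmers = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmers.add(read[i:i + k])
    return sorted(kmers)

def eulerian_sequence(kmers):
    """Spell the sequence along an Eulerian path (Hierholzer's algorithm).

    Assumes the graph actually has an Eulerian path."""
    graph = defaultdict(list)
    balance = defaultdict(int)  # out-degree minus in-degree per node
    for kmer in kmers:
        u, v = kmer[:-1], kmer[1:]
        graph[u].append(v)
        balance[u] += 1
        balance[v] -= 1
    # Start at the node with one surplus outgoing edge, if there is one.
    start = next((node for node, d in balance.items() if d == 1),
                 kmers[0][:-1])
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if graph[node]:
            stack.append(graph[node].pop())  # follow an unused edge
        else:
            path.append(stack.pop())         # dead end: node is finished
    path.reverse()
    # The first node contributes k-1 bases, each later node one more base.
    return path[0] + "".join(node[-1] for node in path[1:])
```

A short usage example with three overlapping reads of a ten-base sequence:

```python
reads = ["ATGGCGT", "GCGTGCA", "GGCGTG"]
print(eulerian_sequence(de_bruijn_kmers(reads, 4)))
```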

Subject(s)

Algorithms , Chromosome Mapping/methods , Mycoplasma pneumoniae/genetics , Nanotechnology/methods , Oligonucleotide Array Sequence Analysis/methods , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Base Sequence , Computing Methodologies , DNA, Bacterial/analysis , DNA, Bacterial/genetics , Genome, Bacterial , Molecular Sequence Data , Numerical Analysis, Computer-Assisted

Parallelizing a DNA simulation code for the Cray MTA-2.

Bokhari, Shahid H; Glaser, Matthew A; Jordan, Harry F; Lansac, Yves; Sauer, Jon R; Van Zeghbroeck, Bart.

Proc IEEE Comput Soc Bioinform Conf ; 1: 291-302, 2002.

Article in English | MEDLINE | ID: mdl-15838145

ABSTRACT

The Cray MTA-2 (Multithreaded Architecture) is an unusual parallel supercomputer that promises ease of use and high performance. We describe our experience on the MTA-2 with a molecular dynamics code, SIMU-MD, that we are using to simulate the translocation of DNA through a nanopore in a silicon based ultrafast sequencer. Our sequencer is constructed using standard VLSI technology and consists of a nanopore surrounded by Field Effect Transistors (FETs). We propose to use the FETs to sense variations in charge as a DNA molecule translocates through the pore and thus differentiate between the four building block nucleotides of DNA. We were able to port SIMU-MD, a serial C code, to the MTA with only a modest effort and with good performance. Our porting process needed neither a parallelism support platform nor attention to the intimate details of parallel programming and interprocessor communication, as would have been the case with more conventional supercomputers.

Subject(s)

Biosensing Techniques/methods , Computing Methodologies , DNA/chemistry , Models, Chemical , Models, Molecular , Sequence Analysis, DNA/methods , Software , Algorithms , DNA/analysis , Diffusion , Membranes, Artificial , Motion , Nucleic Acid Conformation , Porosity , Structure-Activity Relationship
