Results 1 - 20 of 39
1.
Sci Rep ; 14(1): 6521, 2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38499637

ABSTRACT

Grid computing emerged as a powerful computing domain for running large-scale parallel applications. Scheduling computationally intensive parallel applications, such as scientific and commercial workloads, on computational grids is an NP-complete problem. Many researchers have proposed task scheduling algorithms for grids that formulate and solve scheduling as an optimization problem with objective functions such as makespan, cost, and energy. To address the needs of both users (lower cost, lower latency) and grid service providers (high utilization and high profitability), a task scheduler must be designed as a multi-objective optimization problem because of the trade-offs among these objectives. In this direction, we propose an efficient multi-objective task scheduling framework for computationally intensive tasks on heterogeneous grid networks. The framework minimizes turnaround time, communication cost, and execution cost while maximizing grid utilization. We evaluated the performance of the proposed algorithm through experiments on standard, random, and scientific task graphs using the GridSim simulator.
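To make the trade-off concrete, the following minimal Python sketch scores a candidate task-to-node assignment by combining makespan, total cost, and utilization in a weighted sum; the weights, data layout, and function names are illustrative assumptions, not the framework proposed in the paper.

```python
# Illustrative sketch only: scoring a candidate task-to-node assignment with a
# weighted sum of makespan, total cost, and (negated) utilization. The weights,
# data layout, and names are assumptions, not the paper's framework.

def evaluate_assignment(assignment, exec_time, cost_rate,
                        w_makespan=0.4, w_cost=0.3, w_util=0.3):
    """assignment: {task: node}; exec_time[task][node]: seconds; cost_rate[node]: cost/second."""
    node_busy = {}
    total_cost = 0.0
    for task, node in assignment.items():
        t = exec_time[task][node]
        node_busy[node] = node_busy.get(node, 0.0) + t
        total_cost += t * cost_rate[node]
    makespan = max(node_busy.values())
    # Average busy fraction of the nodes that received work (to be maximized).
    utilization = sum(node_busy.values()) / (makespan * len(node_busy))
    # Lower score is better, so utilization enters with a negative sign.
    return w_makespan * makespan + w_cost * total_cost - w_util * utilization
```

A scheduler built on such a score would compare candidate assignments and keep the one with the lowest combined value; other scalarizations (e.g., Pareto ranking) are equally possible.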

2.
Sci Total Environ ; 860: 160506, 2023 Feb 20.
Article in English | MEDLINE | ID: mdl-36442625

ABSTRACT

Pathogenic bacteria pose a great threat to global public health from environmental and public health perspectives, especially given the worldwide impact of the COVID-19 pandemic. As a result, the increased risk of pathogenic bioaerosol exposure imposes a considerable health burden and raises specific concerns about the layout and location of vaccine manufacturers. This study proposed a grid computing method based on the CALPUFF modelling system and population-based environmental risks to reduce bioaerosol-related risks. We previously used the CALPUFF model to quantify the diffusion level, the spatial distribution of emissions, and the potential environmental risks of bioaerosol leakage at the Zhongmu Lanzhou biopharmaceutical plant in Gansu province from July 24, 2019, to August 20, 2019, and confirmed the model's credibility against publicly available test data. Building on that work, we combined the CALPUFF model with population-based environmental risks in two scenarios, layout and site selection, using the leakage accident at the Zhongmu Lanzhou plant as a case study. Our results showed that the site selection method of scenario 2, coupled with a buffer area, was more reasonable than that of scenario 1, with grid cell 157 identified as the optimal layout point. The simulation results agreed with the actual survey. Our findings could assist bioaerosol manufacturers worldwide in developing appropriate layout and site selection strategies to reduce potential environmental risks related to bioaerosols.


Subjects
Biological Products , COVID-19 , Humans , Pandemics , Respiratory Aerosols and Droplets , Public Health , China
3.
Sensors (Basel) ; 22(10)2022 May 15.
Article in English | MEDLINE | ID: mdl-35632175

ABSTRACT

Recently, the number of users and the demand for live-streaming services have increased. This has exponentially increased the traffic to such services, and live-streaming platforms in Korea use a grid computing system that distributes traffic among users to reduce traffic loads. However, ensuring security in a grid computing system is difficult because the system exchanges general user traffic in a peer-to-peer (P2P) manner instead of receiving data only from an authenticated server. Therefore, in this study, to explore the vulnerabilities of a grid computing system, we investigated a vulnerability discovery framework that involves a three-step analysis process and eight detailed activities. Four types of zero-day vulnerabilities, namely video stealing, information disclosure, denial of service, and remote code execution, were derived by analyzing a representative live-streaming platform in Korea that uses grid computing.


Subjects
Computer Systems , Computers , Republic of Korea
4.
Data Brief ; 42: 108104, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35434224

ABSTRACT

To take advantage of the computing power offered by grid and opportunistic resources, the CERN Large Hadron Collider (LHC) experiments have adopted the Pilot-Job paradigm. In this work, we study the DIRAC Site Director, one of the existing Pilot-Job provisioning solutions, mainly developed and used by the beauty experiment (LHCb). The purpose is to improve the Pilot-Job submission rates and the throughput of the jobs on grid resources. To analyze the DIRAC Site Director mechanisms and assess our contributions, we collected data over 12 months from the LHCbDIRAC instance. We extracted data from the DIRAC databases and the logs. Data include (i) evolution of the number of Pilot-Jobs/jobs over time; (ii) slots available in grid Sites; (iii) number of jobs processed per Pilot-Job.

5.
Biomed Phys Eng Express ; 8(2)2022 02 01.
Article in English | MEDLINE | ID: mdl-35062008

ABSTRACT

Background and purpose. This work presents a strategy to simulate a clinical linear accelerator based on the geometry provided by the manufacturer and summarizes the corresponding experimental validation. Simulations were performed with the Geant4 Monte Carlo code in a grid computing environment. The objective of this contribution is to reproduce therapeutic dose distributions in a water phantom with an accuracy of better than 2%. Materials and methods. A Geant4 Monte Carlo model of an Elekta Synergy linear accelerator was established, and the simulations were launched on a large grid computing platform. Dose distributions were calculated for a 6 MV photon beam with treatment fields ranging from 5 × 5 cm² to 20 × 20 cm² at a source-surface distance of 100 cm. Results. A high degree of agreement is achieved between the simulation results and the measured data, with dose differences of about 1.03% and 1.96% for the percentage depth dose curves and lateral dose profiles, respectively. This agreement is evaluated by gamma index comparisons: over 98% of the points in all simulations meet the restrictive acceptability criterion of 2%/2 mm. Conclusion. We have demonstrated the feasibility of establishing an accurate linac head Monte Carlo model for dose distribution simulations and quality assurance. Percentage depth dose curves and beam quality indices agree with the measured data to better than 2%.
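As an illustration of the gamma-index comparison mentioned above, the sketch below computes a one-dimensional global gamma pass rate with 2%/2 mm criteria; the array layout and sampling assumptions are illustrative and this is not the authors' evaluation code.

```python
import numpy as np

# Minimal 1-D global gamma-index sketch (2%/2 mm criteria), assuming measured and
# simulated doses are sampled on the same spatial grid; not the authors' exact tool.

def gamma_pass_rate(positions_mm, dose_ref, dose_eval, dose_crit=0.02, dist_crit_mm=2.0):
    positions_mm = np.asarray(positions_mm, dtype=float)
    dose_ref = np.asarray(dose_ref, dtype=float)
    dose_eval = np.asarray(dose_eval, dtype=float)
    dose_norm = dose_crit * dose_ref.max()          # global 2% dose criterion
    passed = 0
    for x_r, d_r in zip(positions_mm, dose_ref):
        # Gamma at this reference point: minimum combined dose/distance metric
        # over all evaluated points.
        gamma_sq = ((positions_mm - x_r) / dist_crit_mm) ** 2 \
                 + ((dose_eval - d_r) / dose_norm) ** 2
        if np.sqrt(gamma_sq.min()) <= 1.0:
            passed += 1
    return 100.0 * passed / dose_ref.size
```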


Subjects
Particle Accelerators , Computer Simulation , Monte Carlo Method , Phantoms, Imaging , Radiotherapy Dosage
6.
Entropy (Basel) ; 22(12)2020 Dec 15.
Article in English | MEDLINE | ID: mdl-33333717

ABSTRACT

This paper applies an entropy-based fractal indexing scheme that enables fast indexing and querying in a grid environment. It addresses fault tolerance and load balancing through fractal-based management to make computational grids more effective and reliable. The fractal dimension of a cloud of points gives an estimate of the intrinsic dimensionality of the data in that space; the main drawback of this technique is its long computing time. The main contribution of this work is to investigate the effect of the fractal transform by adding an entropy-based R-tree index structure to existing grid computing models, yielding a balanced infrastructure with minimal faults. To this end, the work extends common scheduling algorithms, which are built on the physical grid structure, to a reduced logical network. The objective of this logical network is to reduce searching over grid paths, using arrival rate for load balance and path bandwidth for fault tolerance. Furthermore, an optimization search technique is used to enhance grid performance by finding the optimum number of nodes to extract from the logical grid. Experimental results indicate that the proposed model achieves better execution time, throughput, makespan, latency, load balancing, and success rate.
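As background for the fractal-dimension estimate mentioned in the abstract, the sketch below implements plain box counting over a point cloud; it is a generic illustration and does not reproduce the paper's entropy/R-tree indexing scheme.

```python
import numpy as np

# Generic box-counting sketch for estimating the fractal (intrinsic) dimension of
# a point cloud; it does not reproduce the paper's entropy/R-tree indexing scheme.

def box_counting_dimension(points, box_sizes=(1.0, 0.5, 0.25, 0.125)):
    points = np.asarray(points, dtype=float)
    mins = points.min(axis=0)
    counts = []
    for eps in box_sizes:
        cells = np.floor((points - mins) / eps).astype(int)
        counts.append(len({tuple(c) for c in cells}))   # occupied boxes at scale eps
    # The slope of log(count) versus log(1/eps) estimates the dimension.
    slope, _ = np.polyfit(np.log(1.0 / np.asarray(box_sizes)), np.log(counts), 1)
    return slope
```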

7.
Rep Pract Oncol Radiother ; 25(6): 1001-1010, 2020.
Article in English | MEDLINE | ID: mdl-33132765

ABSTRACT

AIM: To evaluate the computation time efficiency of the multithreaded code G4Linac-MT in a dosimetry application, using the high performance of the HPC-Marwan grid to determine with high accuracy the initial parameters of the 6 MV photon beam of a Varian CLINAC 2100C. BACKGROUND: The main disadvantage of Monte Carlo methods is their long computation time. MATERIALS AND METHODS: Calculations were performed with the multithreaded code G4Linac-MT and with Geant4.10.04.p02 on the HPC-Marwan computing grid to evaluate the computing speed of each code. The multithreaded version was tested on several CPU counts to evaluate how computing speed scales with the number of CPUs used. The results were compared to measurements using several metrics: TPR20,10, penumbra, mean dose error, and gamma index. RESULTS: The results indicate a much greater computing time saving for the G4Linac-MT version compared to the Geant4.10.04 version; the computing time decreases with the number of CPUs used, reaching a speedup of about 12 times with 64 CPUs. After optimization of the initial electron beam parameters, the simulated dose distributions are in very good agreement with the experimental measurements, with a mean dose error of up to 0.41% for the PDDs and 1.79% for the lateral dose profiles. CONCLUSIONS: The gain in computation time allows Monte Carlo simulations with a large number of events, which gives high accuracy in the dosimetry results obtained in this work.
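For context on the reported roughly 12-fold time saving at 64 CPUs, the snippet below shows the standard speedup and parallel-efficiency arithmetic; the 48 h / 4 h timings are made-up illustrative numbers, not measurements from the study.

```python
# Speedup and parallel-efficiency arithmetic for multithreaded Monte Carlo runs.
# The ~12x figure at 64 CPUs comes from the abstract; the 48 h / 4 h timings
# below are made-up numbers used only to illustrate the calculation.

def speedup_and_efficiency(t_serial_hours, t_parallel_hours, n_cpus):
    speedup = t_serial_hours / t_parallel_hours
    efficiency = speedup / n_cpus
    return speedup, efficiency

print(speedup_and_efficiency(48.0, 4.0, 64))   # -> (12.0, 0.1875)
```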

8.
PeerJ ; 8: e9762, 2020.
Article in English | MEDLINE | ID: mdl-32953263

ABSTRACT

BACKGROUND: A prime objective in metagenomics is to classify DNA sequence fragments into taxonomic units. This usually requires several stages: read quality control, de novo assembly, contig annotation, gene prediction, etc. These stages need very efficient programs because of the large number of reads produced by sequencing projects. Furthermore, the complexity of metagenomes requires efficient and automatic tools that orchestrate the different stages. METHOD: DATMA is a pipeline for fast metagenomic analysis that orchestrates the following: sequencing quality control, 16S rRNA identification, read binning, de novo assembly and evaluation, gene prediction, and taxonomic annotation. Its distributed computing model can use multiple computing resources to reduce the analysis time. RESULTS: We used a controlled experiment to show DATMA's functionality and two pre-annotated metagenomes to compare its accuracy and speed against other metagenomic frameworks. Then, with DATMA, we recovered a draft genome of a novel Anaerolineaceae from a biosolid metagenome. CONCLUSIONS: DATMA is a bioinformatics tool that automatically analyzes complex metagenomes. It is faster than similar tools and, in some cases, it can extract genomes that the other tools do not. DATMA is freely available at https://github.com/andvides/DATMA.
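The sketch below illustrates the general idea of orchestrating metagenomic stages (quality control, assembly, gene prediction) as a sequential pipeline of external tools; the commands and file names are placeholders and this is not DATMA's implementation, which additionally distributes work across multiple computing resources.

```python
import subprocess

# Hypothetical driver illustrating sequential stage orchestration (QC -> assembly
# -> gene prediction). The commands and file names are placeholders, not DATMA's
# internals, which additionally distribute work over multiple computing resources.

STAGES = [
    ("quality_control", ["fastqc", "reads.fastq"]),
    ("assembly",        ["megahit", "-r", "reads.fastq", "-o", "assembly"]),
    ("gene_prediction", ["prodigal", "-i", "assembly/final.contigs.fa", "-o", "genes.gbk"]),
]

def run_pipeline(stages=STAGES):
    for name, cmd in stages:
        print(f"[pipeline] running stage: {name}")
        subprocess.run(cmd, check=True)   # abort the pipeline if a stage fails

if __name__ == "__main__":
    run_pipeline()
```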

9.
Brief Bioinform ; 20(5): 1795-1811, 2019 09 27.
Article in English | MEDLINE | ID: mdl-30084865

ABSTRACT

There has been an exponential growth in the performance and output of sequencing technologies (omics data), with full genome sequencing now producing gigabases of reads on a daily basis. These data may hold the promise of personalized medicine, leading to routinely available sequencing tests that can guide patient treatment decisions. In the era of high-throughput sequencing (HTS), computational considerations, data governance, and clinical translation are the greatest rate-limiting steps. To ensure that the analysis, management, and interpretation of such extensive omics data are exploited to their full potential, key factors, including sample sourcing, technology selection, and computational expertise and resources, need to be considered, leading to an integrated set of high-performance tools and systems. This article provides an up-to-date overview of the evolution of HTS and the accompanying tools, infrastructure, and data management approaches that are emerging in this space, which, if used within a multidisciplinary context, may ultimately facilitate the development of personalized medicine.


Subjects
Biomedical Research , High-Throughput Nucleotide Sequencing/methods , Precision Medicine , Cloud Computing , Computational Biology , Computer Security , Ethics
10.
J Struct Biol X ; 1: 100006, 2019.
Article in English | MEDLINE | ID: mdl-32647812

ABSTRACT

The West-Life project (https://about.west-life.eu/) is a Horizon 2020 project funded by the European Commission to provide data processing and data management services for the international community of structural biologists, and in particular to support integrative experimental approaches within the field of structural biology. It has developed enhancements to existing web services for structure solution and analysis, created new pipelines to link these services into more complex higher-level workflows, and added new data management facilities. Through this work it has striven to make the benefits of European e-Infrastructures more accessible to life-science researchers in general and structural biologists in particular.

11.
Expert Opin Drug Discov ; 14(1): 9-22, 2019 01.
Article in English | MEDLINE | ID: mdl-30484337

ABSTRACT

INTRODUCTION: Computational chemistry dramatically accelerates the drug discovery process and high-performance computing (HPC) can be used to speed up the most expensive calculations. Supporting a local HPC infrastructure is both costly and time-consuming, and, therefore, many research groups are moving from in-house solutions to remote-distributed computing platforms. Areas covered: The authors focus on the use of distributed technologies, solutions, and infrastructures to gain access to HPC capabilities, software tools, and datasets to run the complex simulations required in computational drug discovery (CDD). Expert opinion: The use of computational tools can decrease the time to market of new drugs. HPC has a crucial role in handling the complex algorithms and large volumes of data required to achieve specificity and avoid undesirable side-effects. Distributed computing environments have clear advantages over in-house solutions in terms of cost and sustainability. The use of infrastructures relying on virtualization reduces set-up costs. Distributed computing resources can be difficult to access, although web-based solutions are becoming increasingly available. There is a trade-off between cost-effectiveness and accessibility in using on-demand computing resources rather than free/academic resources. Graphics processing unit computing, with its outstanding parallel computing power, is becoming increasingly important.


Subjects
Computational Chemistry/methods , Computer Simulation , Drug Discovery/methods , Algorithms , Animals , Computing Methodologies , Humans , Software , Time Factors
12.
Proc IEEE Int Conf Cloud Eng ; 2017: 127-137, 2017 Apr.
Article in English | MEDLINE | ID: mdl-28884169

ABSTRACT

Traditional in-house, laboratory-based medical imaging studies use hierarchical data structures (e.g., NFS file stores) or databases (e.g., COINS, XNAT) for storage and retrieval. The resulting performance from these approaches is, however, impeded by standard network switches since they can saturate network bandwidth during transfer from storage to processing nodes for even moderate-sized studies. To that end, a cloud-based "medical image processing-as-a-service" offers promise in utilizing the ecosystem of Apache Hadoop, which is a flexible framework providing distributed, scalable, fault-tolerant storage and parallel computational modules, and HBase, which is a NoSQL database built atop Hadoop's distributed file system. Despite this promise, HBase's load distribution strategy of region split and merge is detrimental to the hierarchical organization of imaging data (e.g., project, subject, session, scan, slice). This paper makes two contributions to address these concerns by describing key cloud engineering principles and technology enhancements we made to the Apache Hadoop ecosystem for medical imaging applications. First, we propose a row-key design for HBase, which is a necessary step that is driven by the hierarchical organization of imaging data. Second, we propose a novel data allocation policy within HBase to strongly enforce collocation of hierarchically related imaging data. The proposed enhancements accelerate data processing by minimizing network usage and localizing processing to machines where the data already exist. Moreover, our approach is amenable to the traditional scan-, subject-, and project-level analysis procedures, and is compatible with standard command-line/scriptable image processing software. Experimental results for an illustrative sample of imaging data reveal that our new HBase policy results in a three-fold time improvement in conversion of classic DICOM to NiFTI file formats when compared with the default HBase region split policy, and nearly a six-fold improvement over a commonly available network file system (NFS) approach even for relatively small file sets. Moreover, file access latency is lower than with network-attached storage.
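A minimal sketch of the row-key idea described above: encoding the project/subject/session/scan/slice hierarchy into a single lexicographically ordered key so that related rows stay collocated. Field order, separators, and padding widths are illustrative assumptions, not the paper's exact design or its custom allocation policy.

```python
# Sketch of a hierarchical row key that sorts related imaging data together in a
# lexicographically ordered store such as HBase. Field widths and separators are
# illustrative, not the paper's exact design or its region-allocation policy.

def make_row_key(project, subject, session, scan, slice_idx):
    # Zero-padding keeps lexicographic order consistent with numeric order.
    return "{p}:{subj}:{sess}:{scan:04d}:{sl:05d}".format(
        p=project, subj=subject, sess=session, scan=scan, sl=slice_idx)

# All slices of one scan share a key prefix, so they land in the same region
# and a prefix scan retrieves the whole scan in one contiguous read.
print(make_row_key("proj01", "sub-0042", "ses-01", 3, 128))
```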

13.
Neuroinformatics ; 15(1): 51-70, 2017 01.
Article in English | MEDLINE | ID: mdl-27655341

ABSTRACT

Simulations in neuroscience are performed on local servers or High Performance Computing (HPC) facilities. Recently, cloud computing has emerged as a potential computational platform for neuroscience simulation. In this paper we compare and contrast HPC and cloud resources for scientific computation, then report how we deployed NEURON, a widely used simulator of neuronal activity, in three clouds: Chameleon Cloud, a hybrid private academic cloud for cloud technology research based on the OpenStack software; Rackspace, a public commercial cloud, also based on OpenStack; and Amazon Elastic Compute Cloud, based on Amazon's proprietary software. We describe the manual procedures and how to automate cloud operations. We describe extending our simulation automation software called NeuroManager (Stockton and Santamaria, Frontiers in Neuroinformatics, 2015) so that the user can recruit private cloud, public cloud, HPC, and local servers simultaneously through a simple common interface. We conclude by performing several studies in which we examine speedup, efficiency, total session time, and cost for sets of simulations of a published NEURON model.


Subjects
Cloud Computing , Computer Simulation , Computing Methodologies , Software , Algorithms , Humans , Internet , Neurons/physiology , User-Computer Interface
14.
Front Neuroinform ; 10: 38, 2016.
Article in English | MEDLINE | ID: mdl-27610080

ABSTRACT

pypet (Python parameter exploration toolkit) is a new multi-platform Python toolkit for managing numerical simulations. Sampling the space of model parameters is a key aspect of simulations and numerical experiments. pypet is designed to allow easy and arbitrary sampling of trajectories through a parameter space beyond simple grid searches. pypet collects and stores both simulation parameters and results in a single HDF5 file. This collective storage allows fast and convenient loading of data for further analyses. pypet provides various additional features such as multiprocessing and parallelization of simulations, dynamic loading of data, integration of git version control, and supervision of experiments via the electronic lab notebook Sumatra. pypet supports a rich set of data formats, including native Python types, Numpy and Scipy data, Pandas DataFrames, and BRIAN(2) quantities. Besides these formats, users can easily extend the toolkit to allow customized data types. pypet is a flexible tool suited for both short Python scripts and large scale projects. pypet's various features, especially the tight link between parameters and results, promote reproducible research in computational neuroscience and simulation-based disciplines.

15.
PeerJ ; 4: e2248, 2016.
Article in English | MEDLINE | ID: mdl-27547555

ABSTRACT

Development of high-throughput technologies, such as next-generation sequencing, allows thousands of experiments to be performed simultaneously while reducing resource requirements. Consequently, a massive amount of experimental data is now rapidly generated. Nevertheless, the data are not readily usable or meaningful until they are further analysed and interpreted. Due to the size of the data, a high-performance computer (HPC) is required for the analysis and interpretation. However, an HPC is expensive and difficult to access. Other means, such as cloud computing services and grid computing systems, have been developed to give researchers the power of HPC without the need to purchase and maintain one. In this study, we implemented grid computing in a computer training center environment using the Berkeley Open Infrastructure for Network Computing (BOINC) as a job distributor and data manager, combining all desktop computers to virtualize an HPC. Fifty desktop computers were used to set up a grid system during off-hours. To test the performance of the grid system, we adapted the Basic Local Alignment Search Tool (BLAST) to the BOINC system. Sequencing results from an Illumina platform were aligned to the human genome database by BLAST on the grid system. The results and processing time were compared to those from a single desktop computer and an HPC. The estimated durations of BLAST analysis for 4 million sequence reads on a desktop PC, an HPC, and the grid system were 568, 24, and 5 days, respectively. Thus, the grid implementation of BLAST with BOINC is an efficient alternative to an HPC for sequence alignment. The grid implementation with BOINC also helped tap unused computing resources during off-hours and could easily be adapted to other available bioinformatics software.
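A small sketch of one practical step in such a setup: splitting an input FASTA file into fixed-size chunks that can be distributed as independent work units (for example, BOINC jobs) and aligned separately. Chunk size and file naming are illustrative; this is not the study's actual job-distribution code.

```python
# Sketch of splitting a FASTA file into fixed-size work units for distribution to
# grid workers (e.g., as BOINC jobs). Chunk size and file names are illustrative.

def split_fasta(path, reads_per_chunk=10000, prefix="workunit"):
    chunk, n_reads, n_chunks = [], 0, 0
    with open(path) as handle:
        for line in handle:
            if line.startswith(">") and n_reads == reads_per_chunk:
                _write_chunk(chunk, f"{prefix}_{n_chunks:05d}.fasta")
                chunk, n_reads, n_chunks = [], 0, n_chunks + 1
            if line.startswith(">"):
                n_reads += 1
            chunk.append(line)
    if chunk:
        _write_chunk(chunk, f"{prefix}_{n_chunks:05d}.fasta")

def _write_chunk(lines, name):
    with open(name, "w") as out:
        out.writelines(lines)
```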

16.
J Comput Aided Mol Des ; 30(7): 541-52, 2016 07.
Article in English | MEDLINE | ID: mdl-27438595

ABSTRACT

The trypanosomatid protozoa Leishmania is endemic in ~100 countries, with infections causing ~2 million new cases of leishmaniasis annually. Disease symptoms can include severe skin and mucosal ulcers, fever, anemia, splenomegaly, and death. Unfortunately, therapeutics approved to treat leishmaniasis are associated with potentially severe side effects, including death. Furthermore, drug-resistant Leishmania parasites have developed in most endemic countries. To address an urgent need for new, safe and inexpensive anti-leishmanial drugs, we utilized the IBM World Community Grid to complete computer-based drug discovery screens (Drug Search for Leishmaniasis) using unique leishmanial proteins and a database of 600,000 drug-like small molecules. Protein structures from different Leishmania species were selected for molecular dynamics (MD) simulations, and a series of conformational "snapshots" were chosen from each MD trajectory to simulate the protein's flexibility. A Relaxed Complex Scheme methodology was used to screen ~2000 MD conformations against the small molecule database, producing >1 billion protein-ligand structures. For each protein target, a binding spectrum was calculated to identify compounds predicted to bind with highest average affinity to all protein conformations. Significantly, four different Leishmania protein targets were predicted to strongly bind small molecules, with the strongest binding interactions predicted to occur for dihydroorotate dehydrogenase (LmDHODH; PDB:3MJY). A number of predicted tight-binding LmDHODH inhibitors were tested in vitro and potent selective inhibitors of Leishmania panamensis were identified. These promising small molecules are suitable for further development using iterative structure-based optimization and in vitro/in vivo validation assays.
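To illustrate the "binding spectrum" ranking described above, the sketch below averages predicted binding scores for each compound across MD conformations and sorts compounds by that average; the score-matrix layout is an assumption for illustration, not the study's actual screening pipeline.

```python
import numpy as np

# Sketch of ranking compounds by mean predicted binding affinity across MD
# snapshots (the "binding spectrum" idea). The score-matrix layout is hypothetical.

def rank_by_mean_affinity(scores, compound_ids):
    """scores: shape (n_compounds, n_conformations); more negative = tighter binding."""
    mean_affinity = np.asarray(scores, dtype=float).mean(axis=1)
    order = np.argsort(mean_affinity)              # most negative (tightest) first
    return [(compound_ids[i], float(mean_affinity[i])) for i in order]
```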


Subjects
Antiprotozoal Agents/chemistry , Leishmaniasis/drug therapy , Oxidoreductases Acting on CH-CH Group Donors/chemistry , Protozoan Proteins/chemistry , Small Molecule Libraries/chemistry , Antiprotozoal Agents/therapeutic use , Dihydroorotate Dehydrogenase , Humans , Leishmania/chemistry , Leishmania/drug effects , Leishmaniasis/parasitology , Ligands , Molecular Dynamics Simulation , Oxidoreductases Acting on CH-CH Group Donors/drug effects , Protein Binding/drug effects , Protozoan Proteins/drug effects , Small Molecule Libraries/therapeutic use , User-Computer Interface
17.
J Comput Aided Mol Des ; 30(3): 237-49, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26897747

ABSTRACT

Large-scale computing technologies have enabled high-throughput virtual screening involving thousands to millions of drug candidates. It is not trivial, however, for biochemical scientists to evaluate the technical alternatives and their implications for running such large experiments. Besides experience with the molecular docking tool itself, the scientist needs to learn how to run it on high-performance computing (HPC) infrastructures, and understand the impact of the choices made. Here, we review such considerations for a specific tool, AutoDock Vina, and use experimental data to illustrate the following points: (1) an additional level of parallelization increases virtual screening throughput on a multi-core machine; (2) capturing of the random seed is not enough (though necessary) for reproducibility on heterogeneous distributed computing systems; (3) the overall time spent on the screening of a ligand library can be improved by analysis of factors affecting execution time per ligand, including number of active torsions, heavy atoms and exhaustiveness. We also illustrate differences among four common HPC infrastructures: grid, Hadoop, small cluster and multi-core (virtual machine on the cloud). Our analysis shows that these platforms are suitable for screening experiments of different sizes. These considerations can guide scientists when choosing the best computing platform and set-up for their future large virtual screening experiments.
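A minimal sketch of the extra parallelization layer discussed above: running one single-threaded AutoDock Vina process per ligand across the cores of one machine, with a fixed seed recorded for reproducibility. The receptor, ligand paths, and box.conf file are hypothetical placeholders; --config, --receptor, --ligand, --out, --seed, --exhaustiveness, and --cpu are standard Vina options.

```python
import glob
import subprocess
from concurrent.futures import ProcessPoolExecutor

# Sketch: process-level parallelism over per-ligand Vina runs on a multi-core
# machine. Paths and box.conf are hypothetical; the flags are standard Vina options.

def dock_one(ligand, seed=42):
    out = ligand.replace(".pdbqt", "_out.pdbqt")
    cmd = ["vina", "--config", "box.conf", "--receptor", "receptor.pdbqt",
           "--ligand", ligand, "--out", out,
           "--seed", str(seed), "--exhaustiveness", "8", "--cpu", "1"]
    subprocess.run(cmd, check=True)   # one single-threaded Vina process per ligand
    return out

if __name__ == "__main__":
    ligands = sorted(glob.glob("ligands/*.pdbqt"))
    with ProcessPoolExecutor(max_workers=8) as pool:   # roughly one process per core
        results = list(pool.map(dock_one, ligands))
```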


Subjects
Computer-Aided Design , Drug Discovery , Software , Computer-Aided Design/economics , Databases, Pharmaceutical , Drug Discovery/economics , Drug Discovery/methods , Humans , Ligands , Molecular Docking Simulation , Nuclear Receptor Subfamily 4, Group A, Member 1/metabolism , Proteins/metabolism , Reproducibility of Results , Software/economics , User-Computer Interface
18.
J Mol Biol ; 428(4): 720-725, 2016 Feb 22.
Article in English | MEDLINE | ID: mdl-26410586

ABSTRACT

The prediction of the quaternary structure of biomolecular macromolecules is of paramount importance for fundamental understanding of cellular processes and drug design. In the era of integrative structural biology, one way of increasing the accuracy of modeling methods used to predict the structure of biomolecular complexes is to include as much experimental or predictive information as possible in the process. This has been at the core of our information-driven docking approach HADDOCK. We present here the updated version 2.2 of the HADDOCK portal, which offers new features such as support for mixed molecule types, additional experimental restraints and improved protocols, all of this in a user-friendly interface. With well over 6000 registered users and 108,000 jobs served, an increasing fraction of which on grid resources, we hope that this timely upgrade will help the community to solve important biological questions and further advance the field. The HADDOCK2.2 Web server is freely accessible to non-profit users at http://haddock.science.uu.nl/services/HADDOCK2.2.


Subjects
Computational Biology/methods , Macromolecular Substances/chemistry , Molecular Biology/methods , Internet
19.
J Adv Res ; 6(6): 987-93, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26644937

ABSTRACT

Scheduling tasks on heterogeneous resources distributed over a grid computing system is an NP-complete problem. Many researchers have developed scheduling algorithms that aim for optimality, and these algorithms perform well in selecting resources for tasks. However, exploiting the full power of the resources is still a challenge. In this paper, a new heuristic algorithm called Sort-Mid is proposed. It aims to maximize utilization and minimize makespan. The strategy of the Sort-Mid algorithm is to find appropriate resources. The first step is to compute, for each task, an average value from its sorted list of completion times. The maximum of these averages is then identified, and the task with the maximum average is allocated to the machine with the minimum completion time. The allocated task is removed from the list, and these steps are repeated until all tasks are allocated. Experimental tests show that the proposed algorithm outperforms most other algorithms in terms of resource utilization and makespan.
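A sketch of the selection loop as the abstract describes it: each unassigned task's completion times are summarized (the abstract says by the average of the sorted list; the algorithm's name suggests the middle value), the task with the largest summary is chosen, and it is placed on its minimum-completion-time machine. This is an illustrative reading of the abstract, not the published Sort-Mid implementation, and it omits details such as updating machine ready times.

```python
# Illustrative reading of the abstract's description of Sort-Mid; the assumed
# summary statistic and the omission of machine ready-time updates may differ
# from the published algorithm.

def sort_mid_schedule(completion_time):
    """completion_time[task][machine] -> estimated completion time."""
    def summary(task):
        times = sorted(completion_time[task].values())
        return sum(times) / len(times)      # assumed summary statistic (see lead-in)

    unassigned = set(completion_time)
    schedule = {}
    while unassigned:
        task = max(unassigned, key=summary)                        # largest summary first
        machine = min(completion_time[task], key=completion_time[task].get)
        schedule[task] = machine                                   # its fastest machine
        unassigned.remove(task)
    return schedule
```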

20.
Adv Appl Bioinform Chem ; 8: 23-35, 2015.
Article in English | MEDLINE | ID: mdl-26604801

ABSTRACT

Today's genomic experiments have to process so-called "biological big data", which is now reaching the size of terabytes and petabytes. To process this huge amount of data, scientists may require weeks or months if they use their own workstations. Parallelism techniques and high-performance computing (HPC) environments can be applied to reduce the total processing time and to ease the management, treatment, and analysis of these data. However, running bioinformatics experiments in HPC environments such as clouds, grids, clusters, and graphics processing units requires expertise from scientists to integrate computational, biological, and mathematical techniques and technologies. Several solutions have already been proposed to allow scientists to process their genomic experiments using HPC capabilities and parallelism techniques. This article presents a systematic literature review surveying the most recently published research involving genomics and parallel computing. Our objective is to gather the main characteristics, benefits, and challenges that scientists can consider when running their genomic experiments to benefit from parallelism techniques and HPC capabilities.
