Results 1 - 14 of 14
1.
Int J High Perform Comput Appl ; 37(1): 45-57, 2023 Jan.
Article in English | MEDLINE | ID: mdl-38603271

ABSTRACT

As a theoretically rigorous and accurate method, FEP-ABFE (Free Energy Perturbation-Absolute Binding Free Energy) calculation has shown great potential in drug discovery, but its practical application has been difficult because of its high computational cost. To rapidly discover antiviral drugs targeting SARS-CoV-2 Mpro and TMPRSS2, we performed FEP-ABFE-based virtual screening of ∼12,000 protein-ligand binding systems on a new-generation Tianhe supercomputer. A task management tool was developed specifically to automate the whole process, which involved more than 500,000 molecular dynamics (MD) tasks. In further experimental validation, 50 of the 98 tested compounds showed significant inhibitory activity towards Mpro, and one representative inhibitor, dipyridamole, showed remarkable outcomes in subsequent clinical trials. This work not only demonstrates the potential of FEP-ABFE in drug discovery but also provides an excellent starting point for further development of anti-SARS-CoV-2 drugs. In addition, the ∼500 TB of data generated in this work will accelerate the further development of FEP-related methods.

2.
J Comput Chem ; 43(2): 144-154, 2022 01 15.
Article in English | MEDLINE | ID: mdl-34747038

ABSTRACT

Biochemical simulation and analysis play a significant role in systems biology research, and numerous software tools have been developed to serve this area. Completing tasks with these tools, for example stochastic simulation, parameter fitting and optimization, usually requires substantial computational power to keep run times acceptable. COPASI is one of the most powerful tools for quantitative simulation and analysis of biological systems. It supports the Systems Biology Markup Language (SBML) and covers multiple categories of tasks. This work develops an open-source package, ParaCopasi, for running COPASI tasks in parallel and investigates the acceleration it achieves. ParaCopasi can be installed on platforms equipped with multicore CPUs to exploit the cores, scaling from desktop computers to large-scale high-performance computing clusters. Performance increases with the number of cores. The results show that parallel efficiency correlates positively with the total workload: it reaches at least 95% for both homogeneous and heterogeneous tasks when the computational workload is adequate. As an example, the package is applied to parameter estimation to calibrate a biochemical kinetics model.
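
The parallel efficiency quoted in this abstract is conventionally defined as the speedup divided by the number of cores. The sketch below illustrates that definition only; the function names and the timings are hypothetical and are not taken from ParaCopasi or the paper.

```python
def speedup(serial_seconds: float, parallel_seconds: float) -> float:
    """Speedup S = T_serial / T_parallel."""
    if serial_seconds <= 0 or parallel_seconds <= 0:
        raise ValueError("timings must be positive")
    return serial_seconds / parallel_seconds


def parallel_efficiency(serial_seconds: float, parallel_seconds: float, cores: int) -> float:
    """Efficiency E = S / p: the fraction of ideal linear speedup achieved on p cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return speedup(serial_seconds, parallel_seconds) / cores


# Hypothetical timings (assumption, not measured data from the paper).
example = parallel_efficiency(serial_seconds=1600.0, parallel_seconds=105.0, cores=16)
print(f"efficiency on 16 cores: {example:.1%}")
```

An efficiency near 1.0 means the cores are almost fully used. Small workloads tend to score lower because fixed start-up and scheduling costs are not amortized, which is consistent with the reported dependence on total workload.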

3.
Interdiscip Sci ; 14(1): 1-14, 2022 Mar.
Article in English | MEDLINE | ID: mdl-34487327

ABSTRACT

Rapid advances in sequencing technology have led to an explosion of sequence data. Sequence alignment is the central and fundamental problem in many sequence analysis procedures, and local alignment is often the kernel of these algorithms. The Smith-Waterman algorithm is usually used to find the best subsequence match between given sequences, but its high time complexity makes it time-consuming. Many approaches have been developed to accelerate and parallelize it, such as vector-level parallelization, thread-level parallelization, process-level parallelization, and heterogeneous acceleration. However, current research appears unsystematic, which hinders further work on parallelizing the algorithm. In this paper, we summarize the current state of research on parallel local alignment and describe the data layouts used in these works. Based on this survey, we emphasize large-scale genomic comparisons. By surveying the performance of some typical alignment tools, we discuss possible future directions. We hope our work will give developers of alignment tools a grounding in the underlying technical principles and help researchers choose suitable alignment tools.
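
For readers unfamiliar with the kernel this survey discusses, the following is a minimal, unoptimized sketch of Smith-Waterman local alignment scoring with a linear gap penalty. The scoring values are arbitrary assumptions chosen for illustration, and the code is not drawn from any of the surveyed tools.

```python
def smith_waterman_score(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score between a and b (linear gap penalty)."""
    # Only the previous row is kept, so memory is O(len(b)) rather than O(len(a) * len(b)).
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            # Clamping at 0 lets an alignment restart anywhere, which makes it local.
            curr[j] = max(0, diag, up, left)
            best = max(best, curr[j])
        prev = curr
    return best


print(smith_waterman_score("TTACGTTT", "GGACGTGG"))
```

Each cell depends on its left, upper and diagonal neighbours. That dependency pattern is what the vector-level and thread-level schemes mentioned in the abstract have to work around, typically by changing the order or layout in which cells are computed.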


Subjects
Algorithms , Software , Genomics , Sequence Alignment , Sequence Analysis/methods
4.
BMC Bioinformatics ; 22(1): 432, 2021 Sep 10.
Article in English | MEDLINE | ID: mdl-34507528

ABSTRACT

BACKGROUND: Interactions between microbes and diseases are of great importance for biomedical research. However, a large number of microbe-disease interactions remain hidden in the biomedical literature, and structured databases of such interactions are few. In this paper, we aim to construct a large-scale database of microbe-disease interactions automatically. We achieved this goal by applying text mining methods based on a deep learning model, at a moderate curation cost. We also built a user-friendly web interface that allows researchers to navigate and query the information they need. RESULTS: First, we manually constructed a gold-standard corpus and a silver-standard corpus (SSC) of microbe-disease interactions for curation. We then proposed a text mining framework for microbe-disease interaction extraction based on the pretrained model BERE. We applied named entity recognition tools to detect microbe and disease mentions in free biomedical text. After that, we fine-tuned BERE, which was originally built for drug-target and drug-drug interactions, to recognize relations between the targeted entities. Introducing the SSC for model fine-tuning greatly improved detection performance for microbe-disease interactions, with an average reduction in error of approximately 10%. The MDIDB website offers data browsing, custom searching for specific diseases or microbes, and batch downloading. CONCLUSIONS: Evaluation results demonstrate that our method outperforms the baseline model (rule-based PKDE4J), with an average [Formula: see text]-score of 73.81%. For further validation, we randomly sampled nearly 1000 interactions predicted by our model and manually checked the correctness of each, which gave an accuracy of 73%. The MDIDB website is freely available through http://dbmdi.com/index/.


Subjects
Biomedical Research , Data Mining , Machine Learning , Publications
5.
BMC Bioinformatics ; 22(1): 344, 2021 Jun 24.
Article in English | MEDLINE | ID: mdl-34167459

ABSTRACT

BACKGROUND: VISPR is an interactive visualization and analysis framework for CRISPR screening experiments. However, it only supports the output of MAGeCK, and requires installation and manual configuration. Furthermore, VISPR is designed to run on a single computer, and data sharing between collaborators is challenging. RESULTS: To make the tool easily accessible to the community, we present VISPR-online, a web-based general application allowing users to visualize, explore, and share CRISPR screening data online with a few simple steps. VISPR-online provides an exploration of screening results and visualization of read count changes. Apart from MAGeCK, VISPR-online supports two more popular CRISPR screening analysis tools: BAGEL and JACKS. It provides an interactive environment for exploring gene essentiality, viewing guide RNA (gRNA) locations, and allowing users to resume and share screening results. CONCLUSIONS: VISPR-online allows users to visualize, explore and share CRISPR screening data online. It is freely available at http://vispr-online.weililab.org , while the source code is available at https://github.com/lemoncyb/VISPR-online .


Subjects
Clustered Regularly Interspaced Short Palindromic Repeats , Software , Internet , RNA, Guide, Kinetoplastida , Research
6.
BMC Bioinformatics ; 21(1): 544, 2020 Nov 26.
Article in English | MEDLINE | ID: mdl-33243142

ABSTRACT

BACKGROUND: Elucidating the interactions between chemicals and genes is of key relevance not only for discovering new drug leads in drug development but also for repositioning existing drugs to novel therapeutic targets. Recently, biological network-based approaches have proven effective in predicting chemical-gene interactions. RESULTS: We present CGINet, a graph convolutional network-based method for identifying chemical-gene interactions in an integrated multi-relational graph containing three types of nodes: chemicals, genes, and pathways. We investigate two different perspectives on learning node embeddings. One views the graph as a whole. The other adopts a subgraph view, in which initial node embeddings are learned from the binary association subgraphs and then transferred to the multi-interaction subgraph for more focused learning of higher-level target node representations. In addition, we reconstruct the topological structures of target nodes with the latent links captured by the designed substructures. CGINet is trained end to end: the encoder and the decoder are trained jointly on known chemical-gene interactions. We aim to predict unknown but potential associations between chemicals and genes, as well as their interaction types. CONCLUSIONS: We study three model implementations, CGINet-1/2/3, with various components and compare them with baseline approaches. The experimental results suggest that our models perform competitively in identifying chemical-gene interactions. Moreover, the subgraph perspective and the latent links both contribute to learning more informative node embeddings and can lead to improved prediction.


Subjects
Algorithms , Epistasis, Genetic , Models, Genetic , Neural Networks, Computer , Area Under Curve , Gene Regulatory Networks , Humans
7.
Burns ; 46(8): 1829-1838, 2020 12.
Article in English | MEDLINE | ID: mdl-32826097

ABSTRACT

INTRODUCTION: Early judgment of burn depth is very important for formulating accurate treatment plans. In medical imaging, artificial intelligence has the potential to serve as a highly experienced assistant that improves early clinical diagnosis. Because large volumes of data for a particular feature are lacking, there has been almost no progress in the burn field. METHODS: A total of 484 early wound images were collected from patients who were discharged home after a burn injury in 48 h, at hospitals of five different levels in Hunan Province, China. According to actual healing time, all images were manually annotated by five professional burn surgeons and divided into three sets: shallow (0-10 days), moderate (11-20 days) and deep (more than 21 days or healing by skin graft). The regions of interest (ROIs) were further divided into 5637 patches of 224 × 224 pixels, of which 1733 were shallow, 1804 moderate, and 2100 deep. We used transfer learning with a pre-trained ResNet50 model and split all images in a 7:1.5:1.5 ratio for training:validation:test. RESULTS: A novel artificial burn depth recognition model based on a convolutional neural network was established, and its diagnostic accuracy for the three types of burns is about 80%. DISCUSSION: The actual healing time can be used to deduce the depth of burn involvement. The artificial burn depth recognition model can infer the patient's healing time and burn depth with about 80% accuracy and is expected to be used to improve auxiliary diagnosis.
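
The 7:1.5:1.5 training:validation:test ratio can be sketched as a per-class shuffled split. The class counts below are the patch counts reported in the abstract; everything else is an assumption for illustration, including the function name, the fixed seed, and the choice to split patches per class (the abstract does not say whether the split was made at image or patch level).

```python
import random


def split_indices(n_items: int, percents=(70, 15, 15), seed: int = 0):
    """Shuffle range(n_items) and cut it into train/validation/test index lists."""
    if sum(percents) != 100:
        raise ValueError("percents must sum to 100")
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Integer arithmetic avoids floating-point rounding surprises in the cut points.
    n_train = n_items * percents[0] // 100
    n_val = n_items * percents[1] // 100
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder, so no item is dropped
    return train, val, test


# Patch counts per class as reported in the abstract.
patch_counts = {"shallow": 1733, "moderate": 1804, "deep": 2100}
splits = {label: split_indices(count) for label, count in patch_counts.items()}

for label, (train, val, test) in splits.items():
    print(label, len(train), len(val), len(test))
```

When several patches come from the same wound image, splitting by image or by patient rather than by patch is the usual way to keep near-duplicate patches out of the test set.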


Subjects
Burns/classification , Burns/diagnostic imaging , Computer Systems/standards , Adult , Burns/epidemiology , China/epidemiology , Computer Systems/statistics & numerical data , Humans , Time Factors , Wound Healing/physiology
8.
BMC Genomics ; 21(Suppl 1): 872, 2020 Mar 05.
Article in English | MEDLINE | ID: mdl-32138651

ABSTRACT

BACKGROUND: The Type II clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated proteins (Cas) system is a powerful genome editing technology that is increasingly popular in gene function analysis. In CRISPR/Cas, an RNA guides the Cas nuclease to the target site to perform DNA modification. RESULTS: The performance of CRISPR/Cas depends on a well-designed single guide RNA (sgRNA). However, the off-target effect of sgRNA leads to undesired mutations in the genome and limits the use of CRISPR/Cas. Here, we present OffScan, a universal and fast CRISPR off-target detection tool. CONCLUSIONS: OffScan is not limited by the number of mismatches and allows a custom protospacer-adjacent motif (PAM), which is the site recognized by the Cas protein. In addition, OffScan adopts the FM-index, which improves query speed and reduces memory consumption.
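
The FM-index mentioned in this abstract supports pattern lookup by backward search over the Burrows-Wheeler transform. The toy sketch below shows exact matching only, with a naive suffix-array construction that would be far too slow for a genome; it is an illustration of the data structure, not OffScan's implementation, and it does not handle mismatches or PAM constraints.

```python
def build_fm_index(text: str):
    """Build a toy FM-index (suffix array, first-column table, occurrence table)."""
    if "$" in text:
        raise ValueError("text must not contain the sentinel '$'")
    text += "$"  # sentinel; sorts before A, C, G and T
    # Naive suffix array: acceptable for a demo only.
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in suffix_array)
    alphabet = sorted(set(text))
    # first[c] = number of characters in text that sort strictly before c
    first, total = {}, 0
    for c in alphabet:
        first[c] = total
        total += text.count(c)
    # occ[c][i] = number of occurrences of c in bwt[:i]
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return suffix_array, first, occ


def find_exact(pattern: str, index) -> list:
    """Return sorted start positions of pattern, found by backward search."""
    suffix_array, first, occ = index
    lo, hi = 0, len(suffix_array)  # half-open range of suffix-array rows
    for ch in reversed(pattern):
        if ch not in first:
            return []
        lo = first[ch] + occ[ch][lo]
        hi = first[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(suffix_array[i] for i in range(lo, hi))


index = build_fm_index("GATTACAGATTACA")
print(find_exact("ATTA", index))
```

Each pattern character narrows the row range with two table lookups, so query time depends on the pattern length rather than the genome length. Production indexes sample the occurrence table and suffix array to save memory instead of storing them in full as done here.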


Subjects
CRISPR-Cas Systems , Computational Biology/methods , Gene Editing/methods , RNA, Guide, Kinetoplastida/genetics , Algorithms , Animals , Caenorhabditis elegans/genetics , Clustered Regularly Interspaced Short Palindromic Repeats , Endonucleases/metabolism , Humans , Mice , Mutation , Zebrafish/genetics
9.
Article in English | MEDLINE | ID: mdl-29994638

ABSTRACT

Molecular dynamics (MD) is a computer simulation method for studying the physical movements of atoms and molecules, providing detailed microscopic sampling at the molecular scale. With continuous effort and improvement, MD simulation has gained popularity in materials science, biochemistry and biophysics, with various application areas and an expanding data scale. Assisted Model Building with Energy Refinement (AMBER) is one of the most widely used software packages for conducting MD simulations. However, the speed of AMBER MD simulations for systems with millions of atoms on the microsecond scale still needs to be improved. In this paper, we propose a parallel acceleration strategy for AMBER on the Tianhe-2 supercomputer. The parallel optimization of AMBER is carried out at three levels: fine-grained OpenMP parallelism on a single CPU, single-node CPU/MIC parallel optimization, and multi-node multi-MIC collaborative parallel acceleration. With this three-level parallel acceleration strategy, we achieved a maximum speedup of 25-33 times over the original program.


Subjects
Computational Biology/instrumentation , Computational Biology/methods , Molecular Dynamics Simulation , Algorithms , Computers
10.
Sensors (Basel) ; 19(9), 2019 May 05.
Article in English | MEDLINE | ID: mdl-31060279

ABSTRACT

When measurement rates grow, most Compressive Sensing (CS) methods suffer from increased overheads for transmitting and storing CS measurements, while reconstruction quality degrades appreciably when measurement rates decrease. To solve these problems in real scenarios such as large-scale distributed surveillance systems, we propose a low-cost image CS approach for object detection called MRCS. It predicts key objects using the proposed MYOLO3 detector and then samples the regions of the key objects, as well as other regions, at multiple measurement rates to reduce the size of the sampled CS measurements. It also stores and transmits half-precision CS measurements to further reduce the required transmission bandwidth and storage space. Comprehensive evaluations demonstrate that MYOLO3 is a smaller and improved object detector for resource-limited hardware devices such as surveillance cameras and aerial drones. They also suggest that MRCS significantly reduces the required transmission bandwidth and storage space by reducing the size of CS measurements; for example, the mean Compression Ratio (mCR) reaches 1.43-22.92 on the VOC-pbc dataset. Notably, MRCS further reduces the size of CS measurements through half-precision representation. Consequently, the required transmission bandwidth and storage space are halved compared with counterparts represented as single-precision floats. Moreover, with half-precision CS measurements and multiple measurement rates, it substantially enhances the usability of object detection on reconstructed images compared with its counterpart that uses a single low measurement rate.
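
The halving of storage from half-precision representation follows directly from the encoding widths: an IEEE 754 half-precision value takes 2 bytes and a single-precision value takes 4. The sketch below shows this with Python's standard `struct` module; the measurement values and function names are hypothetical, and this is not the MRCS code.

```python
import struct


def pack_measurements(values, half_precision: bool = True) -> bytes:
    """Serialize measurements as little-endian float16 ('e') or float32 ('f')."""
    code = "e" if half_precision else "f"
    return struct.pack(f"<{len(values)}{code}", *values)


def unpack_measurements(payload: bytes, half_precision: bool = True) -> list:
    """Inverse of pack_measurements; returns Python floats."""
    code, width = ("e", 2) if half_precision else ("f", 4)
    count = len(payload) // width
    return list(struct.unpack(f"<{count}{code}", payload))


# Hypothetical CS measurements (assumption, not data from the paper).
measurements = [0.5, -1.25, 3.1415926, 100.0]
half = pack_measurements(measurements, half_precision=True)
single = pack_measurements(measurements, half_precision=False)
restored = unpack_measurements(half, half_precision=True)

print(len(half), len(single))
print(restored)
```

The trade-off is precision: half precision keeps roughly three significant decimal digits, so values such as 3.1415926 come back slightly rounded. Whether that loss is acceptable depends on how sensitive the reconstruction is to measurement noise.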

11.
BMC Syst Biol ; 12(Suppl 6): 111, 2018 11 22.
Article in English | MEDLINE | ID: mdl-30463619

ABSTRACT

BACKGROUND: Although a large number of bioinformatics datasets are available for clustering, many of them are incomplete, i.e., some data samples lack attribute values that clustering algorithms need. A variety of clustering algorithms have been proposed in past years, but they are usually limited to clustering complete datasets. Moreover, conventional clustering algorithms cannot achieve a trade-off between the accuracy and efficiency of the clustering process, since many essential parameters are determined by the human user's experience. RESULTS: The paper proposes a Multiple Kernel Density Clustering algorithm for Incomplete datasets, called MKDCI. The MKDCI algorithm consists of recovering the missing attribute values of input data samples, learning an optimally combined kernel for clustering the input dataset, reducing dimensionality with the optimal kernel based on multiple basis kernels, detecting cluster centroids with the Isolation Forests method, assigning clusters of arbitrary shape, and visualizing the results. CONCLUSIONS: Extensive experiments on several well-known clustering datasets in the bioinformatics field demonstrate the effectiveness of the proposed MKDCI algorithm. Compared with existing density clustering algorithms and parameter-free clustering algorithms, MKDCI tends to automatically produce clusters of better quality on incomplete bioinformatics datasets.


Subjects
Computational Biology/methods , Algorithms , Cluster Analysis , Unsupervised Machine Learning
12.
Polymers (Basel) ; 10(4), 2018 Apr 01.
Article in English | MEDLINE | ID: mdl-30966421

ABSTRACT

Simulating the rheological behavior of polymer solutions is intrinsically a multi-scale problem. To study the macroscopic and microscopic characteristics of the fluid flow of dilute polymer solutions, we designed a multi-scale solver that couples Brownian Configuration Fields with the macroscopic hydrodynamic governing equations. Numerical simulation results from the multi-scale solver showed good agreement with the macroscopic-only approach. Using a scalar field D, we also quantitatively studied the flow behavior in 2D planar channels and analyzed the correlation between the molecular distribution and the macroscopic fluid flow in polymer solutions. Our results verified the correctness of the solver, which could provide valuable guidance for multi-scale simulations of complex fluids based on OpenFOAM.

13.
BMC Genomics ; 18(Suppl 2): 134, 2017 03 14.
Article in English | MEDLINE | ID: mdl-28361696

ABSTRACT

BACKGROUND: An increasing number of studies have used whole-genome DNA methylation detection, one of the most important parts of epigenetics research, to find significant relationships between DNA methylation and several typical diseases, such as cancers and diabetes. In many of those studies, mapping bisulfite-treated sequences to the whole genome has been the main method of studying DNA cytosine methylation. However, most of today's related tools suffer from inaccuracy and long running times. RESULTS: In our study, we designed a new DNA methylation prediction tool ("Hint-Hunt") to solve the problem. Using an optimal complex alignment computation and Smith-Waterman matrix dynamic programming, Hint-Hunt can analyze and predict DNA methylation status. However, when Hint-Hunt predicts DNA methylation status on large-scale datasets, it still suffers from slow speed and low temporal-spatial efficiency. To address the cost of Smith-Waterman dynamic programming and the low temporal-spatial efficiency, we further designed a deeply parallelized whole-genome DNA methylation detection tool ("P-Hint-Hunt") on the Tianhe-2 (TH-2) supercomputer. CONCLUSIONS: To the best of our knowledge, P-Hint-Hunt is the first parallel DNA methylation detection tool with a high speed-up for processing large-scale datasets, and it can run on both CPUs and Intel Xeon Phi coprocessors. Moreover, we deployed and evaluated Hint-Hunt and P-Hint-Hunt on the TH-2 supercomputer at different scales. The experimental results show that our tools eliminate the deviation caused by bisulfite treatment in the mapping procedure and that the multi-level parallel program yields a 48-fold speed-up with 64 threads. P-Hint-Hunt achieves substantial acceleration on the CPU and Intel Xeon Phi heterogeneous platform, taking full advantage of both multi-core (CPU) and many-core (Phi) processors.
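
The principle behind calling methylation from bisulfite-treated reads is that unmethylated cytosines are converted and read as T after amplification, whereas methylated cytosines still read as C. The sketch below applies that rule to an already aligned, ungapped, same-strand read. It is a simplified illustration with hypothetical function names and sequences, not the Hint-Hunt or P-Hint-Hunt algorithm, which also has to perform the alignment itself.

```python
def call_methylation(reference: str, read: str) -> dict:
    """Classify each reference cytosine covered by an ungapped, same-strand read."""
    if len(reference) != len(read):
        raise ValueError("this sketch expects an ungapped alignment of equal length")
    calls = {}
    for pos, (ref_base, read_base) in enumerate(zip(reference.upper(), read.upper())):
        if ref_base != "C":
            continue
        if read_base == "C":
            calls[pos] = "methylated"    # protected from bisulfite conversion
        elif read_base == "T":
            calls[pos] = "unmethylated"  # converted C, read as T
        else:
            calls[pos] = "mismatch"      # sequencing error or variant; not informative
    return calls


def methylation_level(calls: dict) -> float:
    """Fraction of informative cytosines that were called methylated."""
    informative = [c for c in calls.values() if c != "mismatch"]
    if not informative:
        return 0.0
    return informative.count("methylated") / len(informative)


# Hypothetical reference and read (assumption, for illustration only).
calls = call_methylation("ACGTCCGA", "ACGTTCGA")
print(calls, methylation_level(calls))
```

Because a T in the read may be either a genuine T or a converted C, an aligner has to treat C-to-T differences as allowed when mapping bisulfite-treated reads; this is the mapping deviation the abstract refers to.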


Subjects
Chromosome Mapping/methods , Computational Biology/methods , DNA Methylation , Epigenesis, Genetic , Software , Amino Acid Sequence , Base Sequence , Cytosine/metabolism , Genome, Human , Humans , Sequence Alignment , Sequence Analysis, DNA
14.
BMC Bioinformatics ; 18(Suppl 16): 578, 2017 12 28.
Article in English | MEDLINE | ID: mdl-29297301

ABSTRACT

BACKGROUND: Drug-drug interaction (DDI) extraction needs assistance from automated methods to cope with the explosive growth of biomedical texts. In recent years, deep neural network-based models have been developed to address this need, and they have made significant progress in relation identification. METHODS: We propose a dependency-based deep neural network model for DDI extraction. By introducing the dependency-based technique into a bi-directional long short-term memory network (Bi-LSTM), we build three channels: a Linear channel, a DFS channel and a BFS channel. Each channel is constructed from three network layers, which are, from the bottom up, an embedding layer, an LSTM layer and a max pooling layer. In the embedding layer, we extract two types of features: a distance-based feature and a dependency-based feature. In the LSTM layer, a Bi-LSTM is used in each channel to better capture relation information. Max pooling is then used to select the optimal features from the entire encoded sequence. Finally, we concatenate the outputs of all channels and feed them to the softmax layer for relation identification. RESULTS: To the best of our knowledge, our model achieves new state-of-the-art performance, with an F-score of 72.0% on the DDIExtraction 2013 corpus. Moreover, our approach obtains a much higher Recall value than existing methods. CONCLUSIONS: The dependency-based Bi-LSTM model can learn effective relation information with less feature engineering in the task of DDI extraction. In addition, the experimental results show that our model excels at balancing the Precision and Recall values.
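
The Precision, Recall and F-score reported in this abstract are related by a simple formula. The sketch below assumes the balanced F1 variant (the abstract does not state which variant is used) and uses hypothetical counts rather than results from the paper.

```python
def precision_recall_f1(true_positives: int, false_positives: int, false_negatives: int):
    """Return (precision, recall, F1) from raw counts, guarding against empty denominators."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts (assumption, not results from the paper).
precision, recall, f1 = precision_recall_f1(true_positives=72, false_positives=28, false_negatives=28)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

Because the harmonic mean is pulled towards the smaller of its two inputs, a model scores well on F1 only when Precision and Recall are both reasonably high, which is why balancing the two matters for the reported F-score.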


Subjects
Algorithms , Drug Interactions , Neural Networks, Computer , Databases as Topic , Statistics as Topic