Search | BVS Regional Portal (test)

SMAT: An attention-based deep learning solution to the automation of schema matching.

Zhang, Jing; Shin, Bonggun; Choi, Jinho D; Ho, Joyce C.

Adv Databases Inf Syst ; 12843: 260-274, 2021 Aug.

Article in English | MEDLINE | ID: mdl-34608464

ABSTRACT

Schema matching aims to identify the correspondences among attributes of database schemas. It is frequently considered the most challenging and decisive stage in many contemporary web semantics and database systems. Low-quality algorithmic matchers fail to provide improvement, while manual annotation consumes extensive human effort. Further complications arise from data privacy in certain domains such as healthcare, where only schema-level matching should be used to prevent data leakage. For this problem, we propose SMAT, a new deep learning model based on state-of-the-art natural language processing techniques that obtains semantic mappings between source and target schemas using only the attribute name and description. SMAT avoids directly encoding domain knowledge about the source and target systems, which allows it to be more easily deployed across different sites. We also introduce a new benchmark dataset, OMAP, based on real-world schema-level mappings from the healthcare domain. Our extensive evaluation on various benchmark datasets demonstrates the potential of SMAT to help automate schema-level matching tasks.
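To make the schema-level setting concrete, the following is a minimal sketch of a naive lexical baseline, not SMAT itself. It pairs each source attribute with the target attribute whose name and description share the most tokens (Jaccard similarity), reading no row-level data. The function names, attribute names, and descriptions are hypothetical illustrations and are not taken from the paper or from OMAP.

```python
import re


def tokens(text):
    """Lowercase a string and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity between two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_schemas(source, target):
    """Map each source attribute to the most similar target attribute.

    `source` and `target` map attribute names to free-text descriptions.
    Only schema-level text (name + description) is compared.
    """
    mapping = {}
    for s_name, s_desc in source.items():
        s_tok = tokens(s_name + " " + s_desc)
        best = max(
            target,
            key=lambda t_name: jaccard(s_tok, tokens(t_name + " " + target[t_name])),
        )
        mapping[s_name] = best
    return mapping


# Hypothetical example schemas (assumption, for illustration only).
source = {
    "pat_dob": "date of birth of the patient",
    "dx_code": "diagnosis code assigned at the visit",
}
target = {
    "birth_datetime": "date and time of birth of the person",
    "condition_concept": "code for the diagnosis or condition",
}
result = match_schemas(source, target)
print(result)
```

A token-overlap baseline like this misses synonyms and abbreviations, which is the kind of gap a learned semantic model is meant to close.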

Controlled Molecule Generator for Optimizing Multiple Chemical Properties.

Shin, Bonggun; Park, Sungsoo; Bak, JinYeong; Ho, Joyce C.

ACM CHIL 2021 (2021) ; 2021: 146-153, 2021 Apr.

Article in English | MEDLINE | ID: mdl-35194593

ABSTRACT

Generating a novel and optimized molecule with desired chemical properties is an essential part of the drug discovery process. Failure to meet even one of the required properties can frequently lead to a costly failure in clinical testing. In addition, optimizing multiple properties is challenging because optimizing one property is prone to changing the others. In this paper, we pose this multi-property optimization problem as a sequence translation process and propose a new optimized molecule generator model based on the Transformer with two constraint networks: property prediction and similarity prediction. We further improve the model by incorporating score predictions from these constraint networks in a modified beam search algorithm. The experiments demonstrate that our proposed model, Controlled Molecule Generator (CMG), outperforms state-of-the-art models by a significant margin when optimizing multiple properties simultaneously.
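The abstract does not say how the constraint-network scores enter the beam search, so the following is only a sketch under a stated assumption: candidates are ranked by cumulative log-probability plus a weighted constraint score. The toy next-token model, the constraint function, and the `^`/`$` start and end tokens are hypothetical stand-ins for trained networks, not CMG's actual components.

```python
import math


def constrained_beam_search(next_token_logprobs, constraint_score, start,
                            end_token, beam_width=2, max_len=5, weight=1.0):
    """Beam search whose ranking adds a constraint score to the log-probability.

    next_token_logprobs(seq) -> dict mapping next token to log-probability.
    constraint_score(seq)    -> float, higher means the constraint is better met.
    Both callables are placeholders for trained networks (assumption).
    """
    beams = [((start,), 0.0)]  # (sequence, cumulative log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, logp in beams:
            for tok, tok_logp in next_token_logprobs(seq).items():
                candidates.append((seq + (tok,), logp + tok_logp))
        # Rank by model log-probability plus the weighted constraint score.
        candidates.sort(
            key=lambda c: c[1] + weight * constraint_score(c[0]), reverse=True
        )
        beams = []
        for seq, logp in candidates[:beam_width]:
            (finished if seq[-1] == end_token else beams).append((seq, logp))
        if not beams:
            break
    pool = finished or beams
    return max(pool, key=lambda c: c[1] + weight * constraint_score(c[0]))[0]


def toy_model(seq):
    """Hypothetical next-token model: one atom, then the end token."""
    if seq[-1] == "^":
        return {"C": math.log(0.7), "O": math.log(0.3)}
    return {"$": 0.0}


def has_oxygen(seq):
    """Hypothetical constraint: reward sequences that contain an 'O' token."""
    return 1.0 if "O" in seq else 0.0


unconstrained = constrained_beam_search(
    toy_model, has_oxygen, "^", "$", beam_width=1, weight=0.0)
constrained = constrained_beam_search(
    toy_model, has_oxygen, "^", "$", beam_width=1, weight=2.0)
print(unconstrained, constrained)
```

With `weight=0.0` the search follows the model's most likely token; with `weight=2.0` the constraint outweighs the probability gap and steers the beam toward the less likely but constraint-satisfying sequence.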

Target-Centered Drug Repurposing Predictions of Human Angiotensin-Converting Enzyme 2 (ACE2) and Transmembrane Protease Serine Subtype 2 (TMPRSS2) Interacting Approved Drugs for Coronavirus Disease 2019 (COVID-19) Treatment through a Drug-Target Interaction Deep Learning Model.

Choi, Yoonjung; Shin, Bonggun; Kang, Keunsoo; Park, Sungsoo; Beck, Bo Ram.

Viruses ; 12(11), 2020 11 18.

Article in English | MEDLINE | ID: mdl-33218024

ABSTRACT

Previously, our group predicted commercially available Food and Drug Administration (FDA) approved drugs that can inhibit each step of the replication of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) using a deep learning-based drug-target interaction model called Molecule Transformer-Drug Target Interaction (MT-DTI). Unfortunately, few additional clinically significant treatment options have emerged since the approval of remdesivir. To overcome the current coronavirus disease 2019 (COVID-19) more efficiently, a treatment strategy that controls not only SARS-CoV-2 replication but also the host entry step should be considered. In this study, we used MT-DTI to predict FDA-approved drugs that may have strong affinities for the angiotensin-converting enzyme 2 (ACE2) receptor and the transmembrane protease serine 2 (TMPRSS2), both of which are essential for viral entry into the host cell. Of the 460 drugs with a Kd of less than 100 nM for the ACE2 receptor, 17 overlapped with drugs reported in the NCATS OpenData portal to inhibit the interaction of ACE2 and the SARS-CoV-2 spike. Among them, enalaprilat, an ACE inhibitor, showed a Kd value of 1.5 nM against ACE2. Furthermore, three of the top 30 drugs with strong affinity predictions for TMPRSS2 are anti-hepatitis C virus (HCV) drugs: ombitasvir, daclatasvir, and paritaprevir. Notably, of the top 30 drugs, the AT1R blocker eprosartan and the neuropsychiatric drug lisuride showed gene expression profiles similar to those of potential TMPRSS2 inhibitors. Collectively, we suggest that drugs predicted by the DTI model to have strong inhibitory potencies against ACE2 and TMPRSS2 should be considered as potential drug repurposing candidates for COVID-19.

Subjects

Angiotensin-Converting Enzyme 2/antagonists & inhibitors, COVID-19 Drug Treatment, Deep Learning, Drug Repositioning/methods, Serine Endopeptidases/metabolism, Angiotensin-Converting Enzyme 2/metabolism, Drug Development, Hepacivirus/drug effects, Humans, SARS-CoV-2/drug effects, Virus Internalization/drug effects, Virus Replication/drug effects

Predicting commercially available antiviral drugs that may act on the novel coronavirus (SARS-CoV-2) through a drug-target interaction deep learning model.

Beck, Bo Ram; Shin, Bonggun; Choi, Yoonjung; Park, Sungsoo; Kang, Keunsoo.

Comput Struct Biotechnol J ; 18: 784-790, 2020.

Article in English | MEDLINE | ID: mdl-32280433

ABSTRACT

Infection with a novel coronavirus found in Wuhan, China (SARS-CoV-2) is spreading rapidly, and the incidence rate is increasing worldwide. Due to the lack of effective treatment options for SARS-CoV-2, various strategies are being tested in China, including drug repurposing. In this study, we used our pre-trained deep learning-based drug-target interaction model called Molecule Transformer-Drug Target Interaction (MT-DTI) to identify commercially available drugs that could act on viral proteins of SARS-CoV-2. The results showed that atazanavir, an antiretroviral medication used to treat and prevent human immunodeficiency virus (HIV) infection, is the best chemical compound, showing an inhibitory potency with a Kd of 94.94 nM against the SARS-CoV-2 3C-like proteinase, followed by remdesivir (113.13 nM), efavirenz (199.17 nM), ritonavir (204.05 nM), and dolutegravir (336.91 nM). Interestingly, lopinavir, ritonavir, and darunavir are all designed to target viral proteinases. However, in our prediction, they may also bind to the replication complex components of SARS-CoV-2 with an inhibitory potency of Kd < 1000 nM. In addition, we found that several antiviral agents, such as Kaletra (lopinavir/ritonavir), could be used for the treatment of SARS-CoV-2. Overall, we suggest that the list of antiviral drugs identified by the MT-DTI model should be considered when establishing effective treatment strategies for SARS-CoV-2.

Cascaded Wx: A Novel Prognosis-Related Feature Selection Framework in Human Lung Adenocarcinoma Transcriptomes.

Shin, Bonggun; Park, Sungsoo; Hong, Ji Hyung; An, Ho Jung; Chun, Sang Hoon; Kang, Kilsoo; Ahn, Young-Ho; Ko, Yoon Ho; Kang, Keunsoo.

Front Genet ; 10: 662, 2019.

Article in English | MEDLINE | ID: mdl-31379926

ABSTRACT

Artificial neural network-based analysis has recently been used to predict clinical outcomes in patients with solid cancers, including lung cancer. However, the majority of algorithms were not originally developed to identify genes associated with patients' prognoses. To address this issue, we developed a novel prognosis-related feature selection framework called Cascaded Wx (CWx). The CWx framework ranks features according to the survival of a given cohort by training neural networks with three different high- and low-risk groups in a cascaded fashion. We showed that this approach accurately selected the features that best identify patients' prognoses, compared to other feature selection algorithms, including the Cox proportional hazards and Coxnet models, when applied to The Cancer Genome Atlas lung adenocarcinoma (LUAD) transcriptome data. The prognostic potential of the top 100 genes identified by CWx outperformed or was comparable to that of the genes identified by the other methods, as assessed by the concordance index (c-index). In addition, the top 100 genes identified by CWx were found to be associated with the Wnt signaling pathway, providing biologically relevant evidence for the value of these genes in predicting the prognosis of patients with LUAD. Further analyses of other cancer types showed that the genes identified by CWx had the highest prognostic values according to the c-index. Collectively, the CWx framework will potentially be of great use for prognosis-related biomarker discovery in a variety of diseases.
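For readers unfamiliar with the c-index used above, the following is a minimal sketch of Harrell's concordance index for right-censored survival data. It is a generic textbook formulation, not code from the paper; the function name and the toy cohort are assumptions for illustration.

```python
def concordance_index(times, events, risks):
    """Harrell's c-index for right-censored survival data.

    times  : observed follow-up times
    events : 1 if the event (e.g. death) was observed, 0 if censored
    risks  : predicted risk scores (higher = worse expected prognosis)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed event
            # strictly before subject j's observed time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # ties in predicted risk count half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort (hypothetical values): the third subject is censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.5, 0.1]
c = concordance_index(times, events, risks)
print(c)
```

In this toy cohort, five pairs are comparable and four are ordered correctly by the risk score, giving a c-index of 0.8. A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering.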

Wx: a neural network-based feature selection algorithm for transcriptomic data.

Park, Sungsoo; Shin, Bonggun; Sang Shim, Won; Choi, Yoonjung; Kang, Kilsoo; Kang, Keunsoo.

Sci Rep ; 9(1): 10500, 2019 07 19.

Article in English | MEDLINE | ID: mdl-31324856

ABSTRACT

Next-generation sequencing (NGS), which allows the simultaneous sequencing of billions of DNA fragments, has revolutionized how we study genomics and molecular biology by generating genome-wide molecular maps of molecules of interest. However, the amount of information produced by NGS has made it difficult for researchers to choose the optimal set of genes. We have sought to resolve this issue by developing a neural network-based feature (gene) selection algorithm called Wx. The Wx algorithm ranks genes based on the discriminative index (DI) score, which represents the classification power for distinguishing given groups. With a gene list ranked by DI score, researchers can intuitively select the optimal set of genes from the highest-ranking ones. We applied the Wx algorithm to a TCGA pan-cancer gene-expression cohort to identify an optimal set of gene-expression biomarker candidates that can distinguish cancer samples from normal samples for 12 different types of cancer. The 14 gene-expression biomarker candidates identified by Wx were comparable to or outperformed previously reported universal gene expression biomarkers, highlighting the usefulness of the Wx algorithm for next-generation sequencing data. Thus, we anticipate that the Wx algorithm can complement current state-of-the-art analytical applications as an alternative method for identifying biomarker candidates. The stand-alone and web versions of the Wx algorithm are available at https://github.com/deargen/DearWXpub and https://wx.deargendev.me/ , respectively.

Subjects

Algorithms, Neoplasm Genes, Computer Neural Networks, Transcriptome, Tumor Biomarkers, Datasets as Topic, Gene Expression Profiling, High-Throughput Nucleotide Sequencing, Humans, Neoplasms/genetics, Neoplasm RNA/genetics
