Results 1 - 20 of 246
1.
J Chem Inf Model ; 2024 Jun 28.
Article in English | MEDLINE | ID: mdl-38940765

ABSTRACT

Computer-assisted synthesis planning has become increasingly important in drug discovery. While deep-learning models have shown remarkable progress in achieving high accuracies for single-step retrosynthetic predictions, their performances in retrosynthetic route planning need to be checked. This study compares the intricate single-step models with a straightforward template enumeration approach for retrosynthetic route planning on a real-world drug molecule data set. Despite the superior single-step accuracy of advanced models, the template enumeration method with a heuristic-based retrosynthesis knowledge score was found to surpass them in efficiency in searching the reaction space, achieving a higher or comparable solve rate within the same time frame. This counterintuitive result underscores the importance of efficiency and retrosynthesis knowledge in retrosynthesis route planning and suggests that future research should incorporate a simple template enumeration as a benchmark. It also suggests that this simple yet effective strategy should be considered alongside more complex models to better cater to the practical needs of computer-assisted synthesis planning in drug discovery.
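
The route-planning idea in this abstract can be illustrated with a toy search. The sketch below is not the authors' method: it assumes molecules are plain string labels, that a "template" is just a lookup from a product to candidate reactant sets, and that a route is solved once every leaf is in a stock of building blocks. The names `TEMPLATES`, `BUILDING_BLOCKS`, and `plan_route` are hypothetical, and the heuristic retrosynthesis knowledge score mentioned in the abstract is omitted.

```python
from collections import deque

# Hypothetical toy data: each product maps to one or more reactant sets.
# A real system would apply reaction templates to molecular structures.
TEMPLATES = {
    "amide_AB": [("acid_A", "amine_B")],
    "acid_A": [("ester_A",), ("nitrile_A",)],
    "amine_B": [("nitro_B",)],
}
BUILDING_BLOCKS = {"ester_A", "nitro_B"}


def plan_route(target, templates, stock, max_depth=5):
    """Breadth-first enumeration of template applications.

    Returns a list of (product, reactants) steps, or None if no route
    is found within max_depth steps.
    """
    queue = deque([((target,), [])])  # (unsolved molecules, steps so far)
    while queue:
        open_mols, steps = queue.popleft()
        # Molecules already in stock need no further disconnection.
        open_mols = tuple(m for m in open_mols if m not in stock)
        if not open_mols:
            return steps
        if len(steps) >= max_depth:
            continue
        mol, rest = open_mols[0], open_mols[1:]
        # Enumerate every template that produces this molecule.
        for reactants in templates.get(mol, []):
            queue.append((rest + reactants, steps + [(mol, reactants)]))
    return None


route = plan_route("amide_AB", TEMPLATES, BUILDING_BLOCKS)
```

Breadth-first order is chosen here only for simplicity; a scored or best-first expansion would replace the plain queue with a priority queue.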

2.
Chem Sci ; 15(21): 7926-7942, 2024 May 29.
Article in English | MEDLINE | ID: mdl-38817560

ABSTRACT

Molecular docking, a key technique in structure-based drug design, plays pivotal roles in protein-ligand interaction modeling, hit identification and optimization, in which accurate prediction of the protein-ligand binding mode is essential. Conventional docking approaches perform well in redocking tasks with a known protein binding pocket conformation in the complex state. However, in real-world docking scenarios where the protein binding conformation for a new ligand is unknown, accurately modeling the binding complex structure remains challenging, as flexible docking is computationally expensive and inaccurate. Typical deep learning-based docking methods do not explicitly consider protein side chain conformations and fail to ensure physical plausibility and detailed atomic interactions. In this study, we present DiffBindFR, a full-atom diffusion-based flexible docking model that operates over the product space of ligand overall movements and flexibility and pocket side chain torsion changes. We show that DiffBindFR has higher accuracy in producing native-like binding structures with physically plausible and detailed interactions than available docking methods. Furthermore, on Apo and AlphaFold2-modeled structures, DiffBindFR shows clear advantages in accurate ligand binding pose and protein binding conformation prediction, making it suitable for Apo and AlphaFold2 structure-based drug design. DiffBindFR provides a powerful flexible docking tool for modeling accurate protein-ligand binding structures.

3.
Cell Chem Biol ; 31(3): 452-464.e10, 2024 Mar 21.
Article in English | MEDLINE | ID: mdl-37913771

ABSTRACT

Various biological agents have been developed to target tumor necrosis factor alpha (TNF-α) and its receptor TNFR1 for the treatment of rheumatoid arthritis (RA), whereas small molecules modulating such cytokine receptors are rarely reported in comparison to the biologicals. Here, by revealing the mechanism of action of vinigrol, a diterpenoid natural product, we show that inhibition of the protein disulfide isomerase (PDI, PDIA1) by small molecules activates A disintegrin and metalloprotease 17 (ADAM17) and then leads to TNFR1 shedding on mouse and human cell membranes. This small-molecule-induced receptor shedding not only effectively blocks the inflammatory response caused by TNF-α in cells, but also reduces the arthritic score and joint damage in the collagen-induced arthritis mouse model. Our study indicates that targeting the PDI-ADAM17 signaling module to regulate the shedding of cytokine receptors by a chemical approach constitutes a promising strategy for alleviating RA.


Subjects
Arthritis, Rheumatoid , Diterpenes , Mice , Humans , Animals , Receptors, Tumor Necrosis Factor, Type I/metabolism , Tumor Necrosis Factor-alpha/metabolism , Proteomics , Arthritis, Rheumatoid/drug therapy , ADAM17 Protein
4.
Bioorg Med Chem Lett ; 97: 129547, 2024 01 01.
Article in English | MEDLINE | ID: mdl-37944867

ABSTRACT

COVID-19, caused by SARS-CoV-2, has led to a global pandemic that continues to impact societies and economies worldwide. The main protease (Mpro) plays a crucial role in SARS-CoV-2 replication and is an attractive target for anti-SARS-CoV-2 drug discovery. Herein, we report a series of 3-oxo-1,2,3,4-tetrahydropyrido[1,2-a]pyrazin derivatives as non-peptidomimetic inhibitors targeting SARS-CoV-2 Mpro through structure-based virtual screening and biological evaluation. Further similarity search and structure-activity relationship study led to the identification of compound M56-S2 with an enzymatic IC50 value of 4.0 µM. Moreover, the molecular simulation and predicted ADMET properties indicated that the non-peptidomimetic inhibitor M56-S2 might serve as a useful starting point for the further discovery of highly potent inhibitors targeting SARS-CoV-2 Mpro.


Subjects
COVID-19 , Pyrazines , SARS-CoV-2 , Humans , Antiviral Agents/pharmacology , COVID-19/prevention & control , Molecular Docking Simulation , Molecular Dynamics Simulation , Protease Inhibitors/pharmacology , SARS-CoV-2/drug effects , Viral Nonstructural Proteins , Pyrazines/chemistry , Pyrazines/pharmacology , COVID-19 Drug Treatment
5.
Molecules ; 28(17)2023 Sep 04.
Article in English | MEDLINE | ID: mdl-37687259

ABSTRACT

Although loop epitopes at protein-protein binding interfaces often play key roles in mediating oligomer formation and interaction specificity, their binding sites are underexplored as drug targets owing to their high flexibility, relatively few hot spots, and solvent accessibility. Prior attempts to develop molecules that mimic loop epitopes to disrupt protein oligomers have had limited success. In this study, we used structure-based approaches to design and optimize cyclic-constrained peptides based on loop epitopes at the human phosphoglycerate dehydrogenase (PHGDH) dimer interface, which is an obligate homo-dimer with activity strongly dependent on the oligomeric state. The experimental validations showed that these cyclic peptides inhibit PHGDH activity by directly binding to the dimer interface and disrupting the obligate homo-oligomer formation. Our results demonstrate that loop epitope derived cyclic peptides with rationally designed affinity-enhancing substitutions can modulate obligate protein homo-oligomers, which can be used to design peptide inhibitors for other seemingly intractable oligomeric proteins.


Subjects
Dermatitis , Phosphoglycerate Dehydrogenase , Humans , Phosphoglycerate Dehydrogenase/genetics , Peptides, Cyclic/pharmacology , Binding Sites , Epitopes , Polymers
6.
Nat Commun ; 14(1): 5203, 2023 08 25.
Article in English | MEDLINE | ID: mdl-37626077

ABSTRACT

Intrinsically disordered proteins (IDPs) play crucial roles in cellular processes and hold promise as drug targets. However, the dynamic nature of IDPs remains poorly understood. Here, we construct a single-molecule electrical nanocircuit based on silicon nanowire field-effect transistors (SiNW-FETs) and functionalize it with an individual disordered c-Myc bHLH-LZ domain to enable label-free, in situ, and long-term measurements at the single-molecule level. We use the device to study c-Myc interaction with Max and/or small molecule inhibitors. We observe the self-folding/unfolding process of c-Myc and reveal its interaction mechanism with Max and inhibitors through ultrasensitive real-time monitoring. We capture a relatively stable encounter intermediate ensemble of c-Myc during its transition from the unbound state to the fully folded state. The c-Myc/Max and c-Myc/inhibitor dissociation constants derived are consistent with other ensemble experiments. These proof-of-concept results provide an understanding of the IDP-binding/folding mechanism and represent a promising nanotechnology for IDP conformation/interaction studies and drug discovery.


Subjects
Drug Delivery Systems , Intrinsically Disordered Proteins/chemistry , Models, Molecular , Protein Structure, Tertiary , Proto-Oncogene Proteins c-myc/antagonists & inhibitors , Proto-Oncogene Proteins c-myc/chemistry , Protein Binding
7.
Nat Commun ; 14(1): 3864, 2023 06 30.
Article in English | MEDLINE | ID: mdl-37391417

ABSTRACT

The eukaryotic single-stranded DNA (ssDNA)-binding protein Replication Protein A (RPA) plays a crucial role in various DNA metabolic pathways, including DNA replication and repair, by dynamically associating with ssDNA. While the binding of a single RPA molecule to ssDNA has been thoroughly studied, the accessibility of ssDNA is largely governed by the bimolecular behavior of RPA, the biophysical nature of which remains unclear. In this study, we develop a three-step low-complexity ssDNA Curtains method, which, when combined with biochemical assays and a Markov chain model in non-equilibrium physics, allows us to decipher the dynamics of multiple RPA molecules binding to long ssDNA. Interestingly, our results suggest that Rad52, the mediator protein, can modulate the ssDNA accessibility of Rad51, which is nucleated on RPA-coated ssDNA through dynamic ssDNA exposure between neighboring RPA molecules. We find that this process is controlled by the shifting between the protection mode and action mode of RPA ssDNA binding, where tighter RPA spacing and lower ssDNA accessibility are favored under RPA protection mode, which can be facilitated by the Rfa2 WH domain and inhibited by Rad52-RPA interaction.


Subjects
DNA, Single-Stranded , Rad51 Recombinase , Replication Protein A , DNA, Single-Stranded/genetics , DNA-Binding Proteins/genetics , Replication Protein A/genetics , Rad51 Recombinase/genetics
8.
Bioinformatics ; 39(7)2023 07 01.
Article in English | MEDLINE | ID: mdl-37369035

ABSTRACT

MOTIVATION: In recent years, high-throughput sequencing technologies have made large-scale protein sequences accessible. However, their functional annotations usually rely on low-throughput and costly experimental studies. Computational prediction models offer a promising alternative to accelerate this process. Graph neural networks have shown significant progress in protein research, but capturing long-distance structural correlations and identifying key residues in protein graphs remains challenging. RESULTS: In the present study, we propose a novel deep learning model named Hierarchical graph transformEr with contrAstive Learning (HEAL) for protein function prediction. The core feature of HEAL is its ability to capture structural semantics using a hierarchical graph Transformer, which introduces a range of super-nodes mimicking functional motifs to interact with nodes in the protein graph. These semantic-aware super-node embeddings are then aggregated with varying emphasis to produce a graph representation. To optimize the network, we utilized graph contrastive learning as a regularization technique to maximize the similarity between different views of the graph representation. Evaluation on the PDBch test set shows that HEAL-PDB, trained on less data, achieves comparable performance to the recent state-of-the-art methods, such as DeepFRI. Moreover, HEAL, with the added benefit of unresolved protein structures predicted by AlphaFold2, outperforms DeepFRI by a significant margin on Fmax, AUPR, and Smin metrics on the PDBch test set. Additionally, when there are no experimentally resolved structures available for the proteins of interest, HEAL can still achieve better performance on the AFch test set than DeepFRI and DeepGOPlus by taking advantage of AlphaFold2 predicted structures. Finally, HEAL is capable of finding functional sites through class activation mapping.
AVAILABILITY AND IMPLEMENTATION: An implementation of HEAL can be found at https://github.com/ZhonghuiGu/HEAL.
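
The abstract above reports Fmax without defining it. A minimal sketch of the protein-centric Fmax, as it is commonly defined in CAFA-style function-prediction evaluations, is given below; whether HEAL's evaluation uses exactly this variant is an assumption, and the function name `fmax` and its input layout are illustrative only.

```python
def fmax(y_true, y_score, thresholds=None):
    """Protein-centric Fmax.

    y_true:  list of sets, the true function terms of each protein.
    y_score: list of dicts mapping term -> predicted score in [0, 1].
    """
    if thresholds is None:
        thresholds = [i / 100 for i in range(1, 100)]
    best = 0.0
    for t in thresholds:
        precisions, recalls = [], []
        for truth, scores in zip(y_true, y_score):
            predicted = {term for term, s in scores.items() if s >= t}
            hits = len(predicted & truth)
            # Precision is averaged only over proteins with a prediction at t.
            if predicted:
                precisions.append(hits / len(predicted))
            # Recall is averaged over all annotated proteins.
            if truth:
                recalls.append(hits / len(truth))
        if not precisions or not recalls:
            continue
        pr = sum(precisions) / len(precisions)
        rc = sum(recalls) / len(recalls)
        if pr + rc > 0:
            best = max(best, 2 * pr * rc / (pr + rc))
    return best


perfect = fmax(
    [{"a"}, {"b"}],
    [{"a": 0.9, "b": 0.1}, {"b": 0.8, "a": 0.2}],
)
```

The score is the best harmonic mean of averaged precision and recall over all decision thresholds, so it rewards a model that has at least one threshold at which both are high.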


Subjects
Benchmarking , High-Throughput Nucleotide Sequencing , Amino Acid Sequence , Neural Networks, Computer , Semantics
9.
J Mol Biol ; 435(14): 168141, 2023 07 15.
Article in English | MEDLINE | ID: mdl-37356903

ABSTRACT

Ligand binding sites provide essential information for uncovering protein functions and structure-based drug discovery. To facilitate the cavity detection and property analysis process, we developed a comprehensive web server, CavityPlus, in 2018. CavityPlus applies the CAVITY program to detect potential binding sites in a given protein structure. The CavPharmer, CorrSite, and CovCys tools can then be applied to generate receptor-based pharmacophore models, identify potential allosteric sites, or detect druggable cysteine residues for covalent drug design. While CavityPlus has been widely used, the constantly evolving knowledge and methods make it necessary to improve and extend its functions. This study presents a new version of CavityPlus, CavityPlus 2022, with a series of upgrades. We upgraded the CAVITY tool to greatly speed up cavity detection calculation. We optimized the CavPharmer tool for higher speed and more accurate results. We integrated the newly developed CorrSite2.0 into the CavityPlus 2022 web server for its improved performance of allosteric site prediction. We also added a new CavityMatch module for drug repurposing and protein function studies by searching similar cavities to a given cavity from pre-constructed cavity databases. The new version of CavityPlus is freely available at http://pkumdl.cn:8000/cavityplus/.


Subjects
Databases, Protein , Proteins , Software , Allosteric Site , Binding Sites , Internet , Ligands , Protein Conformation , Proteins/chemistry
10.
Acta Biochim Biophys Sin (Shanghai) ; 55(7): 1075-1083, 2023 Jun 12.
Article in English | MEDLINE | ID: mdl-37294104

ABSTRACT

Biomolecular condensates formed by phase separation are involved in many cellular processes. Dysfunctional or abnormal condensates are closely associated with neurodegenerative diseases, cancer and other diseases. Small molecules can effectively regulate protein phase separation by modulating the formation, dissociation, size and material properties of condensates. Discovery of small molecules to regulate protein phase separation provides chemical probes for deciphering the underlying mechanism and potential novel treatments for condensate-related diseases. Here we review the advances in small molecule regulation of phase separation. The discovery and chemical structures of recently found small molecule phase separation regulators, and how they modulate biological condensates, are summarized and discussed. Possible strategies to accelerate the discovery of more liquid-liquid phase separation (LLPS)-regulating small molecules are proposed.

11.
ACS Cent Sci ; 9(5): 861-863, 2023 May 24.
Article in English | MEDLINE | ID: mdl-37252366
12.
J Virol ; 97(4): e0182922, 2023 04 27.
Article in English | MEDLINE | ID: mdl-36943056

ABSTRACT

Spring viremia of carp virus (SVCV) is a highly pathogenic Vesiculovirus infecting the common carp, yet neither a vaccine nor effective therapies are available to treat spring viremia of carp (SVC). Like all negative-sense viruses, SVCV contains an RNA genome that is encapsidated by the nucleoprotein (N) in the form of a ribonucleoprotein (RNP) complex, which serves as the template for viral replication and transcription. Here, the three-dimensional (3D) structure of SVCV RNP was resolved through cryo-electron microscopy (cryo-EM) at a resolution of 3.7 Å. RNP assembly was stabilized by N and C loops; RNA was wrapped in the groove between the N and C lobes with 9 nucleotides (nt) per protomer. Combined with mutational analysis, our results elucidated the mechanism of RNP formation. The RNA binding groove of SVCV N was used as a target for drug virtual screening, and suramin was found to have a good antiviral effect. This study provided insights into RNP assembly, and anti-SVCV drug screening was performed on the basis of this structure, providing a theoretical basis and efficient drug screening method for the prevention and treatment of SVC. IMPORTANCE Aquaculture accounts for about 70% of global aquatic products, and viral diseases severely harm the development of the aquaculture industry. Spring viremia of carp virus (SVCV) is the pathogen causing highly contagious spring viremia of carp (SVC) disease in cyprinids, especially common carp (Cyprinus carpio), yet neither a vaccine nor effective therapies are available to treat this disease. In this study, we have elucidated the mechanism of SVCV ribonucleoprotein complex (RNP) formation by resolving the 3D structure of SVCV RNP and screened antiviral drugs based on the structure. We found that suramin can competitively bind to the RNA binding groove and has good antiviral effects both in vivo and in vitro. Our study provides a template for rational drug discovery efforts to treat and prevent SVCV infections.


Subjects
Models, Molecular , Rhabdoviridae , Ribonucleoproteins , Viral Proteins , Ribonucleoproteins/chemistry , Ribonucleoproteins/metabolism , Rhabdoviridae/chemistry , Rhabdoviridae/drug effects , Viral Proteins/chemistry , Viral Proteins/metabolism , Protein Structure, Quaternary , Antiviral Agents/pharmacology , Drug Evaluation, Preclinical , Cryoelectron Microscopy , Suramin/pharmacology
13.
J Chem Phys ; 158(10): 105102, 2023 Mar 14.
Article in English | MEDLINE | ID: mdl-36922138

ABSTRACT

Allostery is an important regulatory mechanism of protein functions. Among allosteric proteins, certain protein structure types are observed more frequently. However, how allosteric regulation depends on protein topology remains elusive. In this study, we extracted protein topology graphs at the fold level and found that known allosteric proteins mainly contain multiple domains or subunits and allosteric sites reside more often between two or more domains of the same fold type. Only a small fraction of fold-fold combinations are observed in allosteric proteins, and homo-fold-fold combinations dominate. These analyses imply that the locations of allosteric sites including cryptic ones depend on protein topology. We further developed TopoAlloSite, a novel method that uses the kernel support vector machine to predict the location of allosteric sites on the overall protein topology based on the subgraph-matching kernel. TopoAlloSite successfully predicted known cryptic allosteric sites in several allosteric proteins like phosphopantothenoylcysteine synthetase, spermidine synthase, and sirtuin 6, demonstrating its power in identifying cryptic allosteric sites without performing long molecular dynamics simulations or large-scale experimental screening. Our study demonstrates that protein topology largely determines how its function can be allosterically regulated, which can be used to find new druggable targets and locate potential binding sites for rational allosteric drug design.


Subjects
Molecular Dynamics Simulation , Proteins , Allosteric Regulation , Proteins/chemistry , Allosteric Site , Binding Sites , Protein Binding
14.
Elife ; 122023 02 17.
Article in English | MEDLINE | ID: mdl-36799896

ABSTRACT

Allostery is fundamental to many biological processes. Because allosteric regulation acts at a distance, how allosteric mutations, modifications, and effector binding impact protein function is difficult to forecast. In protein engineering, remote mutations cannot be rationally designed without large-scale experimental screening. Allosteric drugs have attracted much attention due to their high specificity and possibility of overcoming existing drug-resistant mutations. However, optimization of allosteric compounds remains challenging. Here, we developed a novel computational method, KeyAlloSite, to predict allosteric sites and to identify key allosteric residues (allo-residues) based on the evolutionary coupling model. We found that protein allosteric sites are more strongly coupled to the orthosteric site than non-functional sites are. We further inferred key allo-residues by pairwise comparison of the evolutionary coupling scores of each residue in the allosteric pocket with the functional site. Our predicted key allo-residues are in accordance with previous experimental studies for typical allosteric proteins like BCR-ABL1, Tar, and PDZ3, as well as key cancer mutations. We also showed that KeyAlloSite can be used to predict key allosteric residues distant from the catalytic site that are important for enzyme catalysis. Our study demonstrates that weak coevolutionary couplings contain important information about protein allosteric regulation. KeyAlloSite can be applied in studying the evolution of protein allosteric regulation, designing and optimizing allosteric drugs, and performing functional protein design and enzyme engineering.


Subjects
Proteins , Proteins/metabolism , Allosteric Site , Allosteric Regulation/genetics , Catalytic Domain
15.
Protein Sci ; 32(2): e4555, 2023 02.
Article in English | MEDLINE | ID: mdl-36564866

ABSTRACT

The development of efficient computational methods for drug target protein identification can compensate for the high cost of experiments and is therefore of great significance for drug development. However, existing structure-based drug target protein-identification algorithms are limited by the insufficient number of proteins with experimentally resolved structures. Moreover, sequence-based algorithms cannot effectively extract information from protein sequences and thus display insufficient accuracy. Here, we combined the sequence-based self-supervised pretraining protein language model ESM1b with a graph convolutional neural network classifier to develop an improved, sequence-based drug target protein identification method. This complete model, named QuoteTarget, efficiently encodes proteins based on sequence information alone and achieves an accuracy of 95% with the nonredundant drug target and nondrug target datasets constructed for this study. When applied to all proteins from Homo sapiens, QuoteTarget identified 1213 potential undeveloped drug target proteins. We further inferred residue-binding weights from the well-trained network using the gradient-weighted class activation mapping (Grad-Cam) algorithm. Notably, we found that without any binding site information input, significant residues inferred by the model closely match the experimentally confirmed drug molecule-binding sites. Thus, our work provides a highly effective sequence-based identifier for drug target proteins, as well as new insights into recognizing drug molecule-binding sites. The entire model is available at https://github.com/Chenjxjx/drug-target-prediction.
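
The residue-weight step mentioned in this abstract can be sketched from the general Grad-CAM recipe: average the gradient of the class score over all residues to get one weight per feature channel, form the weighted sum of the channel activations at each residue, and clip negatives to zero. The sketch below assumes the activations and gradients have already been extracted from a trained network (in practice via a framework's autograd) and are supplied as plain nested lists; the function name `grad_cam` and the final rescaling to [0, 1] are illustrative choices, not taken from the QuoteTarget code.

```python
def grad_cam(features, gradients):
    """Per-residue Grad-CAM weights.

    features[n][k]:  activation of channel k at residue n.
    gradients[n][k]: d(class score) / d(features[n][k]).
    Returns one non-negative weight per residue, scaled to [0, 1].
    """
    n_res = len(features)
    n_ch = len(features[0])
    # Channel importance: gradient averaged over all residues.
    alphas = [
        sum(gradients[n][k] for n in range(n_res)) / n_res
        for k in range(n_ch)
    ]
    # Weighted sum of channels per residue, followed by ReLU.
    raw = [
        max(0.0, sum(alphas[k] * features[n][k] for k in range(n_ch)))
        for n in range(n_res)
    ]
    peak = max(raw)
    return [r / peak if peak > 0 else 0.0 for r in raw]


weights = grad_cam(
    [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]],
    [[1.0, -1.0], [1.0, -1.0], [1.0, -1.0]],
)
```

Residues with the largest weights are the ones the classifier relied on most, which is what allows a comparison against known binding sites.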


Subjects
Neural Networks, Computer , Proteins , Humans , Proteins/chemistry , Algorithms , Binding Sites , Amino Acid Sequence
16.
Methods Mol Biol ; 2563: 215-223, 2023.
Article in English | MEDLINE | ID: mdl-36227475

ABSTRACT

Liquid-liquid phase separation (LLPS) often induces the formation of biomolecular condensates at the cellular level. The importance of this phenomenon has been demonstrated in many important biological functions, such as transcription. However, the biophysical nature of LLPS containing transcriptional machinery has not yet been carefully examined. Here, we give an overview of a novel high-throughput single-molecule technique, termed DNA Curtains, which was recently established to dissect the DNA compaction process in real time. The experimental procedures are further discussed in detail in the context of the biomolecular condensates of a transcription repressor.


Subjects
DNA , Single Molecule Imaging
17.
Med Rev (2021) ; 3(6): 487-510, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38282798

ABSTRACT

Proteins function as integral actors in essential life processes, rendering the realm of protein research a fundamental domain that possesses the potential to propel advancements in pharmaceuticals and disease investigation. Within the context of protein research, an imperious demand arises to uncover protein functionalities and untangle intricate mechanistic underpinnings. Due to the exorbitant costs and limited throughput inherent in experimental investigations, computational models offer a promising alternative to accelerate protein function annotation. In recent years, protein pre-training models have exhibited noteworthy advancement across multiple prediction tasks. This advancement highlights a notable prospect for effectively tackling the intricate downstream task associated with protein function prediction. In this review, we elucidate the historical evolution and research paradigms of computational methods for predicting protein function. Subsequently, we summarize the progress in protein and molecule representation as well as feature extraction techniques. Furthermore, we assess the performance of machine learning-based algorithms across various objectives in protein function prediction, thereby offering a comprehensive perspective on the progress within this field.

18.
Eur J Med Chem ; 244: 114803, 2022 Dec 15.
Article in English | MEDLINE | ID: mdl-36209629

ABSTRACT

SARS-CoV-2 3CL protease is one of the key targets for drug development against COVID-19. Most known SARS-CoV-2 3CL protease inhibitors act by covalently binding to the active site cysteine. Yet, computational screens against this enzyme were mainly focused on non-covalent inhibitor discovery. Here, we developed a deep learning-based stepwise strategy for selective covalent inhibitor screening. We used a deep learning framework that integrated a directed message passing neural network with a feed-forward neural network to construct two different classifiers for either covalent or non-covalent inhibition activity prediction. These two classifiers were trained on covalent and non-covalent 3CL protease inhibitor datasets, respectively, and achieved high prediction accuracy. We then successively applied the covalent inhibitor model and the non-covalent inhibitor model to screen a chemical library containing compounds with covalent warheads targeting cysteine. We experimentally tested the inhibition activity of 32 top-ranking compounds and 12 of them were active, among which 6 showed IC50 values less than 12 µM and the strongest one inhibited SARS-CoV-2 3CL protease with an IC50 of 1.4 µM. Further investigation demonstrated that 5 of the 6 active compounds showed typical covalent inhibition behavior with time-dependent activity. These new covalent inhibitors provide novel scaffolds for developing highly active SARS-CoV-2 3CL covalent inhibitors.


Subjects
COVID-19 Drug Treatment , Deep Learning , Humans , SARS-CoV-2 , Protease Inhibitors/pharmacology , Protease Inhibitors/chemistry , Coronavirus 3C Proteases , Cysteine , Antiviral Agents/pharmacology
19.
Protein Sci ; 31(12): e4484, 2022 12.
Article in English | MEDLINE | ID: mdl-36309961

ABSTRACT

Atomic interactions play essential roles in protein folding, structure stabilization, and function performance. Recent advances in deep learning-based methods have achieved impressive success not only in protein structure prediction, but also in protein sequence design. However, highly efficient and accurate protein side-chain prediction methods that can give detailed atomic interactions are still lacking. In the present study, we developed a deep learning-based method, GeoPacker, that uses geometric deep learning coupled with ResNet for protein side-chain modeling. GeoPacker explicitly represents atomic interactions with rotational and translational invariance for information extraction of relative locations. GeoPacker outperformed the state-of-the-art energy function-based methods in side-chain structure prediction accuracy and runs about 10 and 700 times faster than the deep learning-based methods DLPacker and OPUS-rota4, respectively, with comparable prediction accuracy. The performance of GeoPacker does not depend on the secondary structures that the residues belong to. GeoPacker gives highly accurate predictions for buried residues in the protein core as well as at protein-protein interfaces, making it a useful tool for protein structure modeling and for protein and interaction design.


Subjects
Deep Learning , Algorithms , Proteins/chemistry , Protein Structure, Secondary , Amino Acid Sequence , Protein Conformation
20.
J Chem Inf Model ; 62(22): 5321-5328, 2022 11 28.
Article in English | MEDLINE | ID: mdl-36108142

ABSTRACT

Molecular structures are commonly depicted in 2D printed forms in scientific documents such as journal papers and patents. However, these 2D depictions are not machine readable. Due to a backlog of decades and an increasing amount of printed literature, there is a high demand for translating printed depictions into machine-readable formats, which is known as Optical Chemical Structure Recognition (OCSR). Most OCSR systems developed over the last three decades use a rule-based approach, which vectorizes the depiction based on the interpretation of vectors and nodes as bonds and atoms. Here, we present a practical software tool called MolMiner, which is primarily built using deep neural networks originally developed for semantic segmentation and object detection to recognize atom and bond elements from documents. These recognized elements can be easily connected as a molecular graph with a distance-based construction algorithm. MolMiner gave state-of-the-art performance on four benchmark data sets and a self-collected external data set from scientific papers. As MolMiner performed similarly well in real-world OCSR tasks with a user-friendly interface, it is a useful and valuable tool for daily applications. Free download links for the Mac and Windows versions are available at https://github.com/iipharma/pharmamind-molminer.
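
The abstract does not spell out the distance-based construction step, so the following is only one plausible reading, sketched under stated assumptions: each detected bond is reduced to its center point and attached to the two nearest detected atoms. The function name `build_graph`, the tuple layouts, and the nearest-two-atoms rule are all hypothetical and should not be taken as MolMiner's actual algorithm.

```python
import math


def build_graph(atoms, bonds):
    """Connect detected bonds to their two nearest detected atoms.

    atoms: list of (symbol, x, y) atom detections (at least two).
    bonds: list of (order, x, y) bond detections, given by bond center.
    Returns a list of (atom_index_i, atom_index_j, order) edges.
    """
    edges = []
    for order, bx, by in bonds:
        # Rank atom indices by Euclidean distance to the bond center.
        ranked = sorted(
            range(len(atoms)),
            key=lambda i: math.hypot(atoms[i][1] - bx, atoms[i][2] - by),
        )
        i, j = sorted(ranked[:2])
        edges.append((i, j, order))
    return edges


# Toy detections for a C-C=O fragment laid out on a horizontal line.
edges = build_graph(
    [("C", 0, 0), ("C", 10, 0), ("O", 20, 0)],
    [(1, 5, 0), (2, 15, 0)],
)
```

A production system would also need to handle missed or spurious detections, for example by rejecting bonds whose nearest atoms are implausibly far away.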


Subjects
Algorithms , Software , Molecular Structure , Neural Networks, Computer