Results 1 - 20 of 88
1.
ACS Chem Biol ; 2024 Jun 27.
Article in English | MEDLINE | ID: mdl-38934237

ABSTRACT

TRIM7 is a ubiquitin E3 ligase with key regulatory functions, mediating viral infection, tumor biology, innate immunity, and cellular processes, such as autophagy and ferroptosis. It contains a PRYSPRY domain that specifically recognizes degron sequences containing C-terminal glutamine. Ligands that bind to the TRIM7 PRYSPRY domain may have applications in the treatment of viral infections, as modulators of inflammation, and in the design of a new class of PROTACs (PROteolysis TArgeting Chimeras) that mediate the selective degradation of therapeutically relevant proteins of interest (POIs). Here, we developed an assay toolbox for the comprehensive evaluation of TRIM7 ligands. Using TRIM7 degron sequences together with a structure-based design, we developed the first series of peptidomimetic ligands with low micromolar affinity. The terminal carboxylate moiety was required for ligand activity but prevented cell penetration. A prodrug strategy using an ethyl ester resulted in enhanced permeability, which was evaluated using confocal imaging.

2.
J Med Chem ; 2024 Jun 25.
Article in English | MEDLINE | ID: mdl-38917049

ABSTRACT

G protein-coupled receptor G2A was postulated to be a promising target for the development of new therapeutics in neuropathic pain, acute myeloid leukemia, and inflammation. However, there is still a lack of potent, selective, and drug-like G2A agonists to be used as a chemical tool or as the starting matter for the development of drugs. In this work, we present the discovery and structure-activity relationship elucidation of a new potent and selective G2A agonist scaffold. Systematic optimization resulted in (3-(pyridin-3-ylmethoxy)benzoyl)-d-phenylalanine (T-10418) exhibiting higher potency than the reference and natural ligand 9-HODE and high selectivity among G protein-coupled receptors. With its favorable activity, a clean selectivity profile, excellent solubility, and high metabolic stability, T-10418 qualifies as a pharmacological tool to investigate the effects of G2A activation.

3.
Science ; 384(6702): eadn6354, 2024 Jun 21.
Article in English | MEDLINE | ID: mdl-38753765

ABSTRACT

AlphaFold2 (AF2) models have had wide impact but mixed success in retrospective ligand recognition. We prospectively docked large libraries against unrefined AF2 models of the σ2 and serotonin 2A (5-HT2A) receptors, testing hundreds of new molecules and comparing results with those obtained from docking against the experimental structures. Hit rates were high and similar for the experimental and AF2 structures, as were affinities. Success in docking against the AF2 models was achieved despite differences between orthosteric residue conformations in the AF2 models and the experimental structures. Determination of the cryo-electron microscopy structure for one of the more potent 5-HT2A ligands from the AF2 docking revealed residue accommodations that resembled the AF2 prediction. AF2 models may sample conformations that differ from experimental structures but remain low energy and relevant for ligand discovery, extending the domain of structure-based drug design.


Subjects
Deep Learning; Drug Discovery; Molecular Docking Simulation; Receptor, Serotonin, 5-HT2A; Serotonin 5-HT2 Receptor Agonists; Serotonin 5-HT2 Receptor Antagonists; Humans; Cryoelectron Microscopy; Drug Design; Drug Discovery/methods; Ligands; Protein Conformation; Protein Folding; Receptor, Serotonin, 5-HT2A/chemistry; Receptor, Serotonin, 5-HT2A/ultrastructure; Receptors, sigma/chemistry; Receptors, sigma/metabolism; Small Molecule Libraries/chemistry; Serotonin 5-HT2 Receptor Agonists/chemistry; Serotonin 5-HT2 Receptor Agonists/pharmacology; Serotonin 5-HT2 Receptor Antagonists/chemistry; Serotonin 5-HT2 Receptor Antagonists/pharmacology
4.
Cell ; 2024 May 23.
Article in English | MEDLINE | ID: mdl-38810646

ABSTRACT

The cystic fibrosis transmembrane conductance regulator (CFTR) is a crucial ion channel whose loss of function leads to cystic fibrosis, whereas its hyperactivation leads to secretory diarrhea. Small molecules that improve CFTR folding (correctors) or function (potentiators) are clinically available. However, the only potentiator, ivacaftor, has suboptimal pharmacokinetics and inhibitors have yet to be clinically developed. Here, we combine molecular docking, electrophysiology, cryo-EM, and medicinal chemistry to identify CFTR modulators. We docked ∼155 million molecules into the potentiator site on CFTR, synthesized 53 test ligands, and used structure-based optimization to identify candidate modulators. This approach uncovered mid-nanomolar potentiators, as well as inhibitors, that bind to the same allosteric site. These molecules represent potential leads for the development of more effective drugs for cystic fibrosis and secretory diarrhea, demonstrating the feasibility of large-scale docking for ion channel drug discovery.

5.
bioRxiv ; 2024 Feb 28.
Article in English | MEDLINE | ID: mdl-38328157

ABSTRACT

Large library docking can reveal unexpected chemotypes that complement the structures of biological targets. Seeking new agonists for the cannabinoid-1 receptor (CB1R), we docked 74 million tangible molecules, prioritizing 46 high ranking ones for de novo synthesis and testing. Nine were active by radioligand competition, a 20% hit-rate. Structure-based optimization of one of the most potent of these (Ki = 0.7 µM) led to '4042, a 1.9 nM ligand and a full CB1R agonist. A cryo-EM structure of the purified enantiomer of '4042 ('1350) in complex with CB1R-Gi1 confirmed its docked pose. The new agonist was strongly analgesic, with generally a 5-10-fold therapeutic window over sedation and catalepsy and no observable conditioned place preference. These findings suggest that new cannabinoid chemotypes may disentangle characteristic cannabinoid side-effects from their analgesia, supporting the further development of cannabinoids as pain therapeutics.
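The figures quoted in the abstract above can be checked with a few lines of arithmetic. The following is a minimal sketch, not code from the paper: the function names are hypothetical, and the fold calculation assumes that the 0.7 µM and 1.9 nM values are directly comparable binding constants.

```python
def hit_rate(actives: int, tested: int) -> float:
    """Return the fraction of tested compounds that were active."""
    if tested <= 0:
        raise ValueError("tested must be a positive count")
    return actives / tested


def fold_improvement(start_ki_nm: float, final_ki_nm: float) -> float:
    """Return how many times lower the final value is than the starting one."""
    if start_ki_nm <= 0 or final_ki_nm <= 0:
        raise ValueError("affinity values must be positive")
    return start_ki_nm / final_ki_nm


# Figures quoted in the abstract: 9 actives out of 46 molecules tested.
rate = hit_rate(actives=9, tested=46)
print(f"hit rate: {rate:.1%}")  # 19.6%, which the abstract rounds to 20%

# 0.7 µM (700 nM) initial hit optimized to a 1.9 nM ligand
# (assumption: both values are on the same affinity scale).
fold = fold_improvement(start_ki_nm=700.0, final_ki_nm=1.9)
print(f"affinity gain: about {fold:.0f}-fold")
```

Under that assumption, the optimization corresponds to an affinity gain of a few hundred-fold.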

6.
J Org Chem ; 2024 Feb 21.
Article in English | MEDLINE | ID: mdl-38383160

ABSTRACT

The chemoselectivity of halo(het)arene sulfonyl halide aminations is studied thoroughly under parallel synthesis conditions, and the scope and limitations of the method are established. It is shown that SNAr-reactive sulfonyl halides typically undergo sulfonamide synthesis during the first step; the second amination is also possible provided that the SNAr-active center is sufficiently reactive. On the contrary, sulfonyl fluorides bearing an arylating moiety undergo selective transformation at the latter reactive center under proper control. Further sulfur-fluoride exchange (SuFEx) is also possible, which can be especially valuable for some sulfonyl halide classes. The developed two-step parallel double amination protocol provides access to a 6.67-billion compound synthetically tractable REAL-type chemical space (76% expected synthesis success rate).

7.
J Chem Inf Model ; 64(5): 1704-1718, 2024 Mar 11.
Article in English | MEDLINE | ID: mdl-38411104

ABSTRACT

The proline biosynthetic enzyme Δ1-pyrroline-5-carboxylate (P5C) reductase 1 (PYCR1) is one of the most consistently upregulated enzymes across multiple cancer types and central to the metabolic rewiring of cancer cells. Herein, we describe a fragment-based, structure-first approach to the discovery of PYCR1 inhibitors. Thirty-seven fragment-like carboxylic acids in the molecular weight range of 143-289 Da were selected from docking and then screened using X-ray crystallography as the primary assay. Strong electron density was observed for eight compounds, corresponding to a crystallographic hit rate of 22%. The fragments are novel compared to existing proline analog inhibitors in that they block both the P5C substrate pocket and the NAD(P)H binding site. Four hits showed inhibition of PYCR1 in kinetic assays, and one has lower apparent IC50 than the current best proline analog inhibitor. These results show proof-of-concept for our inhibitor discovery approach and provide a basis for fragment-to-lead optimization.


Subjects
Pyrroline Carboxylate Reductases; delta-1-Pyrroline-5-Carboxylate Reductase; Pyrroline Carboxylate Reductases/chemistry; Pyrroline Carboxylate Reductases/metabolism; Crystallography, X-Ray; Binding Sites; Proline
8.
bioRxiv ; 2024 Mar 13.
Article in English | MEDLINE | ID: mdl-38187536

ABSTRACT

AlphaFold2 (AF2) and RosettaFold have greatly expanded the number of structures available for structure-based ligand discovery, even though retrospective studies have cast doubt on their direct usefulness for that goal. Here, we tested unrefined AF2 models prospectively, comparing experimental hit-rates and affinities from large library docking against AF2 models vs the same screens targeting experimental structures of the same receptors. In retrospective docking screens against the σ2 and the 5-HT2A receptors, the AF2 structures struggled to recapitulate ligands that we had previously found docking against the receptors' experimental structures, consistent with published results. Prospective large library docking against the AF2 models, however, yielded similar hit rates for both receptors versus docking against experimentally-derived structures; hundreds of molecules were prioritized and tested against each model and each structure of each receptor. The success of the AF2 models was achieved despite differences in orthosteric pocket residue conformations for both targets versus the experimental structures. Intriguingly, against the 5-HT2A receptor the most potent, subtype-selective agonists were discovered via docking against the AF2 model, not the experimental structure. To understand this from a molecular perspective, a cryoEM structure was determined for one of the more potent and selective ligands to emerge from docking against the AF2 model of the 5-HT2A receptor. Our findings suggest that AF2 models may sample conformations that are relevant for ligand discovery, much extending the domain of applicability of structure-based ligand discovery.

9.
bioRxiv ; 2024 Mar 06.
Article in English | MEDLINE | ID: mdl-38234749

ABSTRACT

Drugs acting as positive allosteric modulators (PAMs) to enhance the activation of the calcium sensing receptor (CaSR) and to suppress parathyroid hormone (PTH) secretion can treat hyperparathyroidism but suffer from side effects including hypocalcemia and arrhythmias. Seeking new CaSR modulators, we docked libraries of 2.7 million and 1.2 billion molecules against transforming pockets in the active-state receptor dimer structure. Consistent with simulations suggesting that docking improves with library size, billion-molecule docking found new PAMs with a hit rate that was 2.7-fold higher than the million-molecule library and with hits up to 37-fold more potent. Structure-based optimization of ligands from both campaigns led to nanomolar leads, one of which was advanced to animal testing. This PAM displays 100-fold the potency of the standard of care, cinacalcet, in ex vivo organ assays, and reduces serum PTH levels in mice by up to 80% without the hypocalcemia typical of CaSR drugs. Cryo-EM structures with the new PAMs show that they induce residue rearrangements in the binding pockets and promote CaSR dimer conformations that are closer to the G-protein coupled state compared to established drugs. These findings highlight the promise of large library docking for therapeutic leads, especially when combined with experimental structure determination and mechanism.

10.
bioRxiv ; 2024 Mar 11.
Article in English | MEDLINE | ID: mdl-37745391

ABSTRACT

The cystic fibrosis transmembrane conductance regulator (CFTR) is a crucial ion channel whose loss of function leads to cystic fibrosis, while its hyperactivation leads to secretory diarrhea. Small molecules that improve CFTR folding (correctors) or function (potentiators) are clinically available. However, the only potentiator, ivacaftor, has suboptimal pharmacokinetics and inhibitors have yet to be clinically developed. Here we combine molecular docking, electrophysiology, cryo-EM, and medicinal chemistry to identify novel CFTR modulators. We docked ~155 million molecules into the potentiator site on CFTR, synthesized 53 test ligands, and used structure-based optimization to identify candidate modulators. This approach uncovered novel mid-nanomolar potentiators as well as inhibitors that bind to the same allosteric site. These molecules represent potential leads for the development of more effective drugs for cystic fibrosis and secretory diarrhea, demonstrating the feasibility of large-scale docking for ion channel drug discovery.

11.
Nat Commun ; 14(1): 8067, 2023 Dec 06.
Article in English | MEDLINE | ID: mdl-38057319

ABSTRACT

The lipid prostaglandin E2 (PGE2) mediates inflammatory pain by activating G protein-coupled receptors, including the prostaglandin E2 receptor 4 (EP4R). Nonsteroidal anti-inflammatory drugs (NSAIDs) reduce nociception by inhibiting prostaglandin synthesis; however, the disruption of upstream prostanoid biosynthesis can lead to pleiotropic effects including gastrointestinal bleeding and cardiac complications. In contrast, by acting downstream, EP4R antagonists may act specifically as anti-inflammatory agents and, to date, no selective EP4R antagonists have been approved for human use. In this work, seeking to diversify EP4R antagonist scaffolds, we computationally dock over 400 million compounds against an EP4R crystal structure and experimentally validate 71 highly ranked, de novo synthesized molecules. Further, we show how structure-based optimization of initial docking hits identifies a potent and selective antagonist with 16 nanomolar potency. Finally, we demonstrate favorable pharmacokinetics for the discovered compound as well as anti-allodynic and anti-inflammatory activity in several preclinical pain models in mice.


Subjects
Dinoprostone; Receptors, Prostaglandin; Humans; Mice; Animals; Phagocytosis; Anti-Inflammatory Agents/pharmacology; Anti-Inflammatory Agents/therapeutic use; Pain/drug therapy; Anti-Inflammatory Agents, Non-Steroidal/pharmacology
12.
Chem Sci ; 14(39): 10835-10846, 2023 Oct 11.
Article in English | MEDLINE | ID: mdl-37829036

ABSTRACT

Accurate prediction of reaction yield is the holy grail for computer-assisted synthesis prediction, but current models have failed to generalize to large literature datasets. To understand the causes and inspire future design, we systematically benchmarked the yield prediction task. We carefully curated and augmented a literature dataset of 41 239 amide coupling reactions, each with information on reactants, products, intermediates, yields, and reaction contexts, and provided 3D structures for the molecules. We calculated molecular features related to 2D and 3D structure information, as well as physical and electronic properties. These descriptors were paired with 4 categories of machine learning methods (linear, kernel, ensemble, and neural network), yielding valuable benchmarks about feature and model performance. Despite the excellent performance on a high-throughput experiment (HTE) dataset (R2 around 0.9), no method gave satisfactory results on the literature data. The best performance was an R2 of 0.395 ± 0.020 using the stack technique. Error analysis revealed that reactivity cliff and yield uncertainty are among the main reasons for incorrect predictions. Removing reactivity cliffs and uncertain reactions boosted the R2 to 0.457 ± 0.006. These results highlight that yield prediction models must be sensitive to the reactivity change due to the subtle structure variance, as well as be robust to the uncertainty associated with yield measurements.
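The R2 values quoted in the abstract above are coefficients of determination. As a minimal sketch of how that metric is computed, the snippet below uses the standard definition 1 - SS_res / SS_tot; the function name and the toy yields are invented for illustration and are not data from the paper.

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean yield.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are equal")
    return 1.0 - ss_res / ss_tot


# Toy yields in percent (invented values, for illustration only).
measured = [10.0, 50.0, 90.0]
predicted = [20.0, 50.0, 80.0]
score = r_squared(measured, predicted)
print(score)

# A model that always predicts the mean yield scores exactly 0.
baseline = r_squared(measured, [50.0, 50.0, 50.0])
print(baseline)
```

Read this way, an R2 near 0.4 means the model removes only about 40% of the squared error left by a predict-the-mean baseline, which is the gap the abstract describes between the HTE and literature datasets.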

13.
Nat Chem ; 15(11): 1616-1625, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37460812

ABSTRACT

Advances in chemoproteomic technology have revealed covalent interactions between small molecules and protein nucleophiles, primarily cysteine, on a proteome-wide scale. Most chemoproteomic screening approaches are indirect, relying on competition between electrophilic fragments and a minimalist electrophilic probe with inherently limited proteome coverage. Here we develop a chemoproteomic platform for direct electrophile-site identification based on enantiomeric pairs of clickable arylsulfonyl fluoride probes. Using stereoselective site modification as a proxy for ligandability in intact cells, we identify 634 tyrosines and lysines within functionally diverse protein sites, liganded by structurally diverse probes. Among multiple validated sites, we discover a chiral probe that modifies Y228 in the MYC binding site of the epigenetic regulator WDR5, as revealed by a high-resolution crystal structure. A distinct chiral probe stimulates tumour cell phagocytosis by covalently modifying Y387 in the recently discovered immuno-oncology target APMAP. Our work provides a deep resource of ligandable tyrosines and lysines for the development of covalent chemical probes.


Subjects
Lysine; Proteome; Lysine/chemistry; Proteome/chemistry; Tyrosine; Binding Sites
14.
J Med Chem ; 66(15): 10241-10251, 2023 Aug 10.
Article in English | MEDLINE | ID: mdl-37499195

ABSTRACT

The discovery of new scaffolds and chemotypes via high-throughput screening is tedious and resource intensive. Yet, there are millions of small molecules commercially available, rendering comprehensive in vitro tests intractable. We show how smart algorithms reduce large screening collections to target-specific sets of just a few hundred small molecules, allowing for a much faster and more cost-effective hit discovery process. We showcase the application of this virtual screening strategy by preselecting 434 compounds for Sirtuin-1 inhibition from a library of 2.6 million compounds, corresponding to 0.02% of the original library. Multistage in vitro validation ultimately confirmed nine chemically novel inhibitors. When compared to a competitive benchmark study for Sirtuin-1, our method shows a 12-fold higher hit rate. The results demonstrate how AI-driven preselection from large screening libraries allows for a massive reduction in the number of small molecules to be tested in vitro while still retaining a large number of hits.


Subjects
Sirtuins; Small Molecule Libraries; Small Molecule Libraries/pharmacology; Small Molecule Libraries/chemistry; High-Throughput Screening Assays; Algorithms; Artificial Intelligence
15.
Protein Sci ; 32(8): e4712, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37354015

ABSTRACT

Antiviral therapeutics to treat SARS-CoV-2 are needed to diminish the morbidity of the ongoing COVID-19 pandemic. A well-precedented drug target is the main viral protease (MPro), which is targeted by an approved drug and by several investigational drugs. Emerging viral resistance has made new inhibitor chemotypes more pressing. Adopting a structure-based approach, we docked 1.2 billion non-covalent lead-like molecules and a new library of 6.5 million electrophiles against the enzyme structure. From these, 29 non-covalent and 11 covalent inhibitors were identified in 37 series, the most potent having an IC50 of 29 and 20 µM, respectively. Several series were optimized, resulting in low micromolar inhibitors. Subsequent crystallography confirmed the docking-predicted binding modes and may template further optimization. While the new chemotypes may aid further optimization of MPro inhibitors for SARS-CoV-2, the modest success rate also reveals weaknesses in our approach for challenging targets like MPro versus other targets where it has been more successful, and versus other structure-based techniques against MPro itself.


Subjects
COVID-19; Humans; SARS-CoV-2/metabolism; Pandemics; Protease Inhibitors/pharmacology; Protease Inhibitors/chemistry; Molecular Docking Simulation; Viral Nonstructural Proteins/chemistry; Antiviral Agents/pharmacology; Antiviral Agents/chemistry
16.
J Med Chem ; 66(12): 7785-7803, 2023 Jun 22.
Article in English | MEDLINE | ID: mdl-37294077

ABSTRACT

An under-explored target for SARS-CoV-2 is the S-adenosyl methionine (SAM)-dependent methyltransferase Nsp14, which methylates the N7-guanosine of viral RNA at the 5'-end, allowing the virus to evade host immune response. We sought new Nsp14 inhibitors with three large library docking strategies. First, up to 1.1 billion lead-like molecules were docked against the enzyme's SAM site, leading to three inhibitors with IC50 values from 6 to 50 µM. Second, docking a library of 16 million fragments revealed 9 new inhibitors with IC50 values from 12 to 341 µM. Third, docking a library of 25 million electrophiles to covalently modify Cys387 revealed 7 inhibitors with IC50 values from 3.5 to 39 µM. Overall, 32 inhibitors encompassing 11 chemotypes had IC50 values < 50 µM and 5 inhibitors in 4 chemotypes had IC50 values < 10 µM. These molecules are among the first non-SAM-like inhibitors of Nsp14, providing starting points for future optimization.


Subjects
COVID-19; Methyltransferases; Humans; SARS-CoV-2/genetics; Viral Nonstructural Proteins/genetics; RNA, Viral/genetics; Exoribonucleases
17.
J Med Chem ; 66(11): 7355-7373, 2023 Jun 08.
Article in English | MEDLINE | ID: mdl-37172324

ABSTRACT

Retinoic acid receptor-related orphan receptor γt (RORγt) is a nuclear receptor that is expressed in a variety of tissues and is a potential drug target for the treatment of inflammatory and auto-immune diseases, metabolic diseases, and resistant cancer types. We herein report the discovery of 2,3 derivatives of 4,5,6,7-tetrahydro-benzothiophene modulators of RORγt. We also report the solubility in acidic/neutral pH, mouse/human/dog/rat microsomal stability, Caco-2, and MDR1-MDCKII permeabilities of a set of these derivatives. For this group of modulators, inverse agonism by steric clashes and push-pull mechanisms induce greater instability to protein conformation compared to agonist lock hydration. Independent of the two mechanisms, we observed a basal modulatory activity of the tested 2,3 derivatives of 4,5,6,7-tetrahydro-benzothiophene toward RORγt due to the interactions with the Cys320-Glu326 and Arg364-Phe377 hydrophilic regions. The drug discovery approach reported in the current study can be employed to discover modulators of nuclear receptors and other globular protein targets.


Subjects
Nuclear Receptor Subfamily 1, Group F, Member 3; Receptors, Retinoic Acid; Mice; Rats; Animals; Humans; Dogs; Nuclear Receptor Subfamily 1, Group F, Member 3/agonists; Drug Inverse Agonism; Caco-2 Cells
18.
J Chem Inf Model ; 63(4): 1166-1176, 2023 Feb 27.
Article in English | MEDLINE | ID: mdl-36790087

ABSTRACT

Purchasable chemical space has grown rapidly into the tens of billions of molecules, providing unprecedented opportunities for ligand discovery but straining the tools that might exploit these molecules at scale. We have therefore developed ZINC-22, a database of commercially accessible small molecules derived from multi-billion-scale make-on-demand libraries. The new database and tools enable analog searching in this vast new space via a facile GUI, CartBlanche, drawing on similarity methods that scale sublinearly in the number of molecules. The new library also uses data organization methods, enabling rapid lookup of molecules and their physical properties, including conformations, partial atomic charges, cLogP values, and solvation energies, all crucial for molecule docking, which had become slow with older database organizations in previous versions of ZINC. As the libraries have continued to grow, we have been interested in finding whether molecular diversity has suffered, for instance, because certain scaffolds have come to dominate via easy analoging. This has not occurred thus far, and chemical diversity continues to grow with database size, with a log increase in Bemis-Murcko scaffolds for every two-log unit increase in database size. Most new scaffolds come from compounds with the highest heavy atom count. Finally, we consider the implications for databases like ZINC as the libraries grow toward and beyond the trillion-molecule range. ZINC is freely available to everyone and may be accessed at cartblanche22.docking.org, via Globus, and in the Amazon AWS and Oracle OCI clouds.


Subjects
Zinc; Ligands; Databases, Factual; Molecular Conformation; Molecular Docking Simulation
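The scaffold-growth statement in the ZINC-22 abstract above ("a log increase in Bemis-Murcko scaffolds for every two-log unit increase in database size") can be read as a power law with exponent 1/2. The sketch below is an interpretation of that one sentence, not a model taken from the paper: the function name is hypothetical, and the assumption that the trend holds as a smooth power law over any size range is ours.

```python
def scaffold_growth_factor(size_ratio: float, exponent: float = 0.5) -> float:
    """Expected multiplicative growth in scaffold count when the database
    grows by size_ratio, assuming scaffolds scale as size ** exponent.

    exponent = 0.5 encodes "one log unit of scaffolds per two log units
    of database size" (an assumed reading of the abstract).
    """
    if size_ratio <= 0:
        raise ValueError("size_ratio must be positive")
    return size_ratio ** exponent


# A 100-fold (two-log) larger database gives roughly 10-fold more scaffolds.
two_log = scaffold_growth_factor(100.0)
print(two_log)

# Under the same assumption, growing from a billion to a trillion molecules
# (1000-fold) would give roughly a 32-fold increase in scaffolds.
billion_to_trillion = scaffold_growth_factor(1_000.0)
print(f"{billion_to_trillion:.1f}")
```

The sublinear exponent is the point of interest: diversity keeps growing with library size, but each new scaffold costs progressively more molecules.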
19.
Proc Natl Acad Sci U S A ; 120(2): e2212931120, 2023 Jan 10.
Article in English | MEDLINE | ID: mdl-36598939

ABSTRACT

The nonstructural protein 3 (NSP3) of the severe acute respiratory syndrome-coronavirus-2 (SARS-CoV-2) contains a conserved macrodomain enzyme (Mac1) that is critical for pathogenesis and lethality. While small-molecule inhibitors of Mac1 have great therapeutic potential, at the outset of the COVID-19 pandemic, there were no well-validated inhibitors for this protein nor, indeed, the macrodomain enzyme family, making this target a pharmacological orphan. Here, we report the structure-based discovery and development of several different chemical scaffolds exhibiting low- to sub-micromolar affinity for Mac1 through iterations of computer-aided design, structural characterization by ultra-high-resolution protein crystallography, and binding evaluation. Potent scaffolds were designed with in silico fragment linkage and by ultra-large library docking of over 450 million molecules. Both techniques leverage the computational exploration of tangible chemical space and are applicable to other pharmacological orphans. Overall, 160 ligands in 119 different scaffolds were discovered, and 153 Mac1-ligand complex crystal structures were determined, typically to 1 Å resolution or better. Our analyses discovered selective and cell-permeable molecules, unexpected ligand-mediated conformational changes within the active site, and key inhibitor motifs that will template future drug development against Mac1.


Subjects
COVID-19; SARS-CoV-2; Humans; Crystallography; Pandemics; Ligands; Molecular Docking Simulation; Protease Inhibitors/pharmacology; Antiviral Agents/pharmacology; Antiviral Agents/chemistry
20.
J Comput Chem ; 44(2): 76-92, 2023 Jan 15.
Article in English | MEDLINE | ID: mdl-36264601

ABSTRACT

Chemical yield is the percentage of the reactants converted to the desired products. Chemists use predictive algorithms to select high-yielding reactions and score synthesis routes, saving time and reagents. This study suggests a novel graph neural network architecture for chemical yield prediction. The network combines structural information about participants of the transformation as well as molecular and reaction-level descriptors. It works with incomplete chemical reactions and generates reactants-product atom mapping. We show that the network benefits from advanced information by comparing it with several machine learning models and molecular representations. Models included logistic regression, support vector machine, CatBoost, and Bidirectional Encoder Representations from Transformers. Molecular representations included extended-connectivity fingerprints, Morgan fingerprints, SMILESVec embeddings, and textual. Classification and regression objectives were assessed for each model and feature set. The goal of each classification model was to separate zero- and non-zero-yielding reactions. The models were trained and evaluated on a proprietary dataset of 10 reaction types. Also, the models were benchmarked on two public single reaction type datasets. The study was supplemented with analysis of data, results, and errors, as well as the impact of steric factors, side reactions, isolation, and purification efficiency. The supplementary code is available at https://github.com/SoftServeInc/yield-paper.


Subjects
Algorithms; Neural Networks, Computer; Humans; Machine Learning; Support Vector Machine
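The yield definition and the classification objective described in the abstract above can be illustrated with a short sketch. This is not code from the paper's repository: the function names, the mole-based inputs, and the example amounts are assumptions made for illustration.

```python
def percent_yield(actual_mol: float, theoretical_mol: float) -> float:
    """Chemical yield as a percentage of the theoretical product amount."""
    if theoretical_mol <= 0:
        raise ValueError("theoretical_mol must be positive")
    if actual_mol < 0:
        raise ValueError("actual_mol cannot be negative")
    return 100.0 * actual_mol / theoretical_mol


def is_nonzero_yield(yield_percent: float) -> bool:
    """Binary label for the classification objective described in the
    abstract: separate zero-yielding from non-zero-yielding reactions."""
    return yield_percent > 0.0


# Hypothetical reaction: 0.8 mol of product isolated out of 1.0 mol possible.
observed = percent_yield(actual_mol=0.8, theoretical_mol=1.0)
print(f"{observed:.0f}% yield, non-zero: {is_nonzero_yield(observed)}")

# Hypothetical failed reaction: no product isolated.
failed = percent_yield(actual_mol=0.0, theoretical_mol=1.0)
print(f"{failed:.0f}% yield, non-zero: {is_nonzero_yield(failed)}")
```

The regression objective would predict the percentage itself, while the classification objective only needs the boolean label.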