Search | Portal Regional da BVS (VHL Regional Portal)

SELFIES and the future of molecular string representations.

Krenn, Mario; Ai, Qianxiang; Barthel, Senja; Carson, Nessa; Frei, Angelo; Frey, Nathan C; Friederich, Pascal; Gaudin, Théophile; Gayle, Alberto Alexander; Jablonka, Kevin Maik; Lameiro, Rafael F; Lemm, Dominik; Lo, Alston; Moosavi, Seyed Mohamad; Nápoles-Duarte, José Manuel; Nigam, AkshatKumar; Pollice, Robert; Rajan, Kohulan; Schatzschneider, Ulrich; Schwaller, Philippe; Skreta, Marta; Smit, Berend; Strieth-Kalthoff, Felix; Sun, Chong; Tom, Gary; Falk von Rudorff, Guido; Wang, Andrew; White, Andrew D; Young, Adamo; Yu, Rose; Aspuru-Guzik, Alán.

Patterns (N Y) ; 3(10): 100588, 2022 Oct 14.

Article in English | MEDLINE | ID: mdl-36277819

ABSTRACT

Artificial intelligence (AI) and machine learning (ML) are increasingly applied to challenging tasks in chemistry and materials science. Examples include the prediction of properties, the discovery of new reaction pathways, and the design of new molecules. The machine needs to read and write fluently in a chemical language for each of these tasks. Strings are a common tool to represent molecular graphs, and the most popular molecular string representation, SMILES, has powered cheminformatics since the late 1980s. However, in the context of AI and ML in chemistry, SMILES has several shortcomings; most pertinently, most combinations of symbols lead to invalid results with no valid chemical interpretation. To overcome this issue, a new language for molecules was introduced in 2020 that guarantees 100% robustness: SELF-referencing embedded strings (SELFIES). SELFIES has since simplified and enabled numerous new applications in chemistry. In this perspective, we look to the future and discuss molecular string representations, along with their respective opportunities and challenges. We propose 16 concrete future projects for robust molecular representations. These involve the extension toward new chemical domains, exciting questions at the interface of AI and robust languages, and interpretability for both humans and machines. We hope that these proposals will inspire several follow-up works exploiting the full potential of molecular string representations for the future of AI in chemistry and materials science.
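The claim that most symbol combinations are invalid can be illustrated with a rough sketch. The checker below is not a SMILES parser: it applies only a few necessary syntax conditions (balanced branches, paired ring-closure digits, no dangling bond) to a toy eight-symbol alphabet. The alphabet, the function name `passes_basic_syntax`, and the sample sizes are all assumptions made for illustration. Even under these lenient rules, most randomly assembled strings are rejected.

```python
import random

# Toy alphabet (an assumption for illustration, far smaller than real SMILES).
ATOMS = {"C", "N", "O"}
ALPHABET = ["C", "N", "O", "(", ")", "=", "1", "2"]


def passes_basic_syntax(s: str) -> bool:
    """Check a few necessary (not sufficient) conditions on a SMILES-like string."""
    if not s or s[0] not in ATOMS:
        return False  # a string must start with an atom
    depth = 0           # current branch nesting level
    open_rings = set()  # ring-closure digits that are opened but not yet closed
    prev = ""
    for ch in s:
        if ch in ATOMS:
            pass
        elif ch == "(":
            if prev in ("(", "="):
                return False  # a branch cannot open right after "(" or a bond
            depth += 1
        elif ch == ")":
            if depth == 0 or prev in ("(", "="):
                return False  # unmatched ")", empty branch, or dangling bond
            depth -= 1
        elif ch == "=":
            if prev == "=":
                return False  # two bond symbols in a row
        elif ch.isdigit():
            if prev == "(":
                return False  # a ring digit cannot start a branch
            open_rings ^= {ch}  # toggle: first occurrence opens, second closes
        else:
            return False
        prev = ch
    return depth == 0 and not open_rings and prev != "="


def random_valid_fraction(n_samples: int = 2000, length: int = 8, seed: int = 0) -> float:
    """Fraction of random strings over ALPHABET that pass the basic checks."""
    rng = random.Random(seed)
    passed = sum(
        passes_basic_syntax("".join(rng.choice(ALPHABET) for _ in range(length)))
        for _ in range(n_samples)
    )
    return passed / n_samples


if __name__ == "__main__":
    print(passes_basic_syntax("CC(=O)O"))   # acetic acid written in the toy alphabet
    print(passes_basic_syntax("C1CC"))      # ring opened but never closed
    print(random_valid_fraction())
```

A real validity check would also need valence rules and the full grammar, which reject even more strings. A representation in which every string decodes to a valid molecule, as the abstract describes for SELFIES, removes this failure mode for generative models.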

Secondary analysis of electronic health records in critical care medicine.

Van Poucke, Sven; Gayle, Alberto Alexander; Vukicevic, Milan.

Ann Transl Med ; 6(3): 52, 2018 Feb.

Article in English | MEDLINE | ID: mdl-29610744

Evaluating the lexico-grammatical differences in the writing of native and non-native speakers of English in peer-reviewed medical journals in the field of pediatric oncology: Creation of the genuine index scoring system.

Gayle, Alberto Alexander; Shimaoka, Motomu.

PLoS One ; 12(2): e0172338, 2017.

Article in English | MEDLINE | ID: mdl-28212419

ABSTRACT

INTRODUCTION: The predominance of English in scientific research has created hurdles for "non-native speakers" of English. Here we present a novel application of native language identification (NLI) for the assessment of medical-scientific writing. For this purpose, we created a novel classification system whereby scoring would be based solely on text features found to be distinctive among native English speakers (NS) within a given context. We dubbed this the "Genuine Index" (GI). METHODOLOGY: This methodology was validated using a small set of journals in the field of pediatric oncology. Our dataset consisted of 5,907 abstracts, representing work from 77 countries. A support vector machine (SVM) was used to generate our model and for scoring. RESULTS: Accuracy, precision, and recall of the classification model were 93.3%, 93.7%, and 99.4%, respectively. Class-specific F-scores were 96.5% for NS and 39.8% for our benchmark class, Japan. Overall kappa was calculated to be 37.2%. We found significant differences between countries with respect to the GI score. Significant correlation was found between GI scores and two validated objective measures of writing proficiency and readability. Two sets of key terms and phrases differentiating NS and non-native writing were identified. CONCLUSIONS: Our GI model was able to detect, with a high degree of reliability, subtle differences between the terms and phrasing used by native and non-native speakers in peer-reviewed journals in the field of pediatric oncology. In addition, L1 language transfer was found to be very likely to survive revision, especially in non-Western countries such as Japan. These findings show that even when the language used is technically correct, there may still be some phrasing or usage that impacts quality.

Subjects

Abstracting and Indexing, Language, Medical Oncology, Medical Writing, Pediatrics, Peer Review, Periodicals as Topic, Phonetics
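The abstract above reports high accuracy alongside a much lower kappa and minority-class F-score, a pattern typical of imbalanced classes. The sketch below shows how these metrics follow from a binary confusion matrix. The counts are invented for illustration and are not the study's data; the function name `binary_metrics` is likewise an assumption.

```python
def binary_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Compute standard metrics from a 2x2 confusion matrix.

    tp/fn: positive-class items predicted positive/negative.
    fp/tn: negative-class items predicted positive/negative.
    """
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1_pos = 2 * tp / (2 * tp + fp + fn)  # F-score for the positive class
    f1_neg = 2 * tn / (2 * tn + fp + fn)  # F-score for the negative class
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / n**2
    kappa = (accuracy - expected) / (1 - expected)  # Cohen's kappa
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1_pos": f1_pos,
        "f1_neg": f1_neg,
        "kappa": kappa,
    }


if __name__ == "__main__":
    # Hypothetical, heavily imbalanced counts: 910 majority-class items
    # versus 90 minority-class items. NOT taken from the paper.
    for name, value in binary_metrics(tp=900, fn=10, fp=60, tn=30).items():
        print(f"{name}: {value:.3f}")
```

With these made-up counts, accuracy is 0.93 while kappa is roughly 0.43, because a classifier that always predicts the majority class would already agree with the labels most of the time.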

Comparing baseline characteristics between groups: an introduction to the CBCgrps package.

Zhang, Zhongheng; Gayle, Alberto Alexander; Wang, Juan; Zhang, Haoyang; Cardinal-Fernández, Pablo.

Ann Transl Med ; 5(24): 484, 2017 Dec.

Article in English | MEDLINE | ID: mdl-29299446

ABSTRACT

A common practice in observational studies is the comparison of baseline characteristics of participants between study groups. The overall population can be grouped by clinical outcome or exposure status. A combined table reporting baseline characteristics is usually displayed, for the overall population and then separately for each group. The last column usually gives the P value for the comparison between study groups. In the conventional research model, the variables for which data are collected are limited in number. It is thus feasible to calculate descriptive data one by one and to manually create the table. The availability of electronic health records (EHR) and big data mining techniques makes it possible to explore a far larger number of variables. However, manual tabulation of big data is particularly error-prone, and it is exceedingly time-consuming to create and revise such tables manually. In this paper, we introduce an R package called CBCgrps, which is designed to automate and streamline the generation of such tables when working with big data. The package contains two functions, twogrps() and multigrps(), which are used for comparisons between two and multiple groups, respectively.
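The kind of table described above can be sketched in a few lines. The following Python example is not the CBCgrps code and does not reproduce the package's choice of statistical tests: it reports mean (SD) per group and, to stay within the standard library, swaps in a permutation test on the difference in means for the P value. The function names, the record layout, and the data are all hypothetical.

```python
import random
from statistics import mean, stdev


def permutation_p(a, b, n_perm=2000, seed=0):
    """Two-sided permutation P value for the difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            hits += 1
    return (hits + 1) / (n_perm + 1)


def two_group_table(records, group_key, variables):
    """Build one row per variable: mean (SD) in each group plus a P value."""
    groups = sorted({r[group_key] for r in records})
    if len(groups) != 2:
        raise ValueError("expected exactly two groups")
    g0, g1 = groups
    table = []
    for var in variables:
        a = [r[var] for r in records if r[group_key] == g0]
        b = [r[var] for r in records if r[group_key] == g1]
        table.append({
            "variable": var,
            g0: f"{mean(a):.1f} ({stdev(a):.1f})",
            g1: f"{mean(b):.1f} ({stdev(b):.1f})",
            "p": round(permutation_p(a, b), 3),
        })
    return table


# Hypothetical data: age differs between groups, systolic pressure does not.
ages = {"survivor": [50, 52, 54, 56, 58, 60], "non-survivor": [70, 72, 74, 76, 78, 80]}
sbp = [120, 130, 125, 135, 128, 132]
records = [
    {"outcome": group, "age": age, "sbp": pressure}
    for group, group_ages in ages.items()
    for age, pressure in zip(group_ages, sbp)
]

if __name__ == "__main__":
    for row in two_group_table(records, "outcome", ["age", "sbp"]):
        print(row)
```

Looping over a list of variable names is what makes this approach scale to the large variable counts typical of EHR data, where building each row by hand becomes impractical.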

Describing the factors that influence the process of making a shared-agenda in Japanese family physician consultations: a qualitative study.

Goto, Michiko; Yokoya, Shoji; Takemura, Yousuke; Gayle, Alberto Alexander; Tsuda, Tsukasa.

Asia Pac Fam Med ; 14(1): 6, 2015.

Article in English | MEDLINE | ID: mdl-26097414

ABSTRACT

BACKGROUND: Patients cannot always share all necessary relevant information with doctors during medical consultations. Regardless, in order to ensure the best quality consultation and care, it is imperative that a doctor clearly understands each patient's agenda. The purpose of this study was to analyze the process of developing a shared agenda during family physician consultations in Japan. METHODS: We interviewed 15 first-time patients visiting the outpatient clinic of the Department of Family Medicine in the hospital chosen for the investigation, and the 8 family physicians who examined them. In total we observed 16 consultations. We analyzed both patients' and doctors' narratives using a modified grounded theory approach. RESULTS: For patients, we found four main factors that influenced the process of making a shared agenda: past medical experiences, undisclosed but relevant information, relationship with the family physician, and the patient's own explanatory model. In addition, we found five factors that influenced the shared-agenda-making process for family physicians: understanding the patient's explanatory model, constructing the patient-doctor relationship, physical examination centered around the patient's explanatory model, discussion-styled explanation, and self-reflection on action. CONCLUSIONS: The findings suggest that patient satisfaction would be increased if family physicians are proactive in considering these factors with respect to both the patient's agenda and their own.
