1.
J Cheminform ; 16(1): 42, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38622746

ABSTRACT

PURPOSE: Wiswesser Line Notation (WLN) is an old line notation for encoding chemical compounds for storage and processing by computers. Whilst the notation itself has long since been surpassed by SMILES and InChI, distribution of WLN during its active years was extensive. In the context of modernising chemical data, we present a comprehensive WLN parser developed using the OpenBabel toolkit, capable of translating WLN strings into various formats supported by the library. Furthermore, we have devised a specialised Finite State Machine, constructed from the rules of WLN, enabling the recognition and extraction of chemical strings out of large bodies of text. Available open-access WLN data with corresponding SMILES or InChI notation is rare; however, ChEMBL, ChemSpider and PubChem all contain WLN records, which were used for conversion scoring. Our investigation revealed a notable proportion of inaccuracies within the database entries, and we have taken steps to rectify these errors whenever feasible. SCIENTIFIC CONTRIBUTION: Tools for both the extraction and conversion of WLN from chemical documents have been successfully developed. Both the Deterministic Finite Automaton (DFA) and parser handle the majority of WLN rules officially endorsed in the three major WLN manuals, with the parser showing a clear jump in accuracy and chemical coverage over previous submissions. The GitHub repository can be found here: https://github.com/Mblakey/wiswesser .

2.
J Biomed Semantics ; 14(1): 10, 2023 Aug 11.
Article in English | MEDLINE | ID: mdl-37568227

ABSTRACT

With the capacity to produce and record data electronically, scientific research and the data associated with it have grown at an unprecedented rate. However, despite a decent amount of data now existing in an electronic form, it is still common for scientific research to be recorded in an unstructured text format with inconsistent context (vocabularies), which vastly reduces the potential for direct intelligent analysis. Research has demonstrated that the use of semantic technologies such as ontologies to structure and enrich scientific data can greatly improve this potential. However, whilst there are many ontologies that can be used for this purpose, there is still a vast quantity of scientific terminology that does not have adequate semantic representation. A key area for expansion identified by the authors was the pharmacokinetic/pharmacodynamic (PK/PD) domain due to its high usage across many areas of Pharma. As such, we have produced a set of these terms and other bioassay related terms to be incorporated into the BioAssay Ontology (BAO), which was identified as the most relevant ontology for this work. A number of use cases developed by experts in the field were used to demonstrate how these new ontology terms can be used, and to set the scene for the continuation of this work with a look to expanding this work out into further relevant domains. The work done in this paper was part of Phase 1 of the SEED project (Semantically Enriching electronic laboratory notebook (eLN) Data).


Subject(s)
Biological Assay, Semantics, Workflow
3.
J Cheminform ; 14(1): 59, 2022 Sep 01.
Article in English | MEDLINE | ID: mdl-36050750

ABSTRACT

The related problems of chemical reaction optimization and reaction scope search concern the discovery of reaction pathways and conditions that provide the best percentage yield of a target product. The space of possible reaction pathways or conditions is too large to search in full, so identifying a globally optimal set of conditions must instead draw on mathematical methods to identify areas of the space that should be investigated. An intriguing contribution to this area of research is the recent development of the Experimental Design for Bayesian optimization (EDBO) optimizer [1]. Bayesian optimization works by building an approximation to the true function to be optimized based on a small set of simulations, and selecting the next point (or points) to be tested based on an acquisition function reflecting the value of different points within the input space. In this work, we evaluated the robustness of the EDBO optimizer under several changes to its specification. We investigated the effect on the performance of the optimizer of altering the acquisition function and batch size, applied the method to other existing reaction yield data sets, and considered its performance in the new problem domain of molecular power conversion efficiency in photovoltaic cells. Our results indicated that the EDBO optimizer broadly performs well under these changes; of particular note is the competitive performance of the computationally cheaper acquisition function Thompson Sampling when compared to the original Expected Improvement function, and some concerns around the method's performance for "incomplete" input domains.
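The two acquisition functions compared in this abstract can be illustrated with a minimal sketch. It is not the EDBO implementation: it assumes the surrogate model has already produced an independent Gaussian posterior (mean and standard deviation) for each untested condition, whereas a real Gaussian-process posterior is correlated across candidates. The function names and the example numbers are hypothetical.

```python
import math
import random


def expected_improvement(mean, std, best_so_far):
    """Expected Improvement (maximisation) under a Gaussian posterior N(mean, std^2)."""
    if std <= 0.0:
        # No uncertainty left: the improvement is deterministic.
        return max(mean - best_so_far, 0.0)
    z = (mean - best_so_far) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mean - best_so_far) * cdf + std * pdf


def pick_by_ei(posterior, best_so_far):
    """Return the index of the candidate with the largest expected improvement."""
    scores = [expected_improvement(m, s, best_so_far) for m, s in posterior]
    return max(range(len(posterior)), key=scores.__getitem__)


def pick_by_thompson(posterior, rng):
    """Thompson Sampling: draw one value per candidate, return the index of the largest draw.

    Assumption: candidates are sampled independently, which simplifies a real
    joint posterior draw.
    """
    draws = [rng.gauss(m, s) for m, s in posterior]
    return max(range(len(posterior)), key=draws.__getitem__)


# Hypothetical posterior (mean yield in %, standard deviation) for three untested conditions.
posterior = [(60.0, 1.0), (55.0, 15.0), (40.0, 2.0)]
best_so_far = 62.0  # best yield observed so far (hypothetical)

ei_choice = pick_by_ei(posterior, best_so_far)
ts_choice = pick_by_thompson(posterior, random.Random(0))
```

Thompson Sampling needs only one random draw per candidate, while Expected Improvement evaluates a closed-form expression involving the normal density and distribution functions. In this toy setting both are cheap, so the sketch shows the selection rules only and not the cost difference reported in the abstract.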

4.
BMC Res Notes ; 15(1): 20, 2022 Jan 21.
Article in English | MEDLINE | ID: mdl-35063017

ABSTRACT

Research data management (RDM) is the cornerstone of a successful research project, and yet it often remains an underappreciated art that gets overlooked in the hustle and bustle of everyday project management even when required by funding bodies. If researchers are to strive for reproducible science that adheres to the principles of FAIR, then they need to manage the data associated with their research projects effectively. It is imperative to plan your RDM strategies early on, and set up your project organisation before embarking on the work. There are several different factors to consider: data management plans, data organisation and storage, publishing and sharing your data, ensuring reproducibility and adhering to data standards. Additionally, it is important to reflect upon the ethical implications that might need to be planned for, and adverse issues that may need a mitigation strategy. This short article discusses these different areas, noting some best practices and detailing how to incorporate these strategies into your work. Finally, the article ends with a set of top ten tips for effective research data management.


Subject(s)
Data Management, Research Personnel, Humans, Publishing, Reproducibility of Results, Research Design
5.
Annu Rev Phys Chem ; 73: 97-116, 2022 Apr 20.
Article in English | MEDLINE | ID: mdl-34882434

ABSTRACT

As the volume of data associated with scientific research has exploded over recent years, the use of digital infrastructures to support this research and the data underpinning it has increased significantly. Physical chemists have been making use of eScience infrastructures since their conception, but in the last five years their usage has increased even more. While these infrastructures have not greatly affected the chemistry itself, they have in some cases had a significant impact on how the research is undertaken. The combination of the human effort of collaboration to create open source software tools and semantic resources, the increased availability of hardware for the laboratories, and the range of data management tools available has made the life of a physical chemist significantly easier. This review considers the different aspects of eScience infrastructures and explores how they have improved the way in which we can conduct physical chemistry research.


Subject(s)
Semantics, Software, Chemistry, Physical, Humans
6.
Patterns (N Y) ; 2(11): 100335, 2021 Nov 12.
Article in English | MEDLINE | ID: mdl-34820642

ABSTRACT

The Internet of Food Things Network+ (IoFT) and the Artificial Intelligence and Augmented Intelligence for Automated Investigation for Scientific Discovery Network+ (AI3SD) brought together an interdisciplinary multi-institution working group to create an ethical framework for digital collaboration in the food industry. This will enable the exploration of implications and consequences (both intentional and unintentional) of using cutting-edge technologies to support the implementation of data trusts and other forms of digital collaboration in the food sector. This article describes how we identified areas for ethical consideration with respect to digital collaboration and the use of Industry 4.0 technologies in the food sector and describes the different interdisciplinary methodologies being used to produce this framework. The research questions and objectives that are being addressed by the working group are laid out, with a report on our ongoing work. The article concludes with recommendations about working on projects in this area.

7.
Patterns (N Y) ; 2(1): 100162, 2021 Jan 08.
Article in English | MEDLINE | ID: mdl-33511363

ABSTRACT

The Artificial Intelligence and Augmented Intelligence for Automated Investigation for Scientific Discovery Network+ (AI3SD) was established in response to the UK Engineering and Physical Sciences Research Council (EPSRC) late-2017 call for a Network+ to promote cutting-edge research in artificial intelligence to accelerate groundbreaking scientific discoveries. This article provides the philosophical, scientific, and technical underpinnings of the Network+, the history of the different domains represented in the Network+, and the specific focus of the Network+. The activities, collaborations, and research covered in the first year of the Network+ have highlighted the significant challenges in the chemistry and augmented and artificial intelligence space. These challenges are shaping the future directions of the Network+. The article concludes with a summary of the lessons learned in running this Network+ and introduces our plans for the future in a landscape redrawn by COVID-19, including rebranding into the AI 4 Scientific Discovery Network (www.ai4science.network).

8.
Expert Opin Drug Discov ; 14(5): 433-444, 2019 May.
Article in English | MEDLINE | ID: mdl-30884989

ABSTRACT

INTRODUCTION: The use of semantic web technologies to aid drug discovery has gained momentum over recent years. Researchers in this domain have realized that semantic web technologies are key to dealing with the high levels of data for drug discovery. These technologies enable us to represent the data in a formal, structured, interoperable and comparable way, and to tease out undiscovered links between drug data (be it identifying new drug-targets or relevant compounds, or links between specific drugs and diseases). AREAS COVERED: This review focuses on explaining how semantic web technologies are being used to aid advances in drug discovery. The main types of semantic web technologies are explained, outlining how they work and how they can be used in the drug discovery process, with a consideration of how the use of these technologies has progressed from their initial usage. EXPERT OPINION: The increased availability of shared semantic resources (tools, data and, importantly, the communities) has enabled the application of semantic web technologies to facilitate semantic (context-dependent) search across multiple data sources, which can be used by machine learning to produce better predictions by exploiting the semantic links in knowledge graphs and linked datasets.


Subject(s)
Drug Discovery/methods, Semantic Web, Datasets as Topic, Humans, Machine Learning
9.
J Cheminform ; 11(1): 23, 2019 Mar 21.
Article in English | MEDLINE | ID: mdl-30900066

ABSTRACT

Scientific research is increasingly characterised by the volume of documents and data that it produces, from experimental plans and raw data to reports and papers. Researchers frequently struggle to manage and curate these materials, both individually and collectively. Previous studies of Electronic Lab Notebooks (ELNs) in academia and industry have identified semantic web technologies as a means for organising scientific documents to improve current workflows and knowledge management practices. In this paper, we present a qualitative, user-centred study of researcher requirements and practices, based on a series of discipline-specific focus groups. We developed a prototype semantic ELN to serve as a discussion aid for these focus groups, and to help us explore the technical readiness of a range of semantic web technologies. While these technologies showed potential, existing tools for semantic annotation were not well-received by our focus groups, and need to be refined before they can be used to enhance current researcher practices. In addition, the seemingly simple notion of "tagging and searching" documents appears anything but; the researchers in our focus groups had extremely personal requirements for how they organise their work, so the successful incorporation of semantic web technologies into their practices must permit a significant degree of customisation and personalisation.

10.
J Cheminform ; 9(1): 31, 2017 May 24.
Article in English | MEDLINE | ID: mdl-29086051

ABSTRACT

Despite the increasingly digital nature of society, there are some areas of research that remain firmly rooted in the past; in this case the laboratory notebook, the last remaining paper component of an experiment. Countless electronic laboratory notebooks (ELNs) have been created in an attempt to digitise record keeping processes in the lab, but none of them has become a 'key player' in the ELN market, due to the many adoption barriers that have been identified in previous research and further explored in the user studies presented here. The main issues identified are the cost of the currently available ELNs, their ease of use (or lack of it) and their accessibility issues across different devices and operating systems. Evidence suggests that whilst scientists willingly make use of generic notebooking software, spreadsheets and other general office and scientific tools to aid their work, current ELNs lack the functionality required to meet the needs of the researchers. In this paper we present our extensive research and user study results to propose an ELN built upon a pre-existing cloud notebook platform that makes use of accessible popular scientific software and semantic web technologies to help overcome the identified barriers to adoption.
