1.
Sci Data ; 10(1): 528, 2023 08 08.
Article in English | MEDLINE | ID: mdl-37553439

ABSTRACT

Large language models (LLMs) such as GPT-4 have recently demonstrated impressive results across a wide range of tasks. LLMs are still limited, however, in that they frequently fail at complex reasoning, their reasoning processes are opaque, they are prone to 'hallucinate' facts, and there are concerns about their underlying biases. Letting models verbalize reasoning steps as natural language, a technique known as chain-of-thought prompting, has recently been proposed as a way to address some of these issues. Here we present ThoughtSource, a meta-dataset and software library for chain-of-thought (CoT) reasoning. The goal of ThoughtSource is to improve future artificial intelligence systems by facilitating qualitative understanding of CoTs, enabling empirical evaluations, and providing training data. This first release of ThoughtSource integrates seven scientific/medical, three general-domain and five math word question answering datasets.
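The chain-of-thought prompting idea mentioned in this abstract can be illustrated with a minimal sketch. This is not the ThoughtSource API: the function names, the trigger phrase default, and the answer marker are all assumptions chosen for illustration, and no model is called.

```python
from typing import Tuple


def build_cot_prompt(question: str, trigger: str = "Let's think step by step.") -> str:
    """Build a zero-shot chain-of-thought prompt (hypothetical helper).

    The trigger phrase asks the model to verbalize its reasoning steps
    before it gives a final answer.
    """
    return f"Q: {question.strip()}\nA: {trigger}"


def split_answer(completion: str, marker: str = "Therefore, the answer is") -> Tuple[str, str]:
    """Split a completion into (reasoning, answer) at the last marker.

    The marker is an assumed convention; if it is absent, the whole
    completion is treated as reasoning and the answer is empty.
    """
    head, sep, tail = completion.rpartition(marker)
    if not sep:
        return completion.strip(), ""
    return head.strip(), tail.strip().rstrip(".")


if __name__ == "__main__":
    print(build_cot_prompt("What is 2 + 2?"))
    print(split_answer("2 plus 2 is 4. Therefore, the answer is 4."))
```

Keeping the reasoning text separate from the final answer is what would make the reasoning inspectable and the answer automatically scorable, which matches the qualitative and empirical goals the abstract describes.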

2.
Nat Commun ; 13(1): 6793, 2022 11 10.
Article in English | MEDLINE | ID: mdl-36357391

ABSTRACT

Benchmarks are crucial to measuring and steering progress in artificial intelligence (AI). However, recent studies raised concerns over the state of AI benchmarking, reporting issues such as benchmark overfitting, benchmark saturation and increasing centralization of benchmark dataset creation. To facilitate monitoring of the health of the AI benchmarking ecosystem, we introduce methodologies for creating condensed maps of the global dynamics of benchmark creation and saturation. We curate data for 3765 benchmarks covering the entire domains of computer vision and natural language processing, and show that a large fraction of benchmarks quickly trends towards near-saturation, that many benchmarks fail to find widespread utilization, and that benchmark performance gains for different AI tasks are prone to unforeseen bursts. We analyze attributes associated with benchmark popularity, and conclude that future benchmarks should emphasize versatility, breadth and real-world utility.


Subjects
Artificial Intelligence, Benchmarking, Benchmarking/methods, Ecosystem, Physical Phenomena
3.
PLoS One ; 17(6): e0268534, 2022.
Article in English | MEDLINE | ID: mdl-35675343

ABSTRACT

BACKGROUND: The clinical implementation of pharmacogenomics (PGx) could be one of the first milestones towards realizing personalized medicine in routine care. However, its widespread adoption requires the availability of suitable clinical decision support (CDS) systems, which is often impeded by the fragmentation or absence of adequate health IT infrastructures. We report results of CDS implementation in the large-scale European research project Ubiquitous Pharmacogenomics (U-PGx), in which PGx CDS was rolled out and evaluated across more than 15 clinical sites in the Netherlands, Spain, Slovenia, Italy, Greece, United Kingdom and Austria, covering a wide variety of healthcare settings. METHODS: We evaluated the CDS implementation process through qualitative and quantitative process indicators. Quantitative indicators included statistics on generated PGx reports, median time from sample upload until report delivery and statistics on report retrievals via the mobile-based CDS tool. Adoption of different CDS tools, uptake and usability were further investigated through a user survey among healthcare providers. Results of a risk assessment conducted prior to the implementation process were retrospectively analyzed and compared to the difficulties actually encountered and their impact. RESULTS: As of March 2021, personalized PGx reports were produced from 6884 genotyped samples with a median delivery time of twenty minutes. Out of 131 invited healthcare providers, 65 completed the questionnaire (response rate: 49.6%). Overall satisfaction rates with the different CDS tools varied between 63.6% and 85.2% per tool. Delays in implementation were caused by challenges including institutional factors and complexities in the development of required tools and reference data resources, such as genotype-phenotype mappings.
CONCLUSIONS: We demonstrated the feasibility of implementing a standardized PGx decision support solution in a multinational, multi-language and multi-center setting. Remaining challenges for future wide-scale roll-out include the harmonization of existing PGx information in guidelines and drug labels, the need for strategies to lower the barrier of PGx CDS adoption for healthcare institutions and providers, and easier compliance with regulatory and legal frameworks.


Subjects
Clinical Decision Support Systems, Pharmacogenetics, Pharmacogenetics/methods, Precision Medicine/methods, Retrospective Studies, Software
4.
Sci Data ; 9(1): 322, 2022 06 17.
Article in English | MEDLINE | ID: mdl-35715466

ABSTRACT

Research in artificial intelligence (AI) is addressing a growing number of tasks through a rapidly growing number of models and methodologies. This makes it difficult to keep track of where novel AI methods are successfully - or still unsuccessfully - applied, how progress is measured, how different advances might synergize with each other, and how future research should be prioritized. To help address these issues, we created the Intelligence Task Ontology and Knowledge Graph (ITO), a comprehensive, richly structured and manually curated resource on artificial intelligence tasks, benchmark results and performance metrics. The current version of ITO contains 685,560 edges, 1,100 classes representing AI processes and 1,995 properties representing performance metrics. The primary goal of ITO is to enable analyses of the global landscape of AI tasks and capabilities. ITO is based on technologies that allow for easy integration and enrichment with external data, automated inference and continuous, collaborative expert curation of underlying ontological models. We make the ITO dataset and a collection of Jupyter notebooks utilizing ITO openly available.

5.
Bioinformatics ; 38(8): 2371-2373, 2022 04 12.
Article in English | MEDLINE | ID: mdl-35139158

ABSTRACT

SUMMARY: Machine learning algorithms for link prediction can be valuable tools for hypothesis generation. However, many current algorithms are black boxes or lack good user interfaces that could facilitate insight into why predictions are made. We present LinkExplorer, a software suite for predicting, explaining and exploring links in large biomedical knowledge graphs. LinkExplorer integrates our novel, rule-based link prediction engine SAFRAN, which was recently shown to outcompete other explainable algorithms and established black-box algorithms. Here, we demonstrate highly competitive evaluation results of our algorithm on multiple large biomedical knowledge graphs, and release a web interface that allows for interactive and intuitive exploration of predicted links and their explanations. AVAILABILITY AND IMPLEMENTATION: A publicly hosted instance, source code and further documentation can be found at https://github.com/OpenBioLink/Explorer. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms, Automated Pattern Recognition, Software, Machine Learning, Documentation
6.
Bioinformatics ; 36(13): 4097-4098, 2020 07 01.
Article in English | MEDLINE | ID: mdl-32339214

ABSTRACT

SUMMARY: Recently, novel machine-learning algorithms have shown potential for predicting undiscovered links in biomedical knowledge networks. However, dedicated benchmarks for measuring algorithmic progress have not yet emerged. With OpenBioLink, we introduce a large-scale, high-quality and highly challenging biomedical link prediction benchmark to transparently and reproducibly evaluate such algorithms. Furthermore, we present preliminary baseline evaluation results. AVAILABILITY AND IMPLEMENTATION: Source code and data are openly available at https://github.com/OpenBioLink/OpenBioLink. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Benchmarking, Software, Algorithms, Machine Learning
7.
Stud Health Technol Inform ; 260: 226-233, 2019.
Article in English | MEDLINE | ID: mdl-31118342

ABSTRACT

BACKGROUND: Reuse of EHR data for selecting patients who are eligible for clinical research can substantially improve the recruitment process. ART-DECOR is an open-source tool that is commonly used to design and publish HL7 V3 templates of national (e.g. ELGA) and international EHR initiatives. OBJECTIVES: Extend ART-DECOR to allow the definition of criteria that may be used for patient selection. METHODS: Using the native ART-DECOR development framework we extended existing ART-DECOR template associations by allowing conditions to be formulated. RESULTS: An editor for the specification of conditions was implemented. The resulting criteria are internally translated to XPath expressions and can be immediately applied to CDA documents. As a prototypical application of our approach we implemented a "Trial Criteria Evaluator" tool that allows trial eligibility criteria to be composed of our ART-DECOR criteria and checked against a patient's CDA documents. CONCLUSION: Because they refer to HL7 templates, our criteria can be applied to documents of national EHR systems such as ELGA and thereby reach a broad patient cohort. Implementing our approach within ART-DECOR facilitates its reuse and enhancement by other researchers.


Subjects
Electronic Health Records, Patient Selection, Controlled Vocabulary, Delivery of Health Care, Humans
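The idea in record 7 of expressing eligibility criteria as XPath expressions and applying them to CDA documents can be sketched roughly as follows. This is not ART-DECOR's implementation or the "Trial Criteria Evaluator" itself: the document fragment, the diagnosis code, and the function names are assumptions made for illustration, and only the limited XPath subset of Python's standard library is used.

```python
import xml.etree.ElementTree as ET

# HL7 V3 / CDA elements live in this XML namespace.
NS = {"hl7": "urn:hl7-org:v3"}

# A heavily simplified, hypothetical CDA-like fragment (not a valid CDA document).
CDA_SNIPPET = """<ClinicalDocument xmlns="urn:hl7-org:v3">
  <component><structuredBody><component><section>
    <entry><observation classCode="OBS" moodCode="EVN">
      <code code="E11"/>
    </observation></entry>
  </section></component></structuredBody></component>
</ClinicalDocument>"""


def meets_criterion(xml_text: str, xpath: str) -> bool:
    """Return True if the XPath expression matches at least one node."""
    root = ET.fromstring(xml_text)
    return len(root.findall(xpath, NS)) > 0


def is_eligible(xml_text: str, criteria: list) -> bool:
    """A patient document is eligible only if every criterion matches."""
    return all(meets_criterion(xml_text, xp) for xp in criteria)


if __name__ == "__main__":
    # Illustrative criterion: an observation carrying the code "E11".
    criteria = [".//hl7:observation/hl7:code[@code='E11']"]
    print(is_eligible(CDA_SNIPPET, criteria))
```

Combining criteria with `all` models a conjunction of inclusion criteria; exclusion criteria or disjunctions would need additional logic that the abstract does not specify.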