Results 1 - 20 of 60
1.
Comput Biol Med; 178: 108796, 2024 Jun 21.
Article in English | MEDLINE | ID: mdl-38909448

ABSTRACT

BACKGROUND: Computational simulation of biological processes can be a valuable tool for accelerating biomedical research, but usually requires extensive domain knowledge and manual adaptation. Large language models (LLMs) such as GPT-4 have proven surprisingly successful for a wide range of tasks. This study provides proof-of-concept for the use of GPT-4 as a versatile simulator of biological systems. METHODS: We introduce SimulateGPT, a proof-of-concept for knowledge-driven simulation across levels of biological organization through structured prompting of GPT-4. We benchmarked our approach against direct GPT-4 inference in blinded qualitative evaluations by domain experts in four scenarios and in two quantitative scenarios with experimental ground truth. The qualitative scenarios included mouse experiments with known outcomes and treatment decision support in sepsis. The quantitative scenarios included prediction of gene essentiality in cancer cells and progression-free survival in cancer patients. RESULTS: In qualitative experiments, biomedical scientists rated SimulateGPT's predictions favorably over direct GPT-4 inference. In quantitative experiments, SimulateGPT substantially improved classification accuracy for predicting the essentiality of individual genes and increased correlation coefficients and precision in the regression task of predicting progression-free survival. CONCLUSION: This proof-of-concept study suggests that LLMs may enable a new class of biomedical simulators. Such text-based simulations appear well suited for modeling and understanding complex living systems that are difficult to describe with physics-based first-principles simulations, but for which extensive knowledge is available as written text. Finally, we propose several directions for further development of LLM-based biomedical simulators, including augmentation through web search retrieval, integrated mathematical modeling, and fine-tuning on experimental data.
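The structured-prompting idea can be illustrated with a minimal sketch; the prompt wording and the `ask_llm` wrapper below are illustrative assumptions, not the SimulateGPT templates published with the paper.

```python
# Minimal sketch of knowledge-driven simulation via structured prompting.
# The prompt text and the `ask_llm` callable are illustrative assumptions.

SIMULATION_PROMPT = """You are a simulator of biological systems.
Given the experimental setup below, simulate the outcome step by step.
For each step, state the biological entities involved, the mechanism,
and the likelihood of the transition, then give a final predicted outcome.

Experimental setup:
{setup}
"""

def simulate(setup: str, ask_llm) -> str:
    """Run one text-based simulation; `ask_llm` wraps any chat-completion API."""
    return ask_llm(SIMULATION_PROMPT.format(setup=setup))

# Example usage (any LLM client can be plugged in as `ask_llm`):
# result = simulate("Knockout of gene X in a murine melanoma model ...", ask_llm=my_client)
```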

2.
PeerJ Comput Sci; 10: e1999, 2024.
Article in English | MEDLINE | ID: mdl-38855241

ABSTRACT

Emergent chain-of-thought (CoT) reasoning capabilities promise to improve the performance and explainability of large language models (LLMs). However, uncertainties remain about how reasoning strategies formulated for previous model generations generalize to new model generations and different datasets. In this small-scale study, we compare different reasoning strategies induced by zero-shot prompting across six recently released LLMs (davinci-002, davinci-003, GPT-3.5-turbo, GPT-4, Flan-T5-xxl and Cohere command-xlarge). We test them on six question-answering datasets that require real-world knowledge application and logical verbal reasoning, including datasets from scientific and medical domains. Our findings demonstrate that while some variations in effectiveness occur, gains from CoT reasoning strategies remain robust across different models and datasets. GPT-4 benefits the most from current state-of-the-art reasoning strategies and performs best when applying a prompt previously identified through automated prompt discovery.
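For readers unfamiliar with zero-shot CoT prompting, a minimal sketch of the general recipe follows; the trigger phrase is a commonly used example, and the two-turn answer extraction is a simplification rather than a reproduction of the specific strategies benchmarked in the study.

```python
# Minimal sketch of zero-shot chain-of-thought prompting: a reasoning trigger
# is appended to the question, and the answer is extracted in a second turn.
# The trigger phrase is only the commonly used example, not the study's prompts.

REASONING_TRIGGER = "Let's think step by step."

def zero_shot_cot(question: str, ask_llm) -> tuple[str, str]:
    """Return (reasoning, answer); `ask_llm` wraps any completion API."""
    reasoning = ask_llm(f"{question}\n\n{REASONING_TRIGGER}")
    answer = ask_llm(
        f"{question}\n\n{REASONING_TRIGGER}\n{reasoning}\n\nTherefore, the answer is"
    )
    return reasoning, answer
```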

3.
Sci Data; 10(1): 528, 2023 Aug 8.
Article in English | MEDLINE | ID: mdl-37553439

ABSTRACT

Large language models (LLMs) such as GPT-4 have recently demonstrated impressive results across a wide range of tasks. LLMs are still limited, however, in that they frequently fail at complex reasoning, their reasoning processes are opaque, they are prone to 'hallucinate' facts, and there are concerns about their underlying biases. Letting models verbalize reasoning steps as natural language, a technique known as chain-of-thought prompting, has recently been proposed as a way to address some of these issues. Here we present ThoughtSource, a meta-dataset and software library for chain-of-thought (CoT) reasoning. The goal of ThoughtSource is to improve future artificial intelligence systems by facilitating qualitative understanding of CoTs, enabling empirical evaluations, and providing training data. This first release of ThoughtSource integrates seven scientific/medical, three general-domain and five math word question answering datasets.
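To make the idea of a CoT meta-dataset concrete, a minimal sketch of one possible record layout follows; the field names are assumptions for illustration and do not reproduce the actual ThoughtSource schema.

```python
# Illustrative record layout for a chain-of-thought QA item. The field names
# are assumptions for this sketch and do not reproduce the ThoughtSource schema.
from dataclasses import dataclass

@dataclass
class CoTItem:
    question: str
    choices: list[str]
    chain_of_thought: list[str]   # verbalized reasoning steps
    answer: str
    source_dataset: str           # e.g. a medical or math QA benchmark

item = CoTItem(
    question="Which organ is primarily affected in hepatitis?",
    choices=["Liver", "Kidney", "Lung", "Heart"],
    chain_of_thought=["Hepatitis denotes inflammation of the liver."],
    answer="Liver",
    source_dataset="toy_medical_qa",
)
```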

4.
Lancet; 401(10374): 347-356, 2023 Feb 4.
Article in English | MEDLINE | ID: mdl-36739136

ABSTRACT

BACKGROUND: The benefit of pharmacogenetic testing before starting drug therapy has been well documented for several single gene-drug combinations. However, the clinical utility of a pre-emptive genotyping strategy using a pharmacogenetic panel has not been rigorously assessed. METHODS: We conducted an open-label, multicentre, controlled, cluster-randomised, crossover implementation study of a 12-gene pharmacogenetic panel in 18 hospitals, nine community health centres, and 28 community pharmacies in seven European countries (Austria, Greece, Italy, the Netherlands, Slovenia, Spain, and the UK). Patients aged 18 years or older receiving a first prescription for a drug clinically recommended in the guidelines of the Dutch Pharmacogenetics Working Group (ie, the index drug) as part of routine care were eligible for inclusion. Exclusion criteria included previous genetic testing for a gene relevant to the index drug, a planned duration of treatment of less than 7 consecutive days, and severe renal or liver insufficiency. All patients gave written informed consent before taking part in the study. Participants were genotyped for 50 germline variants in 12 genes, and those with an actionable variant (ie, a drug-gene interaction test result for which the Dutch Pharmacogenetics Working Group [DPWG] recommended a change to standard-of-care drug treatment) were treated according to DPWG recommendations. Patients in the control group received standard treatment. To prepare clinicians for pre-emptive pharmacogenetic testing, local teams were educated during a site-initiation visit and online educational material was made available. The primary outcome was the occurrence of clinically relevant adverse drug reactions within the 12-week follow-up period. Analyses were irrespective of patient adherence to the DPWG guidelines. The primary analysis was done using a gatekeeping analysis, in which outcomes in people with an actionable drug-gene interaction in the study group versus the control group were compared, and only if the difference was statistically significant was an analysis done that included all of the patients in the study. Outcomes were compared between the study and control groups, both for patients with an actionable drug-gene interaction test result (ie, a result for which the DPWG recommended a change to standard-of-care drug treatment) and for all patients who received at least one dose of index drug. The safety analysis included all participants who received at least one dose of a study drug. This study is registered with ClinicalTrials.gov, NCT03093818 and is closed to new participants. FINDINGS: Between March 7, 2017, and June 30, 2020, 41 696 patients were assessed for eligibility and 6944 (51·4% female, 48·6% male; 97·7% self-reported European, Mediterranean, or Middle Eastern ethnicity) were enrolled and assigned to receive genotype-guided drug treatment (n=3342) or standard care (n=3602). 99 patients (52 [1·6%] of the study group and 47 [1·3%] of the control group) withdrew consent after group assignment. 652 participants (367 [11·0%] in the study group and 285 [7·9%] in the control group) were lost to follow-up. In patients with an actionable test result for the index drug (n=1558), a clinically relevant adverse drug reaction occurred in 152 (21·0%) of 725 patients in the study group and 231 (27·7%) of 833 patients in the control group (odds ratio [OR] 0·70 [95% CI 0·54-0·91]; p=0·0075), whereas for all patients, the incidence was 628 (21·5%) of 2923 patients in the study group and 934 (28·6%) of 3270 patients in the control group (OR 0·70 [95% CI 0·61-0·79]; p<0·0001). INTERPRETATION: Genotype-guided treatment using a 12-gene pharmacogenetic panel significantly reduced the incidence of clinically relevant adverse drug reactions and was feasible across diverse European health-care system organisations and settings. Large-scale implementation could help to make drug therapy increasingly safe. FUNDING: European Union Horizon 2020.
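The headline odds ratio for the actionable-result comparison can be checked directly from the counts reported above; a quick arithmetic sketch:

```python
# Reproduce the odds ratio for patients with an actionable test result from the
# counts in the abstract: 152/725 events (study) vs 231/833 events (control).
events_study, n_study = 152, 725
events_control, n_control = 231, 833

odds_study = events_study / (n_study - events_study)          # 152 / 573
odds_control = events_control / (n_control - events_control)  # 231 / 602
print(f"OR = {odds_study / odds_control:.2f}")
# ~0.69 from the raw counts, close to the reported OR of 0.70 (the published
# estimate comes from the trial's own statistical analysis).
```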


Subjects
Drug-Related Side Effects and Adverse Reactions, Pharmacogenetics, Humans, Male, Female, Genetic Testing, Genotype, Drug Combinations, Drug-Related Side Effects and Adverse Reactions/prevention & control, Treatment Outcome
5.
J Biomed Inform; 137: 104274, 2023 Jan.
Article in English | MEDLINE | ID: mdl-36539106

ABSTRACT

Publicly accessible benchmarks that allow for assessing and comparing model performances are important drivers of progress in artificial intelligence (AI). While recent advances in AI capabilities hold the potential to transform medical practice by assisting and augmenting the cognitive processes of healthcare professionals, the coverage of clinically relevant tasks by AI benchmarks is largely unclear. Furthermore, there is a lack of systematized meta-information that allows clinical AI researchers to quickly determine accessibility, scope, content and other characteristics of datasets and benchmark datasets relevant to the clinical domain. To address these issues, we curated and released a comprehensive catalogue of datasets and benchmarks pertaining to the broad domain of clinical and biomedical natural language processing (NLP), based on a systematic review of the literature and online resources. A total of 450 NLP datasets were manually systematized and annotated with rich metadata, such as targeted tasks, clinical applicability, data types, performance metrics, accessibility and licensing information, and availability of data splits. We then compared tasks covered by AI benchmark datasets with relevant tasks that medical practitioners reported as highly desirable targets for automation in a previous empirical study. Our analysis indicates that AI benchmarks of direct clinical relevance are scarce and fail to cover most work activities that clinicians want to see addressed. In particular, tasks associated with routine documentation and patient data administration workflows are not represented despite significant associated workloads. Thus, currently available AI benchmarks are poorly aligned with desired targets for AI automation in clinical settings, and novel benchmarks should be created to fill these gaps.


Subjects
Artificial Intelligence, Benchmarking, Humans, Natural Language Processing
6.
Nat Commun; 13(1): 6793, 2022 Nov 10.
Article in English | MEDLINE | ID: mdl-36357391

ABSTRACT

Benchmarks are crucial to measuring and steering progress in artificial intelligence (AI). However, recent studies raised concerns over the state of AI benchmarking, reporting issues such as benchmark overfitting, benchmark saturation and increasing centralization of benchmark dataset creation. To facilitate monitoring of the health of the AI benchmarking ecosystem, we introduce methodologies for creating condensed maps of the global dynamics of benchmark creation and saturation. We curate data for 3765 benchmarks covering the entire domains of computer vision and natural language processing, and show that a large fraction of benchmarks quickly trends towards near-saturation, that many benchmarks fail to find widespread utilization, and that benchmark performance gains for different AI tasks are prone to unforeseen bursts. We analyze attributes associated with benchmark popularity, and conclude that future benchmarks should emphasize versatility, breadth and real-world utility.
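As an illustration of the saturation dynamics discussed above, a toy sketch follows; the performance ceiling, threshold and data are invented for illustration and are not the paper's methodology.

```python
# Toy sketch of tracking benchmark saturation: the state of the art (SOTA)
# over time relative to an assumed performance ceiling. The ceiling and the
# saturation threshold are illustrative assumptions, not the paper's method.
results = [  # (year, reported score) for one hypothetical benchmark
    (2016, 72.1), (2017, 80.4), (2018, 86.9), (2019, 90.2), (2020, 91.0),
]
ceiling = 92.0          # assumed maximum attainable score (e.g. a human baseline)
threshold = 0.95        # fraction of the ceiling treated as "near-saturated"

sota = 0.0
for year, score in results:
    sota = max(sota, score)
    status = "near-saturated" if sota >= threshold * ceiling else "active"
    print(f"{year}: SOTA={sota:.1f} ({sota / ceiling:.0%} of ceiling, {status})")
```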


Subjects
Artificial Intelligence, Benchmarking, Benchmarking/methods, Ecosystem, Physical Phenomena
7.
Neural Netw; 154: 310-322, 2022 Oct.
Article in English | MEDLINE | ID: mdl-35930855

ABSTRACT

Computational sleep scoring from multimodal neurophysiological time-series (polysomnography, PSG) has achieved impressive clinical success. Models that use only a single electroencephalographic (EEG) channel from PSG have not yet received the same clinical recognition, since they lack Rapid Eye Movement (REM) scoring quality. The question of whether this lack can be remedied at all remains an important one. We conjecture that the predominant Long Short-Term Memory (LSTM) models do not adequately represent distant REM EEG segments (termed epochs), since LSTMs compress these to a fixed-size vector from separate past and future sequences. To this end, we introduce the EEG representation model ENGELBERT (electroEncephaloGraphic Epoch Local Bidirectional Encoder Representations from Transformer). It jointly attends to multiple EEG epochs from both past and future. Compared to typical token sequences in language, for which attention models were originally conceived, overnight EEG sequences easily span more than 1000 30-second epochs. Local attention on overlapping windows reduces the critical quadratic computational complexity to linear, enabling versatile sub-one-hour to all-day scoring. ENGELBERT is at least one order of magnitude smaller than established LSTM models and is easy to train from scratch in a single phase. It surpassed state-of-the-art macro F1-scores in three single-EEG sleep scoring experiments. REM F1-scores were pushed to at least 86%. ENGELBERT virtually closed the F1-score gap to PSG-based methods from 4-5 percentage points (pp) to less than 1 pp.
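The complexity argument can be made concrete with a small sketch of a banded attention mask; the window size and masking style are illustrative assumptions, not ENGELBERT's exact attention scheme.

```python
# Sketch of local (windowed) self-attention over a long sequence of EEG epochs:
# each epoch attends only to neighbours within +/- `window` positions, so the
# number of attended pairs grows linearly with sequence length instead of
# quadratically. Window size and masking style are illustrative assumptions.
import numpy as np

def local_attention_mask(seq_len: int, window: int) -> np.ndarray:
    idx = np.arange(seq_len)
    return np.abs(idx[:, None] - idx[None, :]) <= window  # True = may attend

mask = local_attention_mask(seq_len=1000, window=10)   # ~1000 overnight epochs
print(mask.sum(), "attended pairs vs", 1000 * 1000, "for full attention")
```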


Subjects
Electroencephalography, Sleep Stages, Electroencephalography/methods, Polysomnography/methods, Sleep/physiology, Sleep Stages/physiology, Sleep, REM/physiology
8.
PLoS One; 17(6): e0268534, 2022.
Article in English | MEDLINE | ID: mdl-35675343

ABSTRACT

BACKGROUND: The clinical implementation of pharmacogenomics (PGx) could be one of the first milestones towards realizing personalized medicine in routine care. However, its widespread adoption requires the availability of suitable clinical decision support (CDS) systems, which is often impeded by the fragmentation or absence of adequate health IT infrastructures. We report the results of CDS implementation in the large-scale European research project Ubiquitous Pharmacogenomics (U-PGx), in which PGx CDS was rolled out and evaluated across more than 15 clinical sites in the Netherlands, Spain, Slovenia, Italy, Greece, the United Kingdom and Austria, covering a wide variety of healthcare settings. METHODS: We evaluated the CDS implementation process through qualitative and quantitative process indicators. Quantitative indicators included statistics on generated PGx reports, median time from sample upload until report delivery, and statistics on report retrievals via the mobile-based CDS tool. Adoption of different CDS tools, uptake and usability were further investigated through a user survey among healthcare providers. Results of a risk assessment conducted prior to the implementation process were retrospectively analyzed and compared with the difficulties actually encountered and their impact. RESULTS: As of March 2021, personalized PGx reports were produced from 6884 genotyped samples with a median delivery time of twenty minutes. Out of 131 invited healthcare providers, 65 completed the questionnaire (response rate: 49.6%). Overall satisfaction rates with the different CDS tools varied between 63.6% and 85.2% per tool. Delays in implementation were caused by challenges including institutional factors and complexities in the development of required tools and reference data resources, such as genotype-phenotype mappings. CONCLUSIONS: We demonstrated the feasibility of implementing a standardized PGx decision support solution in a multinational, multi-language and multi-center setting. Remaining challenges for future wide-scale roll-out include the harmonization of existing PGx information in guidelines and drug labels, the need for strategies to lower the barrier of PGx CDS adoption for healthcare institutions and providers, and easier compliance with regulatory and legal frameworks.


Subjects
Decision Support Systems, Clinical, Pharmacogenetics, Pharmacogenetics/methods, Precision Medicine/methods, Retrospective Studies, Software
9.
J Biomed Inform; 132: 104114, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35717011

ABSTRACT

Deep transformer neural network models have improved the predictive accuracy of intelligent text processing systems in the biomedical domain. They have obtained state-of-the-art performance scores on a wide variety of biomedical and clinical Natural Language Processing (NLP) benchmarks. However, the robustness and reliability of these models have been less explored so far. Neural NLP models can be easily fooled by adversarial samples, i.e. minor changes to input that preserve the meaning and understandability of the text but force the NLP system to make erroneous decisions. This raises serious concerns about the security and trustworthiness of biomedical NLP systems, especially when they are intended to be deployed in real-world use cases. We investigated the robustness of several transformer neural language models, i.e. BioBERT, SciBERT, BioMed-RoBERTa, and Bio-ClinicalBERT, on a wide range of biomedical and clinical text processing tasks. We implemented various adversarial attack methods to test the NLP systems in different attack scenarios. Experimental results showed that the biomedical NLP models are sensitive to adversarial samples; their performance dropped on average by 21 and 18.9 absolute percentage points under character-level and word-level adversarial noise, respectively, on Micro-F1, Pearson correlation, and accuracy measures. Conducting extensive adversarial training experiments, we fine-tuned the NLP models on a mixture of clean samples and adversarial inputs. Results showed that adversarial training is an effective defense mechanism against adversarial noise; the models' robustness improved on average by 11.3 absolute percentage points. In addition, the models' performance on clean data increased on average by 2.4 absolute percentage points, demonstrating that adversarial training can boost the generalization abilities of biomedical NLP systems. This study takes an important step towards revealing vulnerabilities of deep neural language models in biomedical NLP applications. It also provides practical and effective strategies to develop secure, trustworthy, and accurate intelligent text processing systems in the biomedical domain.
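To illustrate what character-level adversarial noise looks like, a generic perturbation sketch follows; it is a simple stand-in, not one of the specific attack algorithms evaluated in the paper.

```python
# Generic character-level perturbation of the kind used as adversarial noise:
# swap adjacent characters inside randomly chosen words. This is a simple
# illustration, not one of the attack methods evaluated in the paper.
import random

def char_swap_noise(text: str, rate: float = 0.2, seed: int = 0) -> str:
    rng = random.Random(seed)
    words = text.split()
    for i, w in enumerate(words):
        if len(w) > 3 and rng.random() < rate:
            j = rng.randrange(len(w) - 1)
            words[i] = w[:j] + w[j + 1] + w[j] + w[j + 2:]
    return " ".join(words)

print(char_swap_noise("The patient was started on metformin for type 2 diabetes."))
```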


Subjects
Language, Natural Language Processing, Neural Networks, Computer, Reproducibility of Results
10.
Sci Data; 9(1): 322, 2022 Jun 17.
Article in English | MEDLINE | ID: mdl-35715466

ABSTRACT

Research in artificial intelligence (AI) is addressing a growing number of tasks through a rapidly growing number of models and methodologies. This makes it difficult to keep track of where novel AI methods are successfully - or still unsuccessfully - applied, how progress is measured, how different advances might synergize with each other, and how future research should be prioritized. To help address these issues, we created the Intelligence Task Ontology and Knowledge Graph (ITO), a comprehensive, richly structured and manually curated resource on artificial intelligence tasks, benchmark results and performance metrics. The current version of ITO contains 685,560 edges, 1,100 classes representing AI processes and 1,995 properties representing performance metrics. The primary goal of ITO is to enable analyses of the global landscape of AI tasks and capabilities. ITO is based on technologies that allow for easy integration and enrichment with external data, automated inference and continuous, collaborative expert curation of underlying ontological models. We make the ITO dataset and a collection of Jupyter notebooks utilizing ITO openly available.
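Since ITO is distributed in standard semantic-web formats, it can be explored with off-the-shelf RDF tooling; a minimal sketch follows, in which the file name, serialization format and query pattern are placeholder assumptions rather than ITO's documented identifiers.

```python
# Sketch of exploring an RDF/OWL ontology with rdflib. The file path, the
# assumed RDF/XML serialization, and the query pattern are placeholders; ITO's
# actual classes and properties are documented with the released dataset.
from rdflib import Graph

g = Graph()
g.parse("ito.owl", format="xml")  # placeholder path to the downloaded ontology

query = """
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?cls ?label WHERE {
    ?cls a owl:Class ;
         rdfs:label ?label .
} LIMIT 10
"""
for cls, label in g.query(query):
    print(cls, label)
```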

11.
Bioinformatics; 38(8): 2371-2373, 2022 Apr 12.
Article in English | MEDLINE | ID: mdl-35139158

ABSTRACT

SUMMARY: Machine learning algorithms for link prediction can be valuable tools for hypothesis generation. However, many current algorithms are black boxes or lack good user interfaces that could facilitate insight into why predictions are made. We present LinkExplorer, a software suite for predicting, explaining and exploring links in large biomedical knowledge graphs. LinkExplorer integrates our novel, rule-based link prediction engine SAFRAN, which was recently shown to outcompete other explainable algorithms and established black-box algorithms. Here, we demonstrate highly competitive evaluation results of our algorithm on multiple large biomedical knowledge graphs, and release a web interface that allows for interactive and intuitive exploration of predicted links and their explanations. AVAILABILITY AND IMPLEMENTATION: A publicly hosted instance, source code and further documentation can be found at https://github.com/OpenBioLink/Explorer. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms, Pattern Recognition, Automated, Software, Machine Learning, Documentation
12.
BMC Med Res Methodol; 21(1): 284, 2021 Dec 18.
Article in English | MEDLINE | ID: mdl-34922459

ABSTRACT

BACKGROUND: While machine learning (ML) algorithms may predict cardiovascular outcomes more accurately than statistical models, their result is usually not representable by a transparent formula. Hence, it is often unclear how specific values of predictors lead to the predictions. We aimed to demonstrate with graphical tools how predictor-risk relations in cardiovascular risk prediction models fitted by ML algorithms and by statistical approaches may differ, and how sample size affects the stability of the estimated relations. METHODS: We reanalyzed data from a large registry of 1.5 million participants in a national health screening program. Three data analysts developed analytical strategies to predict cardiovascular events within 1 year from health screening. This was done for the full data set and with gradually reduced sample sizes, and each data analyst followed their favorite modeling approach. Predictor-risk relations were visualized by partial dependence and individual conditional expectation plots. RESULTS: When comparing the modeling algorithms, we found some similarities between these visualizations but also occasional divergence. The smaller the sample size, the more the predictor-risk relation depended on the modeling algorithm used, and sampling variability also played a greater role. Predictive performance was similar if the models were derived from the full data set, whereas smaller sample sizes favored simpler models. CONCLUSION: Predictor-risk relations from ML models may differ from those obtained by statistical models, even with large sample sizes. Hence, predictors may assume different roles in risk prediction models. As long as sample size is sufficient, predictive accuracy is not largely affected by the choice of algorithm.
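Both visualization types named above are available out of the box in scikit-learn; a minimal sketch on synthetic data follows (the model, features and data are placeholders, not the registry analysis).

```python
# Minimal sketch of the two visualization types named above, using scikit-learn:
# partial dependence (average effect) and individual conditional expectation
# (per-subject curves). The model and feature indices are placeholders.
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import PartialDependenceDisplay

X, y = make_classification(n_samples=2000, n_features=6, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

# kind="both" overlays ICE curves on the partial dependence curve
PartialDependenceDisplay.from_estimator(model, X, features=[0, 3], kind="both")
plt.show()
```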


Subjects
Cardiovascular Diseases, Cardiovascular Diseases/diagnosis, Cardiovascular Diseases/epidemiology, Heart Disease Risk Factors, Humans, Machine Learning, Models, Statistical, Risk Factors
13.
J Pers Med; 11(12), 2021 Dec 1.
Article in English | MEDLINE | ID: mdl-34945740

ABSTRACT

AIMS: We tested the hypothesis that artificial intelligence (AI)-powered algorithms applied to cardiac magnetic resonance (CMR) images could detect the patterns characteristic of cardiac amyloidosis (CA). Readers in CMR centers with a low volume of referrals for myocardial storage diseases, or a low CMR volume in general, may overlook CA. In light of the growing prevalence of the disease and emerging therapeutic options, there is an urgent need to avoid misdiagnoses. METHODS AND RESULTS: Using CMR data from 502 patients (CA: n = 82), we trained convolutional neural networks (CNNs) to automatically diagnose patients with CA. We compared the diagnostic accuracy of different state-of-the-art deep learning techniques on common CMR imaging protocols in detecting imaging patterns associated with CA. In a 10-fold cross-validated evaluation, the best-performing fine-tuned CNN achieved an average ROC AUC score of 0.96, with a sensitivity of 94% and a specificity of 90%. CONCLUSIONS: Applying AI to CMR for the diagnosis of CA may set a remarkable milestone towards a fully computational diagnostic path for CA, supporting a complex diagnostic work-up that otherwise requires profound expertise from several disciplines.
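The reported evaluation protocol (10-fold cross-validation, ROC AUC, sensitivity and specificity) can be sketched generically; in the sketch below a simple classifier on synthetic features stands in for the fine-tuned CNN on CMR images, and the approximate class balance is the only detail taken from the abstract.

```python
# Sketch of the evaluation protocol described above: 10-fold cross-validated
# ROC AUC plus sensitivity/specificity at a fixed decision threshold. A simple
# classifier on synthetic features stands in for the fine-tuned CNN.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import StratifiedKFold

X, y = make_classification(n_samples=500, n_features=20, weights=[0.84, 0.16],
                           random_state=0)  # ~16% positives, as in the CA cohort
aucs, sens, spec = [], [], []
for train, test in StratifiedKFold(n_splits=10, shuffle=True, random_state=0).split(X, y):
    prob = LogisticRegression(max_iter=1000).fit(X[train], y[train]).predict_proba(X[test])[:, 1]
    aucs.append(roc_auc_score(y[test], prob))
    tn, fp, fn, tp = confusion_matrix(y[test], (prob >= 0.5).astype(int)).ravel()
    sens.append(tp / (tp + fn))
    spec.append(tn / (tn + fp))

print(f"AUC {np.mean(aucs):.2f}, sensitivity {np.mean(sens):.0%}, specificity {np.mean(spec):.0%}")
```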

14.
IEEE J Biomed Health Inform; 25(8): 3112-3120, 2021 Aug.
Article in English | MEDLINE | ID: mdl-33534720

ABSTRACT

In this paper, we propose a novel method named Biomedical Confident Itemsets Explanation (BioCIE), aiming at post-hoc explanation of black-box machine learning models for biomedical text classification. Using sources of domain knowledge and a confident itemset mining method, BioCIE discretizes the decision space of a black-box into smaller subspaces and extracts semantic relationships between the input text and class labels in different subspaces. Confident itemsets discover how biomedical concepts are related to class labels in the black-box's decision space. BioCIE uses the itemsets to approximate the black-box's behavior for individual predictions. Optimizing fidelity, interpretability, and coverage measures, BioCIE produces class-wise explanations that represent decision boundaries of the black-box. Results of evaluations on various biomedical text classification tasks and black-box models demonstrated that BioCIE can outperform perturbation-based and decision set methods in terms of producing concise, accurate, and interpretable explanations. BioCIE improved the fidelity of instance-wise and class-wise explanations by 11.6% and 7.5%, respectively. It also improved the interpretability of explanations by 8%. BioCIE can be effectively used to explain how a black-box biomedical text classification model semantically relates input texts to class labels. The source code and supplementary material are available at https://github.com/mmoradi-iut/BioCIE.
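The core idea of relating extracted concepts to predicted labels via support and confidence can be sketched in a few lines; the data, thresholds, and restriction to single-concept itemsets are illustrative assumptions and not the BioCIE algorithm itself.

```python
# Toy sketch of the confident-itemset idea: find biomedical concepts whose
# presence in an input is strongly associated (high confidence) with a class
# label predicted by the black-box. Data and thresholds are illustrative only.
from collections import Counter

# (concepts extracted from the input text, label predicted by the black-box)
predictions = [
    ({"chest pain", "troponin"}, "cardiac"),
    ({"chest pain", "cough"}, "respiratory"),
    ({"troponin", "dyspnea"}, "cardiac"),
    ({"cough", "fever"}, "respiratory"),
    ({"troponin"}, "cardiac"),
]
min_confidence = 0.8

concept_counts, pair_counts = Counter(), Counter()
for concepts, label in predictions:
    for c in concepts:
        concept_counts[c] += 1
        pair_counts[(c, label)] += 1

for (concept, label), n in pair_counts.items():
    confidence = n / concept_counts[concept]
    if confidence >= min_confidence:
        print(f"{concept!r} -> {label!r} (confidence {confidence:.2f})")
```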


Subjects
Data Mining, Machine Learning, Humans, Semantics, Software
15.
J Clin Med; 9(5), 2020 May 3.
Article in English | MEDLINE | ID: mdl-32375287

ABSTRACT

(1) Background: Cardiac amyloidosis (CA) is a rare and complex condition with a poor prognosis. While novel therapies improve outcomes, many affected individuals remain undiagnosed due to a lack of awareness among clinicians. This study was undertaken to develop an expert-independent machine learning (ML) prediction model for CA relying on routinely determined laboratory parameters. (2) Methods: In a first step, we developed baseline linear models based on logistic regression. In a second step, we used an ML algorithm based on gradient tree boosting to improve our linear prediction model, and to perform non-linear prediction. Then, we compared the performance of all diagnostic algorithms. All prediction models were developed on a training cohort, consisting of patients with proven CA (positive cases, n = 121) and amyloidosis-unrelated heart failure (HF) patients (negative cases, n = 415). The performance of all prediction models was evaluated on a separate prognostic validation cohort with 37 CA-positive and 124 CA-negative patients. (3) Results: Our best model, based on gradient-boosted ensembles of decision trees, achieved an area under the receiver operating characteristic curve (ROC AUC) score of 0.86, with sensitivity and specificity of 89.2% and 78.2%, respectively. The best linear model had an ROC AUC score of 0.75, with sensitivity and specificity of 84.6% and 71.7%, respectively. (4) Conclusions: Our work demonstrates that ML makes it possible to utilize basic laboratory parameters to generate a distinct CA-related HF profile compared with CA-unrelated HF patients. This proof-of-concept study opens a potential new avenue in the diagnostic workup of CA and may assist physicians in clinical reasoning.
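The two-step modelling strategy (a linear baseline versus gradient tree boosting, compared by ROC AUC on held-out data) can be sketched generically; synthetic features stand in for the routine laboratory parameters, and the class balance is only loosely modelled on the cohorts above.

```python
# Sketch of the comparison described above: a linear baseline (logistic
# regression) versus gradient tree boosting, both scored by ROC AUC on a
# held-out set. Synthetic features stand in for the laboratory parameters.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=700, n_features=12, weights=[0.77, 0.23],
                           random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=1)

for name, model in [("logistic regression", LogisticRegression(max_iter=1000)),
                    ("gradient boosting", GradientBoostingClassifier(random_state=1))]:
    prob = model.fit(X_tr, y_tr).predict_proba(X_te)[:, 1]
    print(f"{name}: ROC AUC = {roc_auc_score(y_te, prob):.2f}")
```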

16.
J Biomed Inform; 107: 103452, 2020 Jul.
Article in English | MEDLINE | ID: mdl-32439479

ABSTRACT

Text summarization tools can help biomedical researchers and clinicians reduce the time and effort needed for acquiring important information from numerous documents. It has been shown that the input text can be modeled as a graph, and important sentences can be selected by identifying central nodes within the graph. However, the effective representation of documents, quantifying the relatedness of sentences, and selecting the most informative sentences are the main challenges that need to be addressed in graph-based summarization. In this paper, we address these challenges in the context of biomedical text summarization. We evaluate the efficacy of a graph-based summarizer using different types of context-free and contextualized embeddings. The word representations are produced by pre-training neural language models on large corpora of biomedical texts. The summarizer models the input text as a graph in which the strength of relations between sentences is measured using the domain-specific vector representations. We also assess the usefulness of different graph ranking techniques in the sentence selection step of our summarization method. Using the common Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metrics, we evaluate the performance of our summarizer against various comparison methods. The results show that when the summarizer utilizes proper combinations of context-free and contextualized embeddings, along with an effective ranking method, it can outperform the other methods. We demonstrate that the best settings of our graph-based summarizer can efficiently improve the informative content of summaries and reduce redundancy.
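A minimal sketch of such a graph-based extractive summarizer follows; the embedding function is a placeholder for the pretrained language models compared in the paper, and PageRank stands in for the various graph ranking techniques assessed.

```python
# Sketch of a graph-based extractive summarizer: sentences become nodes, edge
# weights are cosine similarities between sentence vectors, and a graph ranking
# method (here PageRank) selects the central sentences. The `embed` function is
# a placeholder for the pretrained embedding models compared in the paper.
import networkx as nx
import numpy as np

def summarize(sentences, embed, k=3):
    """`embed` maps a list of sentences to an (n, d) array of vectors."""
    vecs = embed(sentences)
    vecs = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)
    sim = vecs @ vecs.T                      # cosine similarity matrix
    graph = nx.from_numpy_array(sim)         # weighted, undirected graph
    scores = nx.pagerank(graph, weight="weight")
    top = sorted(scores, key=scores.get, reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]  # keep original sentence order
```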


Subjects
Natural Language Processing, Unified Medical Language System, Language, Semantics
17.
Pharmacogenet Genomics; 30(6): 131-144, 2020 Aug.
Article in English | MEDLINE | ID: mdl-32317559

ABSTRACT

OBJECTIVES: Pharmacogenetic panel-based testing represents a new model for precision medicine. A sufficiently powered prospective study assessing the (cost-)effectiveness of a panel-based pharmacogenomics approach to guide pharmacotherapy is lacking. Therefore, the Ubiquitous Pharmacogenomics Consortium initiated the PREemptive Pharmacogenomic testing for prevention of Adverse drug Reactions (PREPARE) study. Here, we provide an overview of considerations made to mitigate multiple methodological challenges that emerged during the design. METHODS: An evaluation of considerations made when designing the PREPARE study across six domains: study aims and design, primary endpoint definition and collection of adverse drug events, inclusion and exclusion criteria, target population, pharmacogenomics intervention strategy, and statistical analyses. RESULTS: Challenges and respective solutions included: (1) defining and operationalizing a composite primary endpoint enabling measurement of the anticipated effect, by including only severe, causal, and drug genotype-associated adverse drug reactions; (2) avoiding overrepresentation of frequently prescribed drugs within the patient sample while maintaining external validity, by capping drugs of enrolment; (3) designing the pharmacogenomics intervention strategy to be applicable across ethnicities and healthcare settings; and (4) designing a statistical analysis plan to avoid dilution of effect by initially excluding patients without a gene-drug interaction in a gatekeeping analysis. CONCLUSION: Our design considerations will enable quantification of the collective clinical utility of a panel of pharmacogenomics-markers within one trial as a proof-of-concept for pharmacogenomics-guided pharmacotherapy across multiple actionable gene-drug interactions. These considerations may prove useful to other investigators aiming to generate evidence for precision medicine.


Subjects
Drug-Related Side Effects and Adverse Reactions/prevention & control, Pharmacogenomic Testing/methods, Precision Medicine/methods, Drug-Related Side Effects and Adverse Reactions/genetics, Evidence-Based Medicine, Humans, Models, Statistical, Practice Guidelines as Topic, Prospective Studies
18.
Bioinformatics; 36(13): 4097-4098, 2020 Jul 1.
Article in English | MEDLINE | ID: mdl-32339214

ABSTRACT

SUMMARY: Recently, novel machine-learning algorithms have shown potential for predicting undiscovered links in biomedical knowledge networks. However, dedicated benchmarks for measuring algorithmic progress have not yet emerged. With OpenBioLink, we introduce a large-scale, high-quality and highly challenging biomedical link prediction benchmark to transparently and reproducibly evaluate such algorithms. Furthermore, we present preliminary baseline evaluation results. AVAILABILITY AND IMPLEMENTATION: Source code and data are openly available at https://github.com/OpenBioLink/OpenBioLink. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
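Link prediction benchmarks of this kind are usually scored with rank-based metrics; a short sketch of hits@k and mean reciprocal rank follows (the ranks are invented for illustration).

```python
# Sketch of the rank-based metrics typically used for knowledge-graph link
# prediction benchmarks: hits@k and mean reciprocal rank (MRR), computed from
# the rank each true link receives among candidate entities.
def hits_at_k(ranks, k):
    return sum(r <= k for r in ranks) / len(ranks)

def mean_reciprocal_rank(ranks):
    return sum(1.0 / r for r in ranks) / len(ranks)

ranks = [1, 3, 2, 15, 1, 120, 7]   # illustrative ranks of the true entities
print("hits@10 =", round(hits_at_k(ranks, 10), 3))
print("MRR     =", round(mean_reciprocal_rank(ranks), 3))
```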


Subjects
Benchmarking, Software, Algorithms, Machine Learning
19.
Comput Methods Programs Biomed; 184: 105117, 2020 Feb.
Article in English | MEDLINE | ID: mdl-31627150

ABSTRACT

BACKGROUND AND OBJECTIVE: Capturing the context of text is a challenging task in biomedical text summarization. The objective of this research is to show how contextualized embeddings produced by a deep bidirectional language model can be utilized to quantify the informative content of sentences in biomedical text summarization. METHODS: We propose a novel summarization method that utilizes contextualized embeddings generated by the Bidirectional Encoder Representations from Transformers (BERT) model, a deep learning model that recently demonstrated state-of-the-art results in several natural language processing tasks. We combine different versions of BERT with a clustering method to identify the most relevant and informative sentences of input documents. Using the ROUGE toolkit, we evaluate the summarizer against several methods previously described in the literature. RESULTS: The summarizer obtains state-of-the-art results and significantly improves the performance of biomedical text summarization in comparison to a set of domain-specific and domain-independent methods. The largest language model not specifically pretrained on biomedical text outperformed other models. However, among language models of the same size, the one further pretrained on biomedical text obtained best results. CONCLUSIONS: We demonstrate that a hybrid system combining a deep bidirectional language model and a clustering method yields state-of-the-art results without requiring the labor-intensive creation of annotated features or knowledge bases, or computationally demanding domain-specific pretraining. This study provides a starting point towards investigating deep contextualized language models for biomedical text summarization.
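A minimal sketch of the clustering-based selection step follows; the checkpoint name is an example, mean pooling is one common way to derive sentence vectors from BERT, and neither necessarily matches the exact configurations compared in the paper.

```python
# Sketch of BERT-embedding + clustering extractive summarization: sentence
# vectors from a BERT-style encoder, k-means clustering, then selection of the
# sentence closest to each cluster centroid. The checkpoint is an example only.
import numpy as np
import torch
from sklearn.cluster import KMeans
from transformers import AutoModel, AutoTokenizer

def bert_cluster_summary(sentences, k=3, checkpoint="bert-base-uncased"):
    tok = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModel.from_pretrained(checkpoint)
    with torch.no_grad():
        enc = tok(sentences, padding=True, truncation=True, return_tensors="pt")
        hidden = model(**enc).last_hidden_state            # (n, seq, dim)
        mask = enc["attention_mask"].unsqueeze(-1)
        vecs = (hidden * mask).sum(1) / mask.sum(1)        # mean pooling
    vecs = vecs.numpy()
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(vecs)
    picked = []
    for c in range(k):
        dists = np.linalg.norm(vecs - km.cluster_centers_[c], axis=1)
        picked.append(int(dists.argmin()))
    return [sentences[i] for i in sorted(set(picked))]     # original order
```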


Subjects
Data Mining/methods, Medical Informatics, Natural Language Processing, Algorithms, Deep Learning, Humans, Semantics, Unified Medical Language System
20.
BMC Bioinformatics; 20(1): 178, 2019 Apr 11.
Article in English | MEDLINE | ID: mdl-30975071

ABSTRACT

BACKGROUND: Neural network based embedding models are receiving significant attention in the field of natural language processing due to their capability to effectively capture semantic information representing words, sentences or even larger text elements in low-dimensional vector space. While current state-of-the-art models for assessing the semantic similarity of textual statements from biomedical publications depend on the availability of laboriously curated ontologies, unsupervised neural embedding models only require large text corpora as input and do not need manual curation. In this study, we investigated the efficacy of current state-of-the-art neural sentence embedding models for semantic similarity estimation of sentences from biomedical literature. We trained different neural embedding models on 1.7 million articles from the PubMed Open Access dataset, and evaluated them based on a biomedical benchmark set containing 100 sentence pairs annotated by human experts and a smaller contradiction subset derived from the original benchmark set. RESULTS: Experimental results showed that, with a Pearson correlation of 0.819, our best unsupervised model based on the Paragraph Vector Distributed Memory algorithm outperforms previous state-of-the-art results achieved on the BIOSSES biomedical benchmark set. Moreover, our proposed supervised model that combines different string-based similarity metrics with a neural embedding model surpasses previous ontology-dependent supervised state-of-the-art approaches in terms of Pearson's r (r = 0.871) on the biomedical benchmark set. In contrast to the promising results for the original benchmark, we found our best models' performance on the smaller contradiction subset to be poor. CONCLUSIONS: In this study, we have highlighted the value of neural network-based models for semantic similarity estimation in the biomedical domain by showing that they can keep up with and even surpass previous state-of-the-art approaches for semantic similarity estimation that depend on the availability of laboriously curated ontologies, when evaluated on a biomedical benchmark set. Capturing contradictions and negations in biomedical sentences, however, emerged as an essential area for further work.
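The evaluation described above amounts to correlating cosine similarities of sentence vectors with expert ratings; a short sketch follows, in which the embedding function is a placeholder for the trained models and the sentence pairs and gold scores are supplied by the benchmark.

```python
# Sketch of benchmark-style evaluation of sentence embeddings: cosine
# similarities between sentence vectors are compared with expert similarity
# ratings via Pearson correlation. `embed` is a placeholder for a trained model.
import numpy as np
from scipy.stats import pearsonr

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def evaluate(pairs, gold_scores, embed):
    """`pairs` is a list of (sentence_a, sentence_b); `gold_scores` the ratings."""
    predicted = [cosine(embed(a), embed(b)) for a, b in pairs]
    r, p_value = pearsonr(predicted, gold_scores)
    return r, p_value
```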


Subjects
Biomedical Research, Models, Theoretical, Semantics, Algorithms, Humans, PubMed