Results 1 - 12 of 12
1.
Sci Adv ; 10(18): eadk3452, 2024 May 03.
Article in English | MEDLINE | ID: mdl-38691601

ABSTRACT

Machine learning (ML) methods are proliferating in scientific research. However, the adoption of these methods has been accompanied by failures of validity, reproducibility, and generalizability. These failures can hinder scientific progress, lead to false consensus around invalid claims, and undermine the credibility of ML-based science. ML methods are often applied and fail in similar ways across disciplines. Motivated by this observation, our goal is to provide clear recommendations for conducting and reporting ML-based science. Drawing from an extensive review of past literature, we present the REFORMS checklist (recommendations for machine-learning-based science). It consists of 32 questions and a paired set of guidelines. REFORMS was developed on the basis of a consensus of 19 researchers across computer science, data science, mathematics, social sciences, and biomedical sciences. REFORMS can serve as a resource for researchers when designing and implementing a study, for referees when reviewing papers, and for journals when enforcing standards for transparency and reproducibility.


Subject(s)
Consensus , Machine Learning , Humans , Reproducibility of Results , Science
2.
Nat Mach Intell ; 4(12): 1174-1184, 2022.
Article in English | MEDLINE | ID: mdl-36567960

ABSTRACT

Medicines based on messenger RNA (mRNA) hold immense potential, as evidenced by their rapid deployment as COVID-19 vaccines. However, worldwide distribution of mRNA molecules has been limited by their thermostability, which is fundamentally limited by the intrinsic instability of RNA molecules to a chemical degradation reaction called in-line hydrolysis. Predicting the degradation of an RNA molecule is a key task in designing more stable RNA-based therapeutics. Here, we describe a crowdsourced machine learning competition ('Stanford OpenVaccine') on Kaggle, involving single-nucleotide resolution measurements on 6,043 diverse 102-130-nucleotide RNA constructs that were themselves solicited through crowdsourcing on the RNA design platform Eterna. The entire experiment was completed in less than 6 months, and 41% of nucleotide-level predictions from the winning model were within experimental error of the ground truth measurement. Furthermore, these models generalized to blindly predicting orthogonal degradation data on much longer mRNA molecules (504-1,588 nucleotides) with improved accuracy compared with previously published models. These results indicate that such models can represent in-line hydrolysis with excellent accuracy, supporting their use for designing stabilized messenger RNAs. The integration of two crowdsourcing platforms, one for dataset creation and another for machine learning, may be fruitful for other urgent problems that demand scientific discovery on rapid timescales.
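The "within experimental error" figure can be read as a simple per-nucleotide hit rate. The sketch below is illustrative only and is not the competition's scoring code: it assumes each measurement comes with its own error estimate and counts a prediction as a hit when it lies no further from the measurement than that error. The function name and the toy numbers are hypothetical.

```python
def fraction_within_error(predicted, measured, error):
    """Share of predictions whose absolute deviation from the measurement
    does not exceed that measurement's error estimate (assumed definition)."""
    if not (len(predicted) == len(measured) == len(error)):
        raise ValueError("all inputs must have the same length")
    if not predicted:
        raise ValueError("inputs must be non-empty")
    hits = sum(
        1 for p, m, e in zip(predicted, measured, error) if abs(p - m) <= e
    )
    return hits / len(predicted)


# Toy values, not data from the study.
measured = [1.0, 2.0, 3.0, 4.0]
error = [0.1, 0.5, 0.2, 0.3]
predicted = [1.05, 2.6, 3.1, 4.0]
share = fraction_within_error(predicted, measured, error)
```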

3.
Arch Gynecol Obstet ; 306(2): 571-575, 2022 08.
Article in English | MEDLINE | ID: mdl-35106643

ABSTRACT

PURPOSE: In this correspondence, we highlight general and domain-specific caveats in the development and validation of prediction models. METHODS: Development and use of the "QUiPP" application, a tool for preterm birth prediction which is supported by the United Kingdom National Health Service, is scrutinised and commented on. RESULTS: We highlight and elaborate ten points which may be perceived to be unclear or potentially misleading. CONCLUSION: While the QUiPP application has high potential, it lacks transparency (on certain aspects related to model development) and proper validation. This precludes transportability to settings with other treatment policies and to other countries where the app has been made publicly available.


Subject(s)
Premature Birth , Cervical Length Measurement , Cervix Uteri/diagnostic imaging , Female , Fibronectins , Humans , Infant, Newborn , Internet , Predictive Value of Tests , Pregnancy , Prospective Studies , State Medicine
4.
ArXiv ; 2021 Oct 14.
Article in English | MEDLINE | ID: mdl-34671698

ABSTRACT

Messenger RNA-based medicines hold immense potential, as evidenced by their rapid deployment as COVID-19 vaccines. However, worldwide distribution of mRNA molecules has been limited by their thermostability, which is fundamentally limited by the intrinsic instability of RNA molecules to a chemical degradation reaction called in-line hydrolysis. Predicting the degradation of an RNA molecule is a key task in designing more stable RNA-based therapeutics. Here, we describe a crowdsourced machine learning competition ("Stanford OpenVaccine") on Kaggle, involving single-nucleotide resolution measurements on 6043 102-130-nucleotide diverse RNA constructs that were themselves solicited through crowdsourcing on the RNA design platform Eterna. The entire experiment was completed in less than 6 months, and 41% of nucleotide-level predictions from the winning model were within experimental error of the ground truth measurement. Furthermore, these models generalized to blindly predicting orthogonal degradation data on much longer mRNA molecules (504-1588 nucleotides) with improved accuracy compared to previously published models. Top teams integrated natural language processing architectures and data augmentation techniques with predictions from previous dynamic programming models for RNA secondary structure. These results indicate that such models are capable of representing in-line hydrolysis with excellent accuracy, supporting their use for designing stabilized messenger RNAs. The integration of two crowdsourcing platforms, one for data set creation and another for machine learning, may be fruitful for other urgent problems that demand scientific discovery on rapid timescales.

5.
Sensors (Basel) ; 21(4)2021 Feb 04.
Article in English | MEDLINE | ID: mdl-33557169

ABSTRACT

In the time series classification domain, shapelets are subsequences that are discriminative of a certain class. It has been shown that classifiers are able to achieve state-of-the-art results by taking the distances from the input time series to different discriminative shapelets as the input. Additionally, these shapelets can be visualized and thus possess an interpretable characteristic, making them appealing in critical domains, where longitudinal data are ubiquitous. In this study, a new paradigm for shapelet discovery is proposed, which is based on evolutionary computation. The advantages of the proposed approach are that: (i) it is gradient-free, which could allow escaping from local optima more easily and supports non-differentiable objectives; (ii) no brute-force search is required, making the algorithm scalable; (iii) the total amount of shapelets and the length of each of these shapelets are evolved jointly with the shapelets themselves, alleviating the need to specify this beforehand; (iv) entire sets are evaluated at once as opposed to single shapelets, which results in smaller final sets with fewer similar shapelets that result in similar predictive performances; and (v) the discovered shapelets do not need to be a subsequence of the input time series. We present the results of the experiments, which validate the enumerated advantages.
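To make the distance-based input concrete, the following is a minimal sketch of a shapelet distance as it is commonly defined: the smallest Euclidean distance between the shapelet and any equal-length window of the series. It is not the authors' implementation and does not cover the evolutionary search itself; the function names are hypothetical.

```python
import math


def shapelet_distance(series, shapelet):
    """Minimum Euclidean distance between `shapelet` and every
    equal-length sliding window of `series`."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best = math.inf
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        dist = math.sqrt(sum((w - s) ** 2 for w, s in zip(window, shapelet)))
        best = min(best, dist)
    return best


def shapelet_transform(series_list, shapelets):
    """One feature row per series, one distance column per shapelet.
    The rows can then be fed to any standard classifier."""
    return [[shapelet_distance(ts, sh) for sh in shapelets] for ts in series_list]
```

Because the distance is defined for any numeric sequence, a shapelet does not have to occur literally in the training data, which is consistent with advantage (v) above.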

6.
Artif Intell Med ; 111: 101987, 2021 01.
Article in English | MEDLINE | ID: mdl-33461687

ABSTRACT

Information extracted from electrohysterography recordings could potentially prove to be an interesting additional source of information to estimate the risk of preterm birth. Recently, a large number of studies have reported near-perfect results in distinguishing between recordings of patients who will deliver at term or preterm using a public resource called the Term/Preterm Electrohysterogram database. However, we argue that these results are overly optimistic because of a methodological flaw. In this work, we focus on one specific type of methodological flaw: applying over-sampling before partitioning the data into mutually exclusive training and testing sets. We show how this causes the results to be biased using two artificial datasets and reproduce results of studies in which this flaw was identified. Moreover, we evaluate the actual impact of over-sampling on predictive performance, when applied prior to data partitioning, using the same methodologies of related studies, to provide a realistic view of these methodologies' generalization capabilities. We make our research reproducible by providing all the code under an open license.
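The ordering problem described in this abstract can be illustrated with a small sketch. It is not the authors' code: it uses plain random duplication of minority samples as a stand-in for whatever over-sampling technique a given study used, and the helper names and toy data are hypothetical. Each sample carries an id so that leakage between the training and test sets can be checked directly.

```python
import random


def train_test_split(samples, test_fraction, rng):
    """Shuffle a copy of `samples` and return (train, test)."""
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def random_oversample(samples, rng):
    """Duplicate samples of the smaller classes at random until every
    class has as many samples as the largest one.
    Each sample is an (id, label) tuple."""
    by_label = {}
    for sample in samples:
        by_label.setdefault(sample[1], []).append(sample)
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced


rng = random.Random(0)
# Toy data: ids 0-9 are the minority class (label 1), ids 10-99 the majority.
data = [(i, 1 if i < 10 else 0) for i in range(100)]

# Correct order: partition first, then over-sample the training part only.
train, test = train_test_split(data, 0.3, rng)
train_balanced = random_oversample(train, rng)
leak_correct = {s[0] for s in train_balanced} & {s[0] for s in test}

# Flawed order: over-sample first, then partition. Copies of the same
# minority sample can now land on both sides of the split.
oversampled = random_oversample(data, rng)
train_bad, test_bad = train_test_split(oversampled, 0.3, rng)
leak_flawed = {s[0] for s in train_bad} & {s[0] for s in test_bad}
```

With the correct order, `leak_correct` is always empty. With the flawed order, `leak_flawed` will in practice contain minority-class ids, meaning the model is tested on samples it has effectively already seen.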


Subject(s)
Premature Birth , Databases, Factual , Female , Humans , Infant, Newborn , Pregnancy
7.
J Strength Cond Res ; 35(12): 3500-3505, 2021 Dec 01.
Article in English | MEDLINE | ID: mdl-31498226

ABSTRACT

Vermeire, KM, Vandewiele, G, Caen, K, Lievens, M, Bourgois, JG, and Boone, J. Training progression in recreational cyclists: no linear dose-response relationship with training load. J Strength Cond Res 35(12): 3500-3505, 2021. The purpose of the study was to assess the relationship between training load (TL) and performance improvement in a homogeneous group of recreational cyclists, training with a self-oriented training plan. Training data from 11 recreational cyclists were collected over a 12-week period. Before and after the training period, subjects underwent a laboratory incremental exercise test with blood lactate measurements to determine the power output associated with the aerobic threshold (PAT) and the anaerobic threshold (PANT), and the maximal power output (PMAX) was also determined. Mean weekly TL (calculated using the training impulse (TRIMP) of Banister, Edwards TRIMP, Lucia TRIMP and the individualized TRIMP) were correlated to the progression in fitness parameters using Pearson Correlation. Training intensity distribution (TID) was also determined (% in zone 1 as ANT). No significant correlations between mean weekly TRIMP values and the improvement on PMAX (r = -0.22 to 0.08), PANT (r = -0.56 to -0.31) and PAT (r = -0.08 to 0.41) were found. The TID was significant in a multiple regression with PANT as dependent variable (y = 0.0088 + 0.1094 × Z1 - 0.2704 × Z2 + 1.0416 × Z3; p = 0.02; R² = 0.62). In conclusion, this study shows that the commonly used TRIMP methods to quantify TL do not show a linear dose-response relationship with performance improvement in recreational cyclists. Furthermore, the study shows that TID might be a key factor to establish a relationship with performance improvement.
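For readers unfamiliar with the statistic, the Pearson correlation used here can be computed as in the sketch below. This is a generic textbook implementation rather than the study's analysis script; the variable names and the five toy values are hypothetical and are not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical mean weekly training load and power improvement per rider.
weekly_load = [310.0, 420.0, 365.0, 500.0, 280.0]
power_gain = [12.0, 8.0, 15.0, 9.0, 11.0]
r = pearson_r(weekly_load, power_gain)
```

A value of r near zero, or a negative value as reported for PANT, is what the absence of a linear dose-response relationship looks like in this statistic.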


Subject(s)
Anaerobic Threshold , Physical Exertion , Exercise , Exercise Test , Heart Rate , Humans
8.
BMC Med Inform Decis Mak ; 20(Suppl 4): 191, 2020 12 14.
Article in English | MEDLINE | ID: mdl-33317504

ABSTRACT

BACKGROUND: Leveraging graphs for machine learning tasks can result in more expressive power as extra information is added to the data by explicitly encoding relations between entities. Knowledge graphs are multi-relational, directed graph representations of domain knowledge. Recently, deep learning-based techniques have been gaining a lot of popularity. They can directly process these types of graphs or learn a low-dimensional numerical representation. While it has been shown empirically that these techniques achieve excellent predictive performances, they lack interpretability. This is of vital importance in applications situated in critical domains, such as health care. METHODS: We present a technique that mines interpretable walks from knowledge graphs that are very informative for a certain classification problem. The walks themselves are of a specific format to allow for the creation of data structures that result in very efficient mining. We combine this mining algorithm with three different approaches in order to classify nodes within a graph. Each of these approaches excels on different dimensions such as explainability, predictive performance and computational runtime. RESULTS: We compare our techniques to well-known state-of-the-art black-box alternatives on four benchmark knowledge graph data sets. Results show that our three presented approaches in combination with the proposed mining algorithm are at least competitive with the black-box alternatives, even often outperforming them, while being interpretable. CONCLUSIONS: The mining of walks is an interesting alternative for node classification in knowledge graphs. As opposed to the current state-of-the-art that uses deep learning techniques, it results in inherently interpretable or transparent models without a sacrifice in terms of predictive performance.


Subject(s)
Algorithms , Pattern Recognition, Automated , Humans , Knowledge , Machine Learning
9.
J Biomed Inform ; 110: 103544, 2020 10.
Article in English | MEDLINE | ID: mdl-32858168

ABSTRACT

This paper contributes to the pursuit of leveraging unstructured medical notes to structured clinical decision making. In particular, we present a pipeline for clinical information extraction from medical notes related to preterm birth, and discuss the main challenges as well as its potential for clinical practice. A large collection of medical notes, created by staff during hospitalizations of patients who were at risk of delivering preterm, was gathered and analyzed. Based on an annotated collection of notes, we trained and evaluated information extraction components to discover clinical entities such as symptoms, events, anatomical sites and procedures, as well as attributes linked to these clinical entities. In a retrospective study, we show that these are highly informative for clinical decision support models that are trained to predict whether delivery is likely to occur within specific time windows, in combination with structured information from electronic health records.


Subject(s)
Premature Birth , Data Mining , Electronic Health Records , Female , Humans , Infant, Newborn , Pregnancy , Premature Birth/epidemiology , Retrospective Studies
10.
J Med Internet Res ; 21(6): e11934, 2019 06 07.
Article in English | MEDLINE | ID: mdl-31237838

ABSTRACT

BACKGROUND: Mobile apps generate vast amounts of user data. In the mobile health (mHealth) domain, researchers are increasingly discovering the opportunities of log data to assess the usage of their mobile apps. To date, however, the analysis of these data is often limited to descriptive statistics. Using data mining techniques, log data can offer significantly deeper insights. OBJECTIVE: The purpose of this study was to assess how Markov Chain and sequence clustering analysis can be used to find meaningful usage patterns of mHealth apps. METHODS: Using the data of a 25-day field trial (n=22) of the Start2Cycle app, an app developed to encourage recreational cycling in adults, a transition matrix between the different pages of the app was composed. From this matrix, a Markov Chain was constructed, enabling intuitive user behavior analysis. RESULTS: Through visual inspection of the transitions, 3 types of app use could be distinguished (route tracking, gamification, and bug reporting). Markov Chain-based sequence clustering was subsequently used to demonstrate how clusters of session types can otherwise be obtained. CONCLUSIONS: Using Markov Chains to assess in-app navigation presents a sound method to evaluate use of mHealth interventions. The insights can be used to evaluate app use and improve user experience.
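The transition matrix mentioned in the methods can be built from logged page sequences roughly as follows. This is a minimal sketch under the assumption that each session is available as an ordered list of page names; the function name and the page names are hypothetical and are not taken from the Start2Cycle logs.

```python
from collections import defaultdict


def transition_matrix(sessions):
    """Estimate first-order Markov transition probabilities from
    sessions, where each session is an ordered list of page names.
    Returns {source_page: {target_page: probability}}."""
    counts = defaultdict(lambda: defaultdict(int))
    for pages in sessions:
        # Count every consecutive (source, target) pair in the session.
        for src, dst in zip(pages, pages[1:]):
            counts[src][dst] += 1
    matrix = {}
    for src, row in counts.items():
        total = sum(row.values())
        matrix[src] = {dst: n / total for dst, n in row.items()}
    return matrix


# Hypothetical sessions for illustration.
sessions = [
    ["home", "route", "home"],
    ["home", "badges", "home", "route"],
]
probs = transition_matrix(sessions)
```

Each row of `probs` sums to 1, so it can be drawn directly as the outgoing edges of a Markov chain diagram for visual inspection.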


Subject(s)
Data Mining/methods , Markov Chains , Mobile Applications/statistics & numerical data , Telemedicine/methods , Female , Humans , Male , Middle Aged
11.
Int J Sports Physiol Perform ; 14(6): 841-846, 2019 07 01.
Article in English | MEDLINE | ID: mdl-30569767

ABSTRACT

PURPOSE: To predict the session rating of perceived exertion (sRPE) in soccer and determine its main predictive indicators. METHODS: A total of 70 external-load indicators (ELIs), internal-load indicators, individual characteristics, and supplementary variables were used to build a predictive model. RESULTS: The analysis using gradient-boosting machines showed a mean absolute error of 0.67 (0.09) arbitrary units (AU) and a root-mean-square error of 0.93 (0.16) AU. ELIs were found to be the strongest predictors of the sRPE, accounting for 61.5% of the total normalized importance (NI), with total distance as the strongest predictor. The included internal-load indicators and individual characteristics accounted only for 1.0% and 4.5%, respectively, of the total NI. Predictive accuracy improved when including supplementary variables such as group-based sRPE predictions (10.5% of NI), individual deviation variables (5.8% of NI), and individual player markers (17.0% of NI). CONCLUSIONS: The results showed that the sRPE can be predicted quite accurately using only a relatively limited number of training observations. ELIs are the strongest predictors of the sRPE. However, it is useful to include a broad range of variables other than ELIs, because the accumulated importance of these variables accounts for a reasonable component of the total NI. Applications resulting from predictive modeling of the sRPE can help coaching staff plan, monitor, and evaluate both the external and internal training load.
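The two error measures reported in the results are standard and can be computed as in the sketch below. It is a generic illustration, not the study's evaluation code, and the example ratings are made up.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_square_error(y_true, y_pred):
    """Square root of the average squared difference; penalizes
    large errors more heavily than the mean absolute error."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Made-up observed and predicted sRPE values in arbitrary units.
observed = [3.0, 5.0, 7.0]
predicted = [4.0, 5.0, 5.0]
mae = mean_absolute_error(observed, predicted)
rmse = root_mean_square_error(observed, predicted)
```

The root-mean-square error is never smaller than the mean absolute error on the same predictions, which matches the ordering of the two figures in the abstract.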


Subject(s)
Physical Exertion , Soccer , Workload , Adult , Humans , Models, Theoretical , Physical Conditioning, Human , Young Adult
12.
BMC Med Inform Decis Mak ; 18(1): 98, 2018 11 13.
Article in English | MEDLINE | ID: mdl-30424769

ABSTRACT

BACKGROUND: Headache disorders are an important health burden, having a large health-economic impact worldwide. Current treatment & follow-up processes are often archaic, creating opportunities for computer-aided and decision support systems to increase their efficiency. Existing systems are mostly completely data-driven, and the underlying models are black boxes, which harms interpretability and transparency, both key factors for deployment in a clinical setting. METHODS: In this paper, a decision support system is proposed, composed of three components: (i) a cross-platform mobile application to capture the required data from patients to formulate a diagnosis, (ii) an automated diagnosis support module that generates an interpretable decision tree, based on data semantically annotated with expert knowledge, in order to support physicians in formulating the correct diagnosis and (iii) a web application such that the physician can efficiently interpret captured data and learned insights by means of visualizations. RESULTS: We show that decision tree induction techniques achieve competitive accuracy rates, compared to other black- and white-box techniques, on a publicly available dataset, referred to as migbase. Migbase contains aggregated information of headache attacks from 849 patients. Each sample is labeled with one of three possible primary headache disorders. We demonstrate that balancing the dataset using prior expert knowledge reduces the classification error by more than 10%, a statistically significant reduction (p ≤ 0.05). Furthermore, we achieve high accuracy rates by using features extracted using the Weisfeiler-Lehman kernel, which is completely unsupervised. This makes it an ideal approach to solve a potential cold start problem. CONCLUSION: Decision trees are the perfect candidate for the automated diagnosis support module. They achieve predictive performances competitive with other techniques on the migbase dataset and are, foremost, completely interpretable. Moreover, the incorporation of prior knowledge increases both the predictive performance and the transparency of the resulting predictive model on the studied dataset.


Subject(s)
Decision Support Systems, Clinical , Headache Disorders/diagnosis , Decision Trees , Expert Systems , Follow-Up Studies , Humans , Software