Results 1 - 4 of 4
1.
JACC Adv ; 2(1)2023 Jan.
Article in English | MEDLINE | ID: mdl-36875009

ABSTRACT

BACKGROUND: Infants with single ventricle heart disease (SVHD) experience morbidity related to pulmonary vascular inadequacy. Metabolomic analysis involves a systems biology approach to identifying novel biomarkers and pathways in complex diseases. The metabolome of infants with SVHD is not well understood, and no prior study has evaluated the relationship between serum metabolite patterns and pulmonary vascular readiness for staged SVHD palliation. OBJECTIVES: The purpose of this study was to evaluate the circulating metabolome of interstage infants with SVHD and determine whether metabolite levels were associated with pulmonary vascular inadequacy. METHODS: This was a prospective cohort study of 52 infants with SVHD undergoing Stage 2 palliation and 48 healthy infants. Targeted metabolomic phenotyping (175 metabolites) was performed by tandem mass spectrometry on SVHD pre-Stage 2, post-Stage 2, and control serum samples. Clinical variables were extracted from the medical record. RESULTS: Random forest analysis readily distinguished cases from controls and preoperative from postoperative samples. Seventy-four of 175 metabolites differed between SVHD and controls. Twenty-seven of 39 metabolic pathways were altered, including pentose phosphate and arginine metabolism. Seventy-one metabolites differed in SVHD patients between timepoints. Thirty-three of 39 pathways were altered postoperatively, including arginine and tryptophan metabolism. We found trends toward increased preoperative methionine metabolites in patients with higher pulmonary vascular resistance and higher postoperative tryptophan metabolites in patients with greater postoperative hypoxemia. CONCLUSIONS: The circulating metabolome of interstage SVHD infants differs significantly from controls and is further disrupted after Stage 2. Several metabolites showed trends toward association with adverse outcomes. Metabolic dysregulation may be an important factor in early SVHD pathobiology.

2.
BMC Bioinformatics ; 23(1): 179, 2022 May 16.
Article in English | MEDLINE | ID: mdl-35578165

ABSTRACT

When analyzing large datasets from high-throughput technologies, researchers often encounter missing quantitative measurements, which are particularly frequent in metabolomics datasets. In metabolomics, the comprehensive profiling of metabolite abundances, measurements are typically made using mass spectrometry technologies that often introduce missingness via multiple mechanisms: (1) the metabolite signal may be smaller than the instrument limit of detection; (2) the conditions under which the data are collected and processed may lead to missing values; (3) missing values can be introduced randomly. Missingness resulting from mechanism (1) would be classified as Missing Not At Random (MNAR), that from mechanism (2) as Missing At Random (MAR), and that from mechanism (3) as Missing Completely At Random (MCAR). Two common approaches for handling missing data are the following: (1) omit missing data from the analysis; (2) impute the missing values. Both approaches may introduce bias and reduce statistical power in downstream analyses such as testing metabolite associations with clinical variables. Further, standard imputation methods in metabolomics often ignore the mechanisms causing missingness and inaccurately estimate missing values within a data set. We propose a mechanism-aware imputation algorithm that uses a two-step approach to impute missing values. First, we use a random forest classifier to classify the missingness mechanism for each missing value in the data set. Second, we impute each missing value using imputation algorithms that are specific to the predicted missingness mechanism (i.e., MAR/MCAR or MNAR). Using complete data, we conducted simulations in which we imposed different missingness patterns within the data and tested the performance of combinations of imputation algorithms. Our proposed algorithm provided imputations closer to the original data than those using only one imputation algorithm for all the missing values. Consequently, our two-step approach was able to reduce bias for improved downstream analyses.
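The simulation idea in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes made-up abundances, a hypothetical limit of detection, and hand-picked MCAR positions. It also uses the known simulated labels in place of the random forest classifier, with half-minimum (MNAR) and mean (MAR/MCAR) imputation as simple stand-ins for the mechanism-specific algorithms. The function names `mask_values`, `impute`, and `rmse_on_missing` are hypothetical.

```python
import math
import statistics


def mask_values(values, lod, mcar_indices):
    """Hide values and record why each one is missing.

    Values below `lod` become MNAR; positions in `mcar_indices` become MCAR.
    """
    observed, mechanism = [], []
    for i, v in enumerate(values):
        if v < lod:
            observed.append(None)
            mechanism.append("MNAR")
        elif i in mcar_indices:
            observed.append(None)
            mechanism.append("MCAR")
        else:
            observed.append(v)
            mechanism.append(None)
    return observed, mechanism


def impute(observed, mechanism, mechanism_aware=True):
    """Fill missing values; MNAR gets half the observed minimum, others the mean.

    With mechanism_aware=False every missing value gets the mean,
    which mimics using one imputation method for all missing values.
    """
    present = [v for v in observed if v is not None]
    half_min = min(present) / 2
    mean = statistics.fmean(present)
    filled = []
    for v, m in zip(observed, mechanism):
        if v is not None:
            filled.append(v)
        elif mechanism_aware and m == "MNAR":
            filled.append(half_min)
        else:
            filled.append(mean)
    return filled


def rmse_on_missing(truth, filled, observed):
    """Root mean squared error, computed only over the masked positions."""
    errors = [(t - f) ** 2 for t, f, o in zip(truth, filled, observed) if o is None]
    return math.sqrt(statistics.fmean(errors))


# Illustrative abundances (assumed values, not study data).
truth = [0.4, 0.8, 1.5, 2.0, 3.0, 4.0, 5.0, 6.0, 8.0, 10.0]
observed, mechanism = mask_values(truth, lod=1.0, mcar_indices={4, 7})

aware = impute(observed, mechanism, mechanism_aware=True)
single = impute(observed, mechanism, mechanism_aware=False)

print("mechanism-aware RMSE:", rmse_on_missing(truth, aware, observed))
print("single-method RMSE:  ", rmse_on_missing(truth, single, observed))
```

In this toy setting the mechanism-aware fill is closer to the hidden values because the below-detection-limit entries are not pulled up to the overall mean. A real analysis would have to predict the mechanism labels rather than read them from the simulation.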


Subject(s)
Algorithms , Metabolomics , Bias , Mass Spectrometry/methods , Metabolomics/methods
3.
Lancet Digit Health ; 4(7): e532-e541, 2022 07.
Article in English | MEDLINE | ID: mdl-35589549

ABSTRACT

BACKGROUND: Post-acute sequelae of SARS-CoV-2 infection, known as long COVID, have severely affected recovery from the COVID-19 pandemic for patients and society alike. Long COVID is characterised by evolving, heterogeneous symptoms, making it challenging to derive an unambiguous definition. Studies of electronic health records are a crucial element of the US National Institutes of Health's RECOVER Initiative, which is addressing the urgent need to understand long COVID, identify treatments, and accurately identify who has it; the latter is the aim of this study. METHODS: Using the National COVID Cohort Collaborative's (N3C) electronic health record repository, we developed XGBoost machine learning models to identify potential patients with long COVID. We defined our base population (n=1 793 604) as any non-deceased adult patient (age ≥18 years) with either an International Classification of Diseases-10-Clinical Modification COVID-19 diagnosis code (U07.1) from an inpatient or emergency visit, or a positive SARS-CoV-2 PCR or antigen test, and for whom at least 90 days have passed since COVID-19 index date. We examined demographics, health-care utilisation, diagnoses, and medications for 97 995 adults with COVID-19. We used data on these features and 597 patients from a long COVID clinic to train three machine learning models to identify potential long COVID among all patients with COVID-19, patients hospitalised with COVID-19, and patients who had COVID-19 but were not hospitalised. Feature importance was determined via Shapley values. We further validated the models on data from a fourth site. FINDINGS: Our models identified, with high accuracy, patients who potentially have long COVID, achieving areas under the receiver operator characteristic curve of 0·92 (all patients), 0·90 (hospitalised), and 0·85 (non-hospitalised). Important features, as defined by Shapley values, include rate of health-care utilisation, patient age, dyspnoea, and other diagnosis and medication information available within the electronic health record. INTERPRETATION: Patients identified by our models as potentially having long COVID can be interpreted as patients warranting care at a specialty clinic for long COVID, which is an essential proxy for long COVID diagnosis as its definition continues to evolve. We also achieve the urgent goal of identifying potential long COVID in patients for clinical trials. As more data sources are identified, our models can be retrained and tuned based on the needs of individual studies. FUNDING: US National Institutes of Health and National Center for Advancing Translational Sciences through the RECOVER Initiative.
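For readers unfamiliar with the metric reported in FINDINGS, the area under the receiver operating characteristic curve can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. The sketch below computes it by direct pairwise comparison. The labels and scores are invented for illustration and are not study data, and the function name `auroc` is hypothetical.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts the fraction of (positive, negative) pairs in which the
    positive case scores higher; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented example: 1 = flagged as long COVID, 0 = not flagged.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.35, 0.7, 0.3, 0.2, 0.1]
print(auroc(labels, scores))  # 11 of 12 pairs are ranked correctly
```

The pairwise form is quadratic in the number of cases, so it suits small illustrations only; library implementations use a sort-based equivalent.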


Subject(s)
COVID-19 , Adolescent , Adult , COVID-19/complications , COVID-19/diagnosis , COVID-19/epidemiology , COVID-19 Testing , Humans , Machine Learning , Pandemics , SARS-CoV-2 , United States/epidemiology , Post-Acute COVID-19 Syndrome
4.
Metabolites ; 11(10)2021 Oct 02.
Article in English | MEDLINE | ID: mdl-34677393

ABSTRACT

The bottleneck for taking full advantage of metabolomics data is often the availability, awareness, and usability of analysis tools. Software tools specifically designed for metabolomics data are being developed at an increasing rate, with hundreds of available tools already in the literature. Many of these tools are open-source and freely available but are very diverse with respect to language, data formats, and stages in the metabolomics pipeline. To help meet the increasing demand for guidance in choosing analytical tools and to coordinate the adoption of best practices for reproducibility, we have designed and built the MSCAT (Metabolomics Software CATalog) database of metabolomics software tools, which can be sustainably and continuously updated. This database provides a survey of the landscape of available tools and can assist researchers in their selection of data analysis workflows for metabolomics studies according to their specific needs. We used machine learning (ML) methodology to semi-automate the identification of metabolomics software tool names within abstracts. MSCAT searches the literature to find new software tools by implementing a sentence-level Named Entity Recognition (NER) model based on a neural network composed of a character-level convolutional neural network (CNN) combined with a bidirectional long short-term memory (LSTM) layer and a conditional random fields (CRF) layer. The list of potential new tools (and their associated publications) is then forwarded to the database maintainer for the curation of the database entry corresponding to each tool. The end-user interface allows for filtering of tools by multiple characteristics as well as plotting of the aggregate tool data to monitor the metabolomics software landscape.
