1.
Orphanet J Rare Dis ; 19(1): 183, 2024 May 02.
Article in English | MEDLINE | ID: mdl-38698482

ABSTRACT

BACKGROUND: With over 7000 Mendelian disorders, identifying children with a specific rare genetic disorder diagnosis through structured electronic medical record data is challenging given incompleteness of records, inaccurate medical diagnosis coding, as well as heterogeneity in clinical symptoms and procedures for specific disorders. We sought to develop a digital phenotyping algorithm (PheIndex) using electronic medical records to identify children aged 0-3 diagnosed with genetic disorders or who present with illness with an increased risk for genetic disorders. RESULTS: Through expert opinion, we established 13 criteria for the algorithm and derived a score and a classification. The performance of each criterion and the classification were validated by chart review. PheIndex identified 1,088 children out of 93,154 live births who may be at an increased risk for genetic disorders. Chart review demonstrated that the algorithm achieved 90% sensitivity, 97% specificity, and 94% accuracy. CONCLUSIONS: The PheIndex algorithm can help identify when a rare genetic disorder may be present, alerting providers to consider ordering a diagnostic genetic test and/or referring a patient to a medical geneticist.
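As a reading aid, the three chart-review figures quoted above (sensitivity, specificity, accuracy) follow from a 2x2 confusion matrix. The sketch below uses the standard definitions only; the function name and the counts in the usage line are hypothetical and are not the study's chart-review data.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity and accuracy from confusion-matrix counts.

    tp/fn: truly affected cases flagged / missed by the classifier.
    tn/fp: unaffected cases correctly passed / wrongly flagged.
    """
    if min(tp, fp, tn, fn) < 0:
        raise ValueError("counts must be non-negative")
    positives = tp + fn  # all truly affected cases
    negatives = tn + fp  # all truly unaffected cases
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "sensitivity": tp / positives,
        "specificity": tn / negatives,
        "accuracy": (tp + tn) / (positives + negatives),
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
example = classification_metrics(tp=45, fp=3, tn=97, fn=5)
```

Accuracy weights the two classes by their size in the reviewed sample, which is why it need not sit midway between sensitivity and specificity.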


Subject(s)
Algorithms , Rare Diseases , Humans , Rare Diseases/genetics , Rare Diseases/diagnosis , Infant , Infant, Newborn , Child, Preschool , Female , Male , Electronic Health Records , Genetic Diseases, Inborn/diagnosis , Genetic Diseases, Inborn/genetics , Phenotype
2.
iScience ; 27(3): 108905, 2024 Mar 15.
Article in English | MEDLINE | ID: mdl-38390492

ABSTRACT

Characterizing the effect of combination therapies is vital for treating diseases like cancer. We introduce correlated drug action (CDA), a baseline model for the study of drug combinations in both cell cultures and patient populations, which assumes that the efficacy of drugs in a combination may be correlated. We apply temporal CDA (tCDA) to clinical trial data, and demonstrate the utility of this approach in identifying possible synergistic combinations and others that can be explained in terms of monotherapies. Using MCF7 cell line data, we assess combinations with dose CDA (dCDA), a model that generalizes other proposed models (e.g., Bliss response-additivity, the dose equivalence principle), and introduce Excess over CDA (EOCDA), a new metric for identifying possible synergistic combinations in cell culture.
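For orientation, one of the baselines that dCDA is said to generalize, Bliss independence, can be sketched in a few lines. This is the textbook Bliss formula, not the CDA model itself, and the "excess" function is only an analogue of the EOCDA idea under that simpler baseline. Effects are assumed to be fractional inhibitions in [0, 1]; the function names are hypothetical.

```python
def bliss_expected(effect_a, effect_b):
    """Expected combined fractional effect if the two drugs act independently."""
    for effect in (effect_a, effect_b):
        if not 0.0 <= effect <= 1.0:
            raise ValueError("effects must be fractions in [0, 1]")
    # Probability that at least one of two independent drugs acts on a cell.
    return effect_a + effect_b - effect_a * effect_b


def excess_over_bliss(observed, effect_a, effect_b):
    """Observed combination effect minus the Bliss expectation.

    Positive values point to possible synergy, negative values to antagonism.
    """
    return observed - bliss_expected(effect_a, effect_b)
```

With two monotherapies that each inhibit half of the cells, the Bliss expectation is 0.75, so an observed combination effect above 0.75 would count as an excess under this baseline.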

3.
Cell Rep Med ; 5(1): 101350, 2024 01 16.
Article in English | MEDLINE | ID: mdl-38134931

ABSTRACT

Every year, 11% of infants are born preterm with significant health consequences, with the vaginal microbiome a risk factor for preterm birth. We crowdsource models to predict (1) preterm birth (PTB; <37 weeks) or (2) early preterm birth (ePTB; <32 weeks) from 9 vaginal microbiome studies representing 3,578 samples from 1,268 pregnant individuals, aggregated from public raw data via phylogenetic harmonization. The predictive models are validated on two independent unpublished datasets representing 331 samples from 148 pregnant individuals. The top-performing models (among 148 and 121 submissions from 318 teams) achieve area under the receiver operator characteristic (AUROC) curve scores of 0.69 and 0.87 predicting PTB and ePTB, respectively. Alpha diversity, VALENCIA community state types, and composition are important features in the top-performing models, most of which are tree-based methods. This work is a model for translation of microbiome data into clinically relevant predictive models and to better understand preterm birth.


Subject(s)
Crowdsourcing , Microbiota , Premature Birth , Pregnancy , Female , Infant, Newborn , Humans , Phylogeny , Vagina , Microbiota/genetics
4.
J Thorac Dis ; 15(5): 2438-2449, 2023 May 30.
Article in English | MEDLINE | ID: mdl-37324065

ABSTRACT

Background: Although optimal sequencing of systemic therapy in cancer care is critical to achieving maximal clinical benefit, there is a lack of analysis of treatment sequencing in advanced non-small cell lung cancer (aNSCLC) in real-world settings. Methods: A retrospective cohort study of 13,340 lung cancer patients within the Mount Sinai Health System (MSHS) was performed. Systemic therapy data of aNSCLC in 2,106 patients were the starting point in our analysis to investigate how treatment sequencing has evolved, the impact of sequencing patterns on clinical outcomes, and the effectiveness of 2nd line chemotherapy after patients progressed on immune checkpoint inhibitor (ICI)-based therapy as the 1st line of therapy (LOT). Results: There is a significant shift to more ICI-based therapy and multiple lines of targeted therapy after 2015. We compared clinical outcomes of two patient populations with different treatment sequencing patterns, with the 1st group receiving chemotherapy as the 1st LOT followed by ICI-based treatment, and the 2nd group treated in the opposite order, receiving a 1st line ICI-containing regimen followed by a 2nd line chemotherapy. No statistically significant difference in overall survival (OS) was observed between the two groups [group 2 vs. group 1, adjusted hazard ratio (aHR) = 1.36, P=0.39]. We assessed the efficacy of the 2nd line chemotherapy in three patient populations given either 1st line ICI single agent, 1st line ICI-chemotherapy combination, or 1st line chemotherapy alone; there was no statistically significant difference in time-to-next treatment (TTNT) or in OS among the three patient groups. Conclusions: Analysis of real-world data has shown that two treatment sequencing patterns in aNSCLC, ICI followed by chemotherapy or chemotherapy followed by ICI, achieved similar clinical benefit. The chemotherapies routinely used following platinum doublet 1st LOT are effective as the 2nd line option after ICI-chemotherapy combination in the 1st line setting.

5.
medRxiv ; 2023 Apr 11.
Article in English | MEDLINE | ID: mdl-36945505

ABSTRACT

Globally, every year about 11% of infants are born preterm, defined as a birth prior to 37 weeks of gestation, with significant and lingering health consequences. Multiple studies have related the vaginal microbiome to preterm birth. We present a crowdsourcing approach to predict (a) preterm or (b) early preterm birth from 9 publicly available vaginal microbiome studies representing 3,578 samples from 1,268 pregnant individuals, aggregated from raw sequences via an open-source tool, MaLiAmPi. We validated the crowdsourced models on novel datasets representing 331 samples from 148 pregnant individuals. From 318 DREAM challenge participants we received 148 and 121 submissions for our two separate prediction sub-challenges, with top-ranking submissions achieving bootstrapped AUROC scores of 0.69 and 0.87, respectively. Alpha diversity, VALENCIA community state types, and composition (via phylotype relative abundance) were important features in the top-performing models, most of which were tree-based methods. This work serves as the foundation for subsequent efforts to translate predictive tests into clinical practice, and to better understand and prevent preterm birth.

6.
JAMA Netw Open ; 5(8): e2227423, 2022 08 01.
Article in English | MEDLINE | ID: mdl-36036935

ABSTRACT

Importance: An automated, accurate method is needed for unbiased quantification of the accrual of joint space narrowing and erosions on radiographic images of the hands, wrists, and feet, for use in clinical trials, for monitoring joint damage over time, and for assisting rheumatologists with treatment decisions. Such a method has the potential to be directly integrated into electronic health records. Objectives: To design and implement an international crowdsourcing competition to catalyze the development of machine learning methods to quantify radiographic damage in rheumatoid arthritis (RA). Design, Setting, and Participants: This diagnostic/prognostic study describes the Rheumatoid Arthritis 2-Dialogue for Reverse Engineering Assessment and Methods (RA2-DREAM Challenge), which used existing radiographic images and expert-curated Sharp-van der Heijde (SvH) scores from 2 clinical studies (674 radiographic sets from 562 patients) for training (367 sets), leaderboard (119 sets), and final evaluation (188 sets). Challenge participants were tasked with developing methods to automatically quantify overall damage (subchallenge 1), joint space narrowing (subchallenge 2), and erosions (subchallenge 3). The challenge was finished on June 30, 2020. Main Outcomes and Measures: Scores derived from submitted algorithms were compared with the expert-curated SvH scores, and a baseline model was created for benchmark comparison. Performances were ranked using weighted root mean square error (RMSE). The performance and reproducibility of each algorithm were assessed using the Bayes factor from bootstrapped data, and further evaluated with a postchallenge independent validation data set. Results: The RA2-DREAM Challenge received a total of 173 submissions from 26 participants or teams in 7 countries for the leaderboard round, and 13 submissions were included in the final evaluation. The weighted RMSE metric showed that the winning algorithms produced scores that were very close to the expert-curated SvH scores. Top teams included Team Shirin for subchallenge 1 (weighted RMSE, 0.44), HYL-YFG (Hongyang Li and Yuanfang Guan) for subchallenge 2 (weighted RMSE, 0.38), and Gold Therapy for subchallenge 3 (weighted RMSE, 0.43). The bootstrapping/Bayes factor approach and the postchallenge independent validation confirmed the reproducibility, and the estimation concordance indices between the final evaluation and postchallenge independent validation data sets were 0.71 for subchallenge 1, 0.78 for subchallenge 2, and 0.82 for subchallenge 3. Conclusions and Relevance: The RA2-DREAM Challenge resulted in the development of algorithms that provide feasible, quick, and accurate methods to quantify joint damage in RA. Ultimately, these methods could help research studies on RA joint damage and may be integrated into electronic health records to help clinicians serve patients better by providing timely, reliable, and quantitative information for making treatment decisions to prevent further damage.


Subject(s)
Arthritis, Rheumatoid , Crowdsourcing , Arthritis, Rheumatoid/diagnostic imaging , Arthritis, Rheumatoid/drug therapy , Bayes Theorem , Humans , Machine Learning , Reproducibility of Results
7.
iScience ; 25(6): 104414, 2022 Jun 17.
Article in English | MEDLINE | ID: mdl-35663013

ABSTRACT

Circulating extracellular vesicles (EVs) contain molecular footprints (lipids, proteins, RNA, and DNA) from their cell of origin. Consequently, EV-associated RNA and proteins have gained widespread interest as liquid-biopsy biomarkers. Yet, an integrative proteo-transcriptomic landscape of EVs and its comparison with their cell of origin remain obscure. Here, we report that EVs enrich a distinct proteo-transcriptome that does not linearly correlate with their cell of origin. We show that EVs enrich endosomal and extracellular proteins and small RNA (∼13-200 nucleotides) associated with cell differentiation, development, and Wnt signaling. EVs carry specific RNAs (RNY3, vtRNA, and MIRLET-7) and their complementary proteins (YBX1, IGF2BP2, and SRSF1/2) as cargo. To ensure unbiased and independent analyses, we studied 12 cancer cell lines, matching EVs (in-house and exRNA database), and serum EVs of patients with prostate cancer. Together, we show that EV-RNA-protein complexes may constitute a functional interaction network to protect and regulate molecular access until a function is achieved.

8.
Proc Natl Acad Sci U S A ; 118(34), 2021 08 24.
Article in English | MEDLINE | ID: mdl-34413191

ABSTRACT

Binary classification is one of the central problems in machine-learning research and, as such, investigations of its general statistical properties are of interest. We studied the ranking statistics of items in binary classification problems and observed that there is a formal and surprising relationship between the probability of a sample belonging to one of the two classes and the Fermi-Dirac distribution determining the probability that a fermion occupies a given single-particle quantum state in a physical system of noninteracting fermions. Using this equivalence, it is possible to compute a calibrated probabilistic output for binary classifiers. We show that the area under the receiver operating characteristic curve (AUC) in a classification problem is related to the temperature of an equivalent physical system. In a similar manner, the optimal decision threshold between the two classes is associated with the chemical potential of an equivalent physical system. Using our framework, we also derive a closed-form expression to calculate the variance for the AUC of a classifier. Finally, we introduce FiDEL (Fermi-Dirac-based ensemble learning), an ensemble learning algorithm that uses the calibrated nature of the classifier's output probability to combine possibly very different classifiers.
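To make the analogy concrete, the sketch below evaluates the standard Fermi-Dirac occupation function with a sample's rank standing in for energy. The abstract does not give the exact parameterization, so the mapping used here is an assumption: rank 1 is taken to be the sample most confidently assigned to the positive class, `mu` plays the role of the chemical potential (a rank threshold), and `temperature` controls how sharply the probability falls off. The function name is hypothetical and this is not the FiDEL implementation.

```python
import math


def fermi_dirac_probability(rank, mu, temperature):
    """Fermi-Dirac occupation 1 / (1 + exp((rank - mu) / temperature)).

    Assumed reading: probability that the sample at this rank is in the
    positive class, with low ranks being the most confident positives.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    x = (rank - mu) / temperature
    # Numerically stable evaluation that avoids overflow in exp().
    if x >= 0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))
```

Under this reading the probability equals 0.5 exactly at `rank == mu`, which matches the abstract's link between the chemical potential and the decision threshold, and a lower temperature gives a sharper separation between the two classes.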

9.
Cell Rep Med ; 2(6): 100323, 2021 06 15.
Article in English | MEDLINE | ID: mdl-34195686

ABSTRACT

Identification of pregnancies at risk of preterm birth (PTB), the leading cause of newborn deaths, remains challenging given the syndromic nature of the disease. We report a longitudinal multi-omics study coupled with a DREAM challenge to develop predictive models of PTB. The findings indicate that whole-blood gene expression predicts ultrasound-based gestational ages in normal and complicated pregnancies (r = 0.83) and, using data collected before 37 weeks of gestation, also predicts the delivery date in both normal pregnancies (r = 0.86) and those with spontaneous preterm birth (r = 0.75). Based on samples collected before 33 weeks in asymptomatic women, our analysis suggests that expression changes preceding preterm prelabor rupture of the membranes are consistent across time points and cohorts and involve leukocyte-mediated immunity. Models built from plasma proteomic data predict spontaneous preterm delivery with intact membranes with higher accuracy and earlier in pregnancy than transcriptomic models (AUROC = 0.76 versus AUROC = 0.6 at 27-33 weeks of gestation).


Subject(s)
Blood Proteins/genetics , Cell-Free Nucleic Acids/genetics , Gestational Age , Pre-Eclampsia/genetics , Premature Birth/genetics , Transcriptome , Adult , Asymptomatic Diseases , Biomarkers/blood , Blood Proteins/classification , Blood Proteins/metabolism , Cell-Free Nucleic Acids/blood , Cell-Free Nucleic Acids/classification , Crowdsourcing/methods , Female , Humans , Infant, Newborn , Longitudinal Studies , Pre-Eclampsia/blood , Pre-Eclampsia/diagnosis , Pregnancy , Premature Birth/blood , Premature Birth/diagnosis , Proteomics/methods , ROC Curve
10.
Gut ; 2021 Jul 28.
Article in English | MEDLINE | ID: mdl-34321221

ABSTRACT

OBJECTIVE: Surveillance tools for early cancer detection are suboptimal, including hepatocellular carcinoma (HCC), and biomarkers are urgently needed. Extracellular vesicles (EVs) have gained increasing scientific interest due to their involvement in tumour initiation and metastasis; however, most extracellular RNA (exRNA) blood-based biomarker studies are limited to annotated genomic regions. DESIGN: EVs were isolated with differential ultracentrifugation and integrated nanoscale deterministic lateral displacement arrays (nanoDLD) and quality assessed by electron microscopy, immunoblotting, nanoparticle tracking and deconvolution analysis. Genome-wide sequencing of the largely unexplored small exRNA landscape, including unannotated transcripts, identified and reproducibly quantified small RNA clusters (smRCs). Their key genomic features were delineated across biospecimens and EV isolation techniques in prostate cancer and HCC. Three independent exRNA cancer datasets with a total of 479 samples from 375 patients, including longitudinal samples, were used for this study. RESULTS: ExRNA smRCs were dominated by uncharacterised, unannotated small RNA with a consensus sequence of 20 nt. An unannotated 3-smRC signature was significantly overexpressed in plasma exRNA of patients with HCC (p<0.01, n=157). An independent validation in a phase 2 biomarker case-control study revealed 86% sensitivity and 91% specificity for the detection of early HCC from controls at risk (n=209) (area under the receiver operating curve (AUC): 0.87). The 3-smRC signature was independent of alpha-fetoprotein (p<0.0001) and a composite model yielded an increased AUC of 0.93. CONCLUSION: These findings directly lead to the prospect of a minimally invasive, blood-only, operator-independent clinical tool for HCC surveillance, thus highlighting the potential of unannotated smRCs for biomarker research in cancer.

11.
Cell Syst ; 12(8): 827-838.e5, 2021 08 18.
Article in English | MEDLINE | ID: mdl-34146471

ABSTRACT

The accurate identification and quantitation of RNA isoforms present in the cancer transcriptome is key for analyses ranging from the inference of the impacts of somatic variants to pathway analysis to biomarker development and subtype discovery. The ICGC-TCGA DREAM Somatic Mutation Calling in RNA (SMC-RNA) challenge was a crowd-sourced effort to benchmark methods for RNA isoform quantification and fusion detection from bulk cancer RNA sequencing (RNA-seq) data. It concluded in 2018 with a comparison of 77 fusion detection entries and 65 isoform quantification entries on 51 synthetic tumors and 32 cell lines with spiked-in fusion constructs. We report the entries used to build this benchmark, the leaderboard results, and the experimental features associated with the accurate prediction of RNA species. This challenge required submissions to be in the form of containerized workflows, meaning each of the entries described is easily reusable through CWL and Docker containers at https://github.com/SMC-RNA-challenge. A record of this paper's transparent peer review process is included in the supplemental information.


Subject(s)
Neoplasms , Humans , Neoplasms/genetics , Protein Isoforms/genetics , RNA/genetics , RNA-Seq , Sequence Analysis, RNA
12.
Nat Commun ; 12(1): 3307, 2021 06 03.
Article in English | MEDLINE | ID: mdl-34083538

ABSTRACT

Despite decades of intensive search for compounds that modulate the activity of particular protein targets, a large proportion of the human kinome remains as yet undrugged. Effective approaches are therefore required to map the massive space of unexplored compound-kinase interactions for novel and potent activities. Here, we carry out a crowdsourced benchmarking of predictive algorithms for kinase inhibitor potencies across multiple kinase families tested on unpublished bioactivity data. We find the top-performing predictions are based on various models, including kernel learning, gradient boosting and deep learning, and their ensemble leads to a predictive accuracy exceeding that of single-dose kinase activity assays. We design experiments based on the model predictions and identify unexpected activities even for under-studied kinases, thereby accelerating experimental mapping efforts. The open-source prediction algorithms together with the bioactivities between 95 compounds and 295 kinases provide a resource for benchmarking prediction algorithms and for extending the druggable kinome.


Subject(s)
Protein Kinase Inhibitors/pharmacology , Protein Kinases/metabolism , Algorithms , Benchmarking , Crowdsourcing , Databases, Pharmaceutical , Deep Learning , Drug Discovery , Drug Evaluation, Preclinical , Humans , Kinetics , Machine Learning , Models, Biological , Models, Chemical , Protein Kinase Inhibitors/chemistry , Protein Kinase Inhibitors/pharmacokinetics , Protein Kinases/chemistry , Proteomics , Regression Analysis
13.
EBioMedicine ; 66: 103275, 2021 Apr.
Article in English | MEDLINE | ID: mdl-33745882

ABSTRACT

BACKGROUND: Assistive automatic seizure detection can empower human annotators to shorten patient monitoring data review times. We present a proof-of-concept for a seizure detection system that is sensitive, automated, patient-specific, and tunable to maximize sensitivity while minimizing human annotation times. The system uses custom data preparation methods, deep learning analytics and electroencephalography (EEG) data. METHODS: Scalp EEG data of 365 patients containing 171,745 s ictal and 2,185,864 s interictal samples obtained from clinical monitoring systems were analysed as part of a crowdsourced artificial intelligence (AI) challenge. Participants were tasked to develop an ictal/interictal classifier with high sensitivity and low false alarm rates. We built a challenge platform that prevented participants from downloading or directly accessing the data while allowing crowdsourced model development. FINDINGS: The automatic detection system achieved tunable sensitivities between 75.00% and 91.60%, allowing a reduction in the amount of raw EEG data to be reviewed by a human annotator by factors between 142x and 22x, respectively. The algorithm enables instantaneous reviewer-managed optimization of the balance between sensitivity and the amount of raw EEG data to be reviewed. INTERPRETATION: This study demonstrates the utility of deep learning for patient-specific seizure detection in EEG data. Furthermore, deep learning in combination with a human reviewer can provide the basis for an assistive data labelling system lowering the time of manual review while maintaining human expert annotation performance. FUNDING: IBM employed all IBM Research authors. Temple University employed all Temple University authors. The Icahn School of Medicine at Mount Sinai employed Eren Ahsen.
The corresponding authors Stefan Harrer and Gustavo Stolovitzky declare that they had full access to all the data in the study and that they had final responsibility for the decision to submit for publication.


Subject(s)
Artificial Intelligence , Brain/physiopathology , Electroencephalography , Neurologists , Seizures/diagnosis , Algorithms , Data Analysis , Deep Learning , Electroencephalography/methods , Electroencephalography/standards , Epilepsy/diagnosis , Humans , Reproducibility of Results
14.
Front Genet ; 12: 778416, 2021.
Article in English | MEDLINE | ID: mdl-35047007

ABSTRACT

We now know RNA can survive the harsh environment of biofluids when encapsulated in vesicles or by associating with lipoproteins or RNA binding proteins. These extracellular RNAs (exRNAs) play a role in intercellular signaling, serve as biomarkers of disease, and form the basis of new strategies for disease treatment. The Extracellular RNA Communication Consortium (ERCC) hosted a two-day online workshop (April 19-20, 2021) on the unique challenges of exRNA data analysis. The goal was to foster an open dialog about best practices and discuss open problems in the field, focusing initially on small exRNA sequencing data. Video recordings of workshop presentations and discussions are available (https://exRNA.org/exRNAdata2021-videos/). There were three target audiences: experimentalists who generate exRNA sequencing data, computational and data scientists who work with those groups to analyze their data, and experimental and data scientists new to the field. Here we summarize issues explored during the workshop, including progress on an effort to develop an exRNA data analysis challenge to engage the community in solving some of these open problems.

15.
Bioinformatics ; 37(14): 2070-2072, 2021 08 04.
Article in English | MEDLINE | ID: mdl-33241320

ABSTRACT

SUMMARY: The advent of high-throughput technologies has provided researchers with measurements of thousands of molecular entities and enabled the investigation of the internal regulatory apparatus of the cell. However, network inference from high-throughput data is far from being a solved problem. While a plethora of different inference methods have been proposed, they often lead to non-overlapping predictions, and many of them lack user-friendly implementations to enable their broad utilization. Here, we present Consensus Interaction Network Inference Service (COSIFER), a package and a companion web-based platform to infer molecular networks from expression data using state-of-the-art consensus approaches. COSIFER includes a selection of state-of-the-art methodologies for network inference and different consensus strategies to integrate the predictions of individual methods and generate robust networks. AVAILABILITY AND IMPLEMENTATION: COSIFER Python source code is available at https://github.com/PhosphorylatedRabbits/cosifer. The web service is accessible at https://ibm.biz/cosifer-aas. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Software , Consensus
16.
Life Sci Alliance ; 3(11), 2020 11.
Article in English | MEDLINE | ID: mdl-32972997

ABSTRACT

Single-cell RNA-sequencing (scRNAseq) technologies are rapidly evolving. Although very informative, in standard scRNAseq experiments, the spatial organization of the cells in the tissue of origin is lost. Conversely, spatial RNA-seq technologies designed to maintain cell localization have limited throughput and gene coverage. Mapping scRNAseq to genes with spatial information increases coverage while providing spatial location. However, methods to perform such mapping have not yet been benchmarked. To fill this gap, we organized the DREAM Single-Cell Transcriptomics challenge focused on the spatial reconstruction of cells from the Drosophila embryo from scRNAseq data, leveraging, as a silver standard, genes with in situ hybridization data from the Berkeley Drosophila Transcription Network Project reference atlas. The 34 participating teams used diverse algorithms for gene selection and location prediction, while being able to correctly localize clusters of cells. Selection of predictor genes was essential for this task. Predictor genes showed a relatively high expression entropy and high spatial clustering, and included prominent developmental genes such as gap and pair-rule genes and tissue markers. Application of the top 10 methods to a zebrafish embryo dataset yielded performance and statistical properties of the selected genes similar to those in the Drosophila data. This suggests that methods developed in this challenge are able to extract generalizable properties of genes that are useful to accurately reconstruct the spatial arrangement of cells in tissues.


Subject(s)
Computational Biology/methods , Gene Expression Profiling/methods , Single-Cell Analysis/methods , Spatial Analysis , Algorithms , Animals , Databases, Genetic , Drosophila/genetics , Forecasting/methods , Gene Expression Regulation, Developmental/genetics , Gene Regulatory Networks/genetics , Sequence Analysis, RNA/methods , Transcriptome/genetics , Zebrafish/genetics
17.
Elife ; 9, 2020 09 18.
Article in English | MEDLINE | ID: mdl-32945258

ABSTRACT

Our ability to discover effective drug combinations is limited, in part by insufficient understanding of how the transcriptional response of two monotherapies results in that of their combination. We analyzed matched time course RNAseq profiling of cells treated with single drugs and their combinations and found that the transcriptional signature of the synergistic combination was unique relative to that of either constituent monotherapy. The sequential activation of transcription factors in time in the gene regulatory network was implicated. The nature of this transcriptional cascade suggests that drug synergy may ensue when the transcriptional responses elicited by two unrelated individual drugs are correlated. We used these results as the basis of a simple prediction algorithm attaining an AUROC of 0.77 in the prediction of synergistic drug combinations in an independent dataset.


Subject(s)
Drug Combinations , Drug Synergism , Gene Expression , Gene Regulatory Networks/physiology , Transcriptome , Algorithms , Computational Biology , Humans , MCF-7 Cells , RNA-Seq , Transcription Factors/metabolism
18.
ACS Nano ; 14(9): 10784-10795, 2020 09 22.
Article in English | MEDLINE | ID: mdl-32844655

ABSTRACT

The advent of microfluidics in the 1990s promised a revolution in multiple industries from healthcare to chemical processing. Deterministic lateral displacement (DLD) is a continuous-flow microfluidic particle separation method discovered in 2004 that has been applied successfully and widely to the separation of blood cells, yeast, spores, bacteria, viruses, DNA, droplets, and more. Deterministic lateral displacement is conceptually simple and can deliver consistent performance over a wide range of flow rates and particle concentrations. Despite wide use and in-depth study, DLD has not yet been fully elucidated or optimized, with different approaches to the same problem yielding varying results. We endeavor here to provide up-to-date expert opinion on the state of the art and on current fundamental, practical, and commercial challenges with DLD, as well as to describe experimental and modeling opportunities. Because these challenges and opportunities arise from constraints on hydrodynamics, fabrication, and operation at the micro- and nanoscale, we expect this Perspective to serve as a guide for the broader micro- and nanofluidic community to identify and to address open questions in the field.


Subject(s)
Microfluidic Analytical Techniques , Hydrodynamics , Microfluidics
19.
Cell Syst ; 11(2): 186-195.e9, 2020 08 26.
Article in English | MEDLINE | ID: mdl-32710834

ABSTRACT

Cancer is driven by genomic alterations, but the processes causing this disease are largely performed by proteins. However, proteins are harder and more expensive to measure than genes and transcripts. To catalyze developments of methods to infer protein levels from other omics measurements, we leveraged crowdsourcing via the NCI-CPTAC DREAM proteogenomic challenge. We asked for methods to predict protein and phosphorylation levels from genomic and transcriptomic data in cancer patients. The best performance was achieved by an ensemble of models, including as predictors transcript level of the corresponding genes, interaction between genes, conservation across tumor types, and phosphosite proximity for phosphorylation prediction. Proteins from metabolic pathways and complexes were the best and worst predicted, respectively. The performance of even the best-performing model was modest, suggesting that many proteins are strongly regulated through translational control and degradation. Our results set a reference for the limitations of computational inference in proteogenomics. A record of this paper's transparent peer review process is included in the Supplemental Information.


Subject(s)
Crowdsourcing/methods , Genomics/methods , Machine Learning/standards , Neoplasms/genetics , Phosphoproteins/metabolism , Proteins/genetics , Proteomics/methods , Transcriptome/genetics , Female , Humans , Male
20.
JAMA Netw Open ; 3(3): e200265, 2020 03 02.
Article in English | MEDLINE | ID: mdl-32119094

ABSTRACT

Importance: Mammography screening currently relies on subjective human interpretation. Artificial intelligence (AI) advances could be used to increase mammography screening accuracy by reducing missed cancers and false positives. Objective: To evaluate whether AI can overcome human mammography interpretation limitations with a rigorous, unbiased evaluation of machine learning algorithms. Design, Setting, and Participants: In this diagnostic accuracy study conducted between September 2016 and November 2017, an international, crowdsourced challenge was hosted to foster AI algorithm development focused on interpreting screening mammography. More than 1100 participants comprising 126 teams from 44 countries participated. Analysis began November 18, 2016. Main Outcomes and Measurements: Algorithms used images alone (challenge 1) or combined images, previous examinations (if available), and clinical and demographic risk factor data (challenge 2) and output a score that translated to cancer yes/no within 12 months. Algorithm accuracy for breast cancer detection was evaluated using area under the curve and algorithm specificity compared with radiologists' specificity with radiologists' sensitivity set at 85.9% (United States) and 83.9% (Sweden). An ensemble method aggregating top-performing AI algorithms and radiologists' recall assessment was developed and evaluated. Results: Overall, 144 231 screening mammograms from 85 580 US women (952 cancer positive ≤12 months from screening) were used for algorithm training and validation. A second independent validation cohort included 166 578 examinations from 68 008 Swedish women (780 cancer positive). The top-performing algorithm achieved an area under the curve of 0.858 (United States) and 0.903 (Sweden) and 66.2% (United States) and 81.2% (Sweden) specificity at the radiologists' sensitivity, lower than community-practice radiologists' specificity of 90.5% (United States) and 98.5% (Sweden). 
Combining top-performing algorithms and US radiologist assessments resulted in a higher area under the curve of 0.942 and achieved a significantly improved specificity (92.0%) at the same sensitivity. Conclusions and Relevance: While no single AI algorithm outperformed radiologists, an ensemble of AI algorithms combined with radiologist assessment in a single-reader screening environment improved overall accuracy. This study underscores the potential of using machine learning methods for enhancing mammography screening interpretation.


Subject(s)
Breast Neoplasms/diagnostic imaging , Deep Learning , Image Interpretation, Computer-Assisted/methods , Mammography/methods , Radiologists , Adult , Aged , Algorithms , Artificial Intelligence , Early Detection of Cancer , Female , Humans , Middle Aged , Radiology , Sensitivity and Specificity , Sweden , United States