Results 1 - 20 of 1,431
1.
BMC Genomics ; 25(1): 679, 2024 Jul 08.
Article in English | MEDLINE | ID: mdl-38978005

ABSTRACT

BACKGROUND: Oxford Nanopore provides high-throughput sequencing platforms able to reconstruct complete bacterial genomes with 99.95% accuracy. However, even small levels of error can obscure the phylogenetic relationships between closely related isolates. Polishing tools have been developed to correct these errors, but it is uncertain whether they reach the accuracy needed for high-resolution source tracking of foodborne illness outbreaks. RESULTS: We tested 132 combinations of assembly and short- and long-read polishing tools to assess their accuracy for reconstructing the genome sequences of 15 highly similar Salmonella enterica serovar Newport isolates from a 2020 onion outbreak. While long-read polishing alone improved accuracy, near-perfect accuracy (99.9999%, or ~5 nucleotide errors across the 4.8 Mbp genome, excluding low-confidence regions) was only obtained by pipelines that combined both long- and short-read polishing tools. Notably, medaka was a more accurate and efficient long-read polisher than Racon. Among short-read polishers, NextPolish showed the highest accuracy, but Pilon, Polypolish, and POLCA performed similarly. Among the 5 best-performing pipelines, polishing with medaka followed by NextPolish was the most common combination. Importantly, the order of polishing tools mattered, i.e., using less accurate tools after more accurate ones introduced errors. Indels in homopolymers and repetitive regions, where the short reads could not be uniquely mapped, remained the most challenging errors to correct. CONCLUSIONS: Short reads are still needed to correct errors in nanopore-sequenced assemblies to obtain the accuracy required for source-tracking investigations. Our granular assessment of the performance of the polishing pipelines allowed us to suggest best practices for tool users and areas for improvement for tool developers.
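
As a quick sanity check on the headline numbers (illustrative arithmetic only, not taken from the paper's methods), a short Python sketch relating percent accuracy, Phred-style quality values, and residual error counts for a 4.8 Mbp genome:

    import math

    genome_size = 4_800_000          # ~4.8 Mbp Salmonella genome (from the abstract)

    def residual_errors(accuracy: float, size: int = genome_size) -> float:
        """Expected nucleotide errors left in an assembly at a given accuracy."""
        return (1.0 - accuracy) * size

    def phred_qv(accuracy: float) -> float:
        """Phred-style quality value, QV = -10 * log10(error rate)."""
        return -10.0 * math.log10(1.0 - accuracy)

    for acc in (0.9995, 0.999999):
        print(f"{acc:.4%} accuracy -> ~{residual_errors(acc):.0f} errors, QV{phred_qv(acc):.0f}")
    # 0.9995 -> ~2400 errors (QV33); 0.999999 -> ~5 errors (QV60),
    # matching the "~5 nucleotide errors across the 4.8 Mbp genome" figure.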


Subject(s)
Benchmarking , Disease Outbreaks , Genome, Bacterial , Nanopores , Nanopore Sequencing/methods , High-Throughput Nucleotide Sequencing/methods , Salmonella enterica/genetics , Salmonella enterica/isolation & purification , Humans , Phylogeny
2.
J Proteome Res ; 2024 Jul 09.
Article in English | MEDLINE | ID: mdl-38978496

ABSTRACT

Data-independent acquisition (DIA) techniques such as sequential window acquisition of all theoretical mass spectra (SWATH) have emerged as the preferred strategies for proteomic analyses. Our study optimized the SWATH-DIA method using a narrow isolation window placement approach, improving its proteomic performance. We optimized the acquisition parameter combinations of narrow isolation windows with different widths (1.9 and 2.9 Da) on a ZenoTOF 7600 (Sciex); the acquired data were analyzed using DIA-NN (version 1.8.1). Narrow SWATH (nSWATH) identified 5916 and 7719 protein groups from the digested peptides corresponding to 400 ng of protein from mouse liver and HEK293T cells, respectively, improving identification by 7.52 and 4.99%, respectively, compared to conventional SWATH. The median coefficient of variation of the quantified values was less than 6%. We further analyzed 200 ng of benchmark samples comprising known ratios of Escherichia coli, yeast, and human peptides using nSWATH. Consequently, it achieved accuracy and precision comparable to those of conventional SWATH, identifying an average of 95,456 precursors and 9342 protein groups across three benchmark samples, representing 12.6 and 9.63% improved identification compared to conventional SWATH. The nSWATH method improved identification at various loading amounts of benchmark samples, identifying 40.7% more protein groups at 25 ng. These results demonstrate the improved performance of nSWATH, contributing to the acquisition of deeper proteomic data from complex biological samples.
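
For readers unfamiliar with the "<6%" figure, a minimal Python sketch of how a median coefficient of variation is computed across replicate quantifications (the table shape, column names, and simulated values are hypothetical, not DIA-NN's output format):

    import numpy as np
    import pandas as pd

    # Hypothetical table: rows = protein groups, columns = replicate injections.
    quant = pd.DataFrame(
        np.random.lognormal(mean=10, sigma=0.05, size=(5000, 3)),
        columns=["rep1", "rep2", "rep3"],
    )

    # Coefficient of variation per protein group: std / mean across replicates.
    cv = quant.std(axis=1, ddof=1) / quant.mean(axis=1)
    print(f"median CV = {100 * cv.median():.2f}%")   # the abstract reports < 6%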

3.
Mol Ecol Resour ; : e13991, 2024 Jul 09.
Article in English | MEDLINE | ID: mdl-38979877

ABSTRACT

The use of short-read metabarcoding for classifying microeukaryotes is challenged by the lack of comprehensive 18S rRNA reference databases. While recent advances in high-throughput long-read sequencing provide the potential to greatly increase the phylogenetic coverage of these databases, the performance of different sequencing technologies and subsequent bioinformatics processing remain to be evaluated, primarily because of the absence of well-defined eukaryotic mock communities. To address this challenge, we created a eukaryotic rRNA operon clone-library and turned it into a precisely defined synthetic eukaryotic mock community. This mock community was then used to evaluate the performance of three long-read sequencing strategies (PacBio circular consensus sequencing and two Nanopore approaches using unique molecular identifiers) and three tools for resolving amplicon sequence variants (ASVs) (USEARCH, VSEARCH, and DADA2). We investigated the sensitivity of the sequencing techniques based on the number of detected mock taxa, and the accuracy of the different ASV-calling tools with a specific focus on the presence of chimera among the final rRNA operon ASVs. Based on our findings, we provide recommendations and best practice protocols for how to cost-effectively obtain essentially error-free rRNA operons in high throughput. An agricultural soil sample was used to demonstrate that the sequencing and bioinformatic results from the mock community also translate to highly diverse natural samples, which enables us to identify previously undescribed microeukaryotic lineages.
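
A minimal sketch of the kind of mock-community bookkeeping described here, assuming simple FASTA inputs and exact-match detection (the study itself used USEARCH, VSEARCH, and DADA2 with more sophisticated matching; file names are hypothetical):

    # Hypothetical FASTA inputs; real evaluations would allow near-matches.
    def read_fasta(path):
        seqs, name = {}, None
        with open(path) as fh:
            for line in fh:
                line = line.strip()
                if line.startswith(">"):
                    name = line[1:]
                    seqs[name] = []
                elif name:
                    seqs[name].append(line)
        return {k: "".join(v) for k, v in seqs.items()}

    mock = read_fasta("mock_operons.fasta")   # known clone-library references
    asvs = read_fasta("final_asvs.fasta")     # rRNA operon ASVs from one pipeline

    reference_set = set(mock.values())
    detected = {n for n, s in asvs.items() if s in reference_set}
    suspect = set(asvs) - detected            # candidates for chimera screening
    print(f"{len(detected)} exact-match ASVs, {len(suspect)} to screen for chimeras")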

4.
BMC Public Health ; 24(1): 1790, 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-38970046

ABSTRACT

BACKGROUND: Aboriginal and Torres Strait Islander communities in remote Australia have initiated bold policies for health-enabling stores. Benchmarking, a data-driven and facilitated 'audit and feedback' with action planning process, provides a potential strategy to strengthen and scale health-enabling best-practice adoption by remote community store directors/owners. We aim to co-design a benchmarking model with five partner organisations and test its effectiveness with Aboriginal and Torres Strait Islander community stores in remote Australia. METHODS: The study design is a pragmatic randomised controlled trial with consenting eligible stores (located in very remote Northern Territory (NT) of Australia, the primary grocery store for an Aboriginal community, and serviced by a Nutrition Practitioner with a study partner organisation). The Benchmarking model is informed by research evidence and purpose-built best-practice audit and feedback tools, and co-designed with partner organisation and community representatives. The intervention comprises two full benchmarking cycles (one per year, 2022/23 and 2023/24) of assessment, feedback, action planning and action implementation. Assessment of stores includes (i) adoption status of 21 evidence- and industry-informed health-enabling policies for remote stores; (ii) implementation of health-enabling best-practice using a purpose-built Store Scout App; (iii) price of a standardised healthy diet using the Aboriginal and Torres Strait Islander Healthy Diets ASAP protocol; and (iv) healthiness of food purchasing using sales data indicators. Partner organisations provide feedback reports and co-design action plans with stores. Control stores receive assessments and continue with usual retail practice. All stores provide weekly electronic sales data to assess the primary outcome, change in free sugars (g) to energy (MJ) from all food and drinks purchased, baseline (July-December 2021) vs July-December 2023. DISCUSSION: We hypothesise that the benchmarking intervention can improve the adoption of health-enabling store policy and practice and reduce sales of unhealthy foods and drinks in remote community stores of Australia. This innovative research with remote Aboriginal and Torres Strait Islander communities can inform effective implementation strategies for healthy food retail more broadly. TRIAL REGISTRATION: ACTRN12622000596707, Protocol version 1.
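
The primary outcome is a ratio indicator computed from sales data; a hedged pandas sketch of that computation, with entirely hypothetical columns and values standing in for the real weekly sales extracts:

    import pandas as pd

    # Hypothetical weekly sales extract; real indicator definitions follow the
    # trial protocol (free sugars in g vs energy in MJ, all food and drinks).
    sales = pd.DataFrame({
        "store": ["A", "A", "B"],
        "units_sold": [120, 40, 75],
        "free_sugars_g_per_unit": [35.0, 0.0, 12.5],
        "energy_kj_per_unit": [1500.0, 400.0, 900.0],
    })

    sales["free_sugars_g"] = sales["units_sold"] * sales["free_sugars_g_per_unit"]
    sales["energy_mj"] = sales["units_sold"] * sales["energy_kj_per_unit"] / 1000.0

    totals = sales.groupby("store")[["free_sugars_g", "energy_mj"]].sum()
    indicator = totals["free_sugars_g"] / totals["energy_mj"]
    print(indicator)   # g of free sugars per MJ purchased, per store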


Subject(s)
Benchmarking , Diet, Healthy , Food Supply , Humans , Australia , Australian Aboriginal and Torres Strait Islander Peoples , Commerce , Food Supply/standards , Rural Population , Randomized Controlled Trials as Topic
5.
Genome Biol ; 25(1): 172, 2024 07 01.
Article in English | MEDLINE | ID: mdl-38951922

ABSTRACT

BACKGROUND: Computational variant effect predictors offer a scalable and increasingly reliable means of interpreting human genetic variation, but concerns of circularity and bias have limited previous methods for evaluating and comparing predictors. Population-level cohorts of genotyped and phenotyped participants that have not been used in predictor training can facilitate an unbiased benchmarking of available methods. Using a curated set of human gene-trait associations with a reported rare-variant burden association, we evaluate the correlations of 24 computational variant effect predictors with associated human traits in the UK Biobank and All of Us cohorts. RESULTS: AlphaMissense outperformed all other predictors in inferring human traits based on rare missense variants in UK Biobank and All of Us participants. The overall rankings of computational variant effect predictors in these two cohorts showed a significant positive correlation. CONCLUSION: We describe a method to assess computational variant effect predictors that sidesteps the limitations of previous evaluations. This approach is generalizable to future predictors and could continue to inform predictor choice for personal and clinical genetics.
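
The reported cross-cohort agreement of predictor rankings is naturally expressed as a rank correlation; a small sketch with hypothetical ranks (only AlphaMissense's top position is stated in the abstract, and the predictor list beyond it is illustrative):

    from scipy.stats import spearmanr

    # Hypothetical overall rankings of the same predictors in the two cohorts
    # (1 = best); the paper reports a significant positive rank correlation.
    predictors = ["AlphaMissense", "ESM-1b", "EVE", "REVEL", "CADD", "SIFT"]
    rank_ukb = [1, 2, 3, 4, 5, 6]   # UK Biobank
    rank_aou = [1, 3, 2, 5, 4, 6]   # All of Us

    rho, p = spearmanr(rank_ukb, rank_aou)
    print(f"Spearman rho = {rho:.2f}, p = {p:.3f}")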


Subject(s)
Benchmarking , Genetic Variation , Humans , Phenotype , Computational Biology/methods , Genotype
6.
Chest ; 2024 Jul 02.
Article in English | MEDLINE | ID: mdl-38964673

ABSTRACT

BACKGROUND: When comparing outcomes after sepsis, it is essential to account for patient case mix to make fair comparisons. We developed a model to assess risk-adjusted 30-day mortality in the Michigan Hospital Medicine Safety's sepsis initiative (HMS-Sepsis). QUESTION: Can HMS-Sepsis registry data adequately predict risk of 30-day mortality? Do performance assessments using adjusted vs unadjusted data differ? STUDY DESIGN AND METHODS: Retrospective cohort of community-onset sepsis hospitalizations in the HMS-Sepsis registry (4/2022-9/2023), with split derivation (70%) and validation (30%) cohorts. We fit a risk-adjustment model (HMS-Sepsis mortality model) incorporating acute physiology, demographic, and baseline health data and assessed model performance using c-statistics, Brier scores, and comparisons of predicted vs observed mortality by deciles of risk. We compared hospital performance (1st quintile, middle quintiles, 5th quintile) using observed versus adjusted mortality to understand the extent to which risk-adjustment impacted hospital performance assessment. RESULTS: Among 17,514 hospitalizations from 66 hospitals during the study period, 12,260 (70%) were used for model derivation and 5,254 (30%) for model validation. 30-day mortality for the total cohort was 19.4%. The final model included 13 physiologic variables, two physiologic interactions, and 16 demographic and chronic health variables. The most significant variables were age, metastatic solid tumor, temperature, altered mental status, and platelet count. The model c-statistic was 0.82 for the derivation cohort, 0.81 for the validation cohort, and ≥0.78 for all subgroups assessed. Overall calibration error was 0.0% and mean calibration error across deciles of risk was 1.5%. Standardized mortality ratios yielded different assessments than observed mortality for 33.9% of hospitals. CONCLUSIONS: The HMS-Sepsis mortality model had strong discrimination, adequate calibration, and reclassified one-third of hospitals to a different performance category from unadjusted mortality. Based on its strong performance, the HMS-Sepsis mortality model can aid in fair hospital benchmarking, assessment of temporal changes, and observational causal inference analysis.
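
A hedged sketch of the evaluation metrics named here (c-statistic, Brier score, and standardized mortality ratios with performance quintiles), run on simulated data rather than the HMS-Sepsis registry:

    import numpy as np
    import pandas as pd
    from sklearn.metrics import roc_auc_score, brier_score_loss

    rng = np.random.default_rng(0)
    n = 17_514                                    # hospitalizations, as in the cohort
    df = pd.DataFrame({
        "hospital": rng.integers(0, 66, n),       # 66 hospitals
        "pred_risk": rng.beta(2, 8, n),           # model-predicted 30-day mortality
    })
    df["died"] = rng.random(n) < df["pred_risk"]  # simulated outcomes

    print("c-statistic:", round(roc_auc_score(df["died"], df["pred_risk"]), 2))
    print("Brier score:", round(brier_score_loss(df["died"], df["pred_risk"]), 3))

    # Standardized mortality ratio per hospital: observed / model-expected deaths,
    # then performance quintiles (1st, middle, 5th) as in the paper's comparison.
    by = df.groupby("hospital")[["died", "pred_risk"]].sum()
    smr = by["died"] / by["pred_risk"]
    quintile = pd.qcut(smr, 5, labels=False) + 1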

7.
IUCrJ ; 2024 Sep 01.
Article in English | MEDLINE | ID: mdl-38989800

ABSTRACT

Stimulated by informal conversations at the XVII International Small Angle Scattering (SAS) conference (Traverse City, 2017), an international team of experts undertook a round-robin exercise to produce a large dataset from proteins under standard solution conditions. These data were used to generate consensus SAS profiles for xylose isomerase, urate oxidase, xylanase, lysozyme and ribonuclease A. Here, we apply a new protocol using maximum likelihood with a larger number of the contributed datasets to generate improved consensus profiles. We investigate the fits of these profiles to predicted profiles from atomic coordinates that incorporate different models to account for the contribution to the scattering of water molecules of hydration surrounding proteins in solution. Programs using an implicit, shell-type hydration layer generally optimize fits to experimental data with the aid of two parameters that adjust the volume of the bulk solvent excluded by the protein and the contrast of the hydration layer. For these models, we found the error-weighted residual differences between the model and the experiment generally reflected the subsidiary maxima and minima in the consensus profiles that are determined by the size of the protein plus the hydration layer. By comparison, all-atom solute and solvent molecular dynamics (MD) simulations are without the benefit of adjustable parameters and, nonetheless, they yielded at least equally good fits with residual differences that are less reflective of the structure in the consensus profile. Further, where MD simulations accounted for the precise solvent composition of the experiment, specifically the inclusion of ions, the modelled radius of gyration values were significantly closer to the experiment. The power of adjustable parameters to mask real differences between a model and the structure present in solution is demonstrated by the results for the conformationally dynamic ribonuclease A and calculations with pseudo-experimental data. This study shows that, while methods invoking an implicit hydration layer have the unequivocal advantage of speed, care is needed to understand the influence of the adjustable parameters. All-atom solute and solvent MD simulations are slower but are less susceptible to false positives, and can account for thermal fluctuations in atomic positions, and more accurately represent the water molecules of hydration that contribute to the scattering profile.
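
For context on the radius of gyration comparisons, a sketch of the standard Guinier-approximation estimate, ln I(q) ≈ ln I(0) − q²Rg²/3 for q·Rg ≲ 1.3; this is textbook SAS practice, not the consensus-profile or MD protocol itself:

    import numpy as np

    def guinier_rg(q: np.ndarray, intensity: np.ndarray, qmax: float) -> float:
        """Estimate Rg by a linear fit of ln I vs q^2 over the Guinier region."""
        mask = q <= qmax
        slope, _ = np.polyfit(q[mask] ** 2, np.log(intensity[mask]), 1)
        return float(np.sqrt(-3.0 * slope))

    # Synthetic Guinier-form profile for a particle with Rg = 20 A, as a test.
    q = np.linspace(0.005, 0.05, 100)          # scattering vector, 1/A
    I = 1e4 * np.exp(-(q * 20.0) ** 2 / 3.0)
    print(guinier_rg(q, I, qmax=1.3 / 20.0))   # recovers ~20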

8.
Hum Brain Mapp ; 45(10): e26768, 2024 Jul 15.
Article in English | MEDLINE | ID: mdl-38949537

ABSTRACT

Structural neuroimaging data have been used to compute an estimate of the biological age of the brain (brain-age) which has been associated with other biologically and behaviorally meaningful measures of brain development and aging. The ongoing research interest in brain-age has highlighted the need for robust and publicly available brain-age models pre-trained on data from large samples of healthy individuals. To address this need we have previously released a developmental brain-age model. Here we expand this work to develop, empirically validate, and disseminate a pre-trained brain-age model to cover most of the human lifespan. To achieve this, we selected the best-performing model after systematically examining the impact of seven site harmonization strategies, age range, and sample size on brain-age prediction in a discovery sample of brain morphometric measures from 35,683 healthy individuals (age range: 5-90 years; 53.59% female). The pre-trained models were tested for cross-dataset generalizability in an independent sample comprising 2101 healthy individuals (age range: 8-80 years; 55.35% female) and for longitudinal consistency in a further sample comprising 377 healthy individuals (age range: 9-25 years; 49.87% female). This empirical examination yielded the following findings: (1) the accuracy of age prediction from morphometry data was higher when no site harmonization was applied; (2) dividing the discovery sample into two age-bins (5-40 and 40-90 years) provided a better balance between model accuracy and explained age variance than other alternatives; (3) model accuracy for brain-age prediction plateaued at a sample size exceeding 1600 participants. These findings have been incorporated into CentileBrain (https://centilebrain.org/#/brainAGE2), an open-science, web-based platform for individualized neuroimaging metrics.
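
A minimal sketch of the brain-age workflow described (predict age from morphometric features, then take the predicted-minus-chronological gap); the model family and data here are hypothetical placeholders, not the CentileBrain model:

    import numpy as np
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(1)
    n, p = 2000, 150                          # subjects x morphometric features
    X = rng.normal(size=(n, p))
    age = rng.uniform(5, 90, n)
    X[:, 0] += 0.05 * age                     # inject an age-related signal

    # Hypothetical model family; CentileBrain's own model is not specified here.
    Xtr, Xte, ytr, yte = train_test_split(X, age, random_state=0)
    model = Ridge(alpha=1.0).fit(Xtr, ytr)

    brain_age = model.predict(Xte)
    gap = brain_age - yte                     # brain-age gap (predicted - chronological)
    print("MAE:", np.abs(gap).mean().round(2))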


Subject(s)
Aging , Brain , Magnetic Resonance Imaging , Humans , Adolescent , Female , Aged , Adult , Child , Young Adult , Male , Brain/diagnostic imaging , Brain/anatomy & histology , Brain/growth & development , Aged, 80 and over , Child, Preschool , Middle Aged , Aging/physiology , Magnetic Resonance Imaging/methods , Neuroimaging/methods , Neuroimaging/standards , Sample Size
9.
Data Brief ; 54: 110271, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38962205

ABSTRACT

ArzEn-MultiGenre is a parallel dataset of Egyptian Arabic song lyrics, novels, and TV show subtitles that are manually translated and aligned with their English counterparts. The dataset contains 25,557 segment pairs that can be used to benchmark new machine translation models, fine-tune large language models in few-shot settings, and adapt commercial machine translation applications such as Google Translate. Additionally, the dataset is a valuable resource for research in various disciplines, including translation studies, cross-linguistic analysis, and lexical semantics. The dataset can also serve pedagogical purposes by training translation students and aid professional translators as a translation memory. The contributions are twofold: first, the dataset features textual genres not found in existing parallel Egyptian Arabic and English datasets, and second, it is a gold-standard dataset that has been translated and aligned by human experts.
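
One of the stated uses is benchmarking machine translation; a short sketch of scoring system output against the dataset's English references with sacrebleu (the segments shown are invented placeholders, not dataset content):

    import sacrebleu

    # Hypothetical English references from aligned segment pairs, plus a
    # candidate MT system's outputs to score against them.
    references = ["He told me the whole story from the beginning."]
    hypotheses = ["He told me the entire story from the start."]

    bleu = sacrebleu.corpus_bleu(hypotheses, [references])
    print(bleu.score)   # corpus-level BLEU for the candidate system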

10.
J Environ Manage ; 366: 121676, 2024 Jul 06.
Article in English | MEDLINE | ID: mdl-38972187

ABSTRACT

The challenges posed by unsustainable practices in today's economy underscore the urgent need for a transition toward a circular economy (CE) and a holistic supply chain (SC) perspective. Benchmarking plays a pivotal role in managing circular SCs, offering a metric to gauge progress. However, the lack of consensus on the optimal benchmarking approach hampers effective implementation of circular business practices. To address this gap, we conducted a systematic review of the literature, identifying 29 pertinent publications. The analysis revealed 30 unique attributes and sub-attributes for benchmarking circularity, which were clustered into five main attributes. The main attributes are goals, subjects, key performance indicators (KPIs), data sources, and evaluation methods, while the sub-attributes are organised as features of the main attributes and depicted as a feature model. Drawing from selected publications, we illustrated each feature with examples. Our model offers a comprehensive benchmarking reference for circularity and will be a valuable tool for managers in the transition toward circularity. Supply chains seeking to benchmark their transition to circularity can apply the reference model to ensure that their benchmarking strategy is consistent with state-of-the-art knowledge. By providing a generic circularity benchmarking approach that is valid for diverse economic sectors, our findings contribute to theoretical efforts to address the lack of generic frameworks for CE.

11.
Circ Cardiovasc Qual Outcomes ; : e010637, 2024 Jun 18.
Article in English | MEDLINE | ID: mdl-38887950

ABSTRACT

BACKGROUND: Cardiogenic shock is a morbid complication of heart disease that claims the lives of more than 1 in 3 patients presenting with this syndrome. Supporting a unique collaboration across clinical specialties, federal regulators, payors, and industry, the American Heart Association volunteers and staff have launched a quality improvement registry to better understand the clinical manifestations of shock phenotypes and to benchmark the management patterns and outcomes of patients presenting with cardiogenic shock to hospitals across the United States. METHODS: Participating hospitals will enroll consecutive hospitalized patients with cardiogenic shock, regardless of etiology or severity. Data are collected through individual reviews of medical records of sequential adult patients with cardiogenic shock. The electronic case record form was collaboratively designed with a core minimum data structure and aligned with Shock Academic Research Consortium definitions. This registry will allow participating health systems to evaluate patient-level data including diagnostic approaches, therapeutics, use of advanced monitoring and circulatory support, processes of care, complications, and in-hospital survival. Participating sites can leverage these data for onsite monitoring of outcomes and benchmarking versus other institutions. The registry was concomitantly designed to provide a high-quality longitudinal research infrastructure for pragmatic randomized trials as well as translational, clinical, and implementation research. An aggregate deidentified data set will be made available to the research community on the American Heart Association's Precision Medicine Platform. On March 31, 2022, the American Heart Association Cardiogenic Shock Registry received its first clinical records. At the time of this submission, 100 centers are participating. CONCLUSIONS: The American Heart Association Cardiogenic Shock Registry will serve as a resource using consistent data structure and definitions for the medical and research community to accelerate scientific advancement through shared learning and research, resulting in improved quality of care and outcomes of shock patients.

12.
Am J Epidemiol ; 2024 Jun 18.
Article in English | MEDLINE | ID: mdl-38896054

ABSTRACT

Cardiovascular disease (CVD) is a leading cause of death globally. Angiotensin-converting enzyme inhibitors (ACEi) and angiotensin receptor blockers (ARB), compared in the ONTARGET trial, each prevent CVD. However, trial results may not be generalisable and their effectiveness in underrepresented groups is unclear. Using trial emulation methods within routine-care data to validate findings, we explored the generalisability of ONTARGET results. For people prescribed an ACEi/ARB in the UK Clinical Practice Research Datalink GOLD from 1/1/2001-31/7/2019, we applied trial criteria and propensity-score methods to create an ONTARGET trial-eligible cohort. Comparing ARB to ACEi, we estimated hazard ratios for the primary composite trial outcome (cardiovascular death, myocardial infarction, stroke, or hospitalisation for heart failure), and secondary outcomes. As the pre-specified criteria were met confirming trial emulation, we then explored treatment heterogeneity among three trial-underrepresented subgroups: females, those aged ≥75 years and those with chronic kidney disease (CKD). In the trial-eligible population (n=137,155), results for the primary outcome demonstrated similar effects of ARB and ACEi (HR 0.97 [95% CI: 0.93, 1.01]), meeting the pre-specified validation criteria. When extending this outcome to trial-underrepresented groups, similar treatment effects were observed by sex, age and CKD. This suggests that ONTARGET trial findings are generalisable to trial-underrepresented subgroups.
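
A hedged sketch of the general trial-emulation recipe named here (propensity scores, then a weighted Cox model for the hazard ratio), on simulated data; the study's actual covariate set and weighting scheme are more elaborate:

    import numpy as np
    import pandas as pd
    from lifelines import CoxPHFitter
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(2)
    n = 10_000
    df = pd.DataFrame({
        "arb": rng.integers(0, 2, n),        # 1 = ARB, 0 = ACEi (hypothetical data)
        "age": rng.normal(66, 8, n),
        "ckd": rng.integers(0, 2, n),
    })
    df["time"] = rng.exponential(5, n)       # years to event or censoring
    df["event"] = rng.random(n) < 0.25       # composite outcome indicator

    # Propensity of receiving ARB given covariates -> inverse-probability weights.
    covs = df[["age", "ckd"]]
    ps = LogisticRegression().fit(covs, df["arb"]).predict_proba(covs)[:, 1]
    df["iptw"] = np.where(df["arb"] == 1, 1 / ps, 1 / (1 - ps))

    cph = CoxPHFitter()
    cph.fit(df[["arb", "time", "event", "iptw"]], duration_col="time",
            event_col="event", weights_col="iptw", robust=True)
    print(cph.hazard_ratios_["arb"])         # compare with the reported HR 0.97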

13.
Front Radiol ; 4: 1386906, 2024.
Article in English | MEDLINE | ID: mdl-38836218

ABSTRACT

Introduction: This study is a retrospective evaluation of the performance of deep learning models that were developed for the detection of COVID-19 from chest x-rays, undertaken with the goal of assessing the suitability of such systems as clinical decision support tools. Methods: Models were trained on the National COVID-19 Chest Imaging Database (NCCID), a UK-wide multi-centre dataset from 26 different NHS hospitals, and evaluated on independent multi-national clinical datasets. The evaluation considers clinical and technical contributors to model error and potential model bias. Model predictions are examined for spurious feature correlations using techniques for explainable prediction. Results: Models performed adequately on NHS populations, with performance comparable to radiologists, but generalised poorly to international populations. Models performed better in males than females, and performance varied across age groups. Alarmingly, models routinely failed when applied to complex clinical cases with confounding pathologies and when applied to radiologist-defined "mild" cases. Discussion: This comprehensive benchmarking study examines the pitfalls in current practices that have led to impractical model development. Key findings highlight the need for clinician involvement at all stages of model development, from data curation and label definition, to model evaluation, to ensure that all clinical factors and disease features are appropriately considered during model design. This is imperative to ensure automated approaches developed for disease detection are fit-for-purpose in a clinical setting.

14.
Sci Rep ; 14(1): 13406, 2024 Jun 11.
Article in English | MEDLINE | ID: mdl-38862672

ABSTRACT

This article investigates an inventive methodology for precisely and efficiently controlling photovoltaic emulator (PVE) prototypes, which are employed in the assessment of solar systems. A modification to the Shift controller (SC), which is regarded as a leading PVE controller, is proposed. In addition to efficiency and accuracy, the novel controller places a high emphasis on improving transient performance. The novel piecewise linear-logarithmic adaptation utilized by the Modified-Shift controller (M-SC) enables the controller to linearly adapt to the load burden within a specified operating range. At reduced load resistances, the transient speed of the PVE can be increased through the implementation of this scheme. An exceedingly short settling time of the PVE is ensured by a logarithmic modification of the control action beyond the critical point. In order to analyze the M-SC in the context of PVE control, numerical investigations implemented in MATLAB/Simulink (Version: Simulink 10.4, URL: https://in.mathworks.com/products/simulink.html) were utilized. To assess the effectiveness of the suggested PVE, three benchmarking profiles are presented: eight scenarios involving irradiance/PVE load, continuously varying irradiance/temperature, and rapidly changing loads. These profiles include metrics such as settling time, efficiency, Integral of Absolute Error (IAE), and percentage error (epve). As suggested, the M-SC attains an approximate twofold increase in speed over the conventional SC, according to the findings. This is substantiated by an efficiency increase of 2.2%, an expeditiousness enhancement of 5.65%, and an IAE rise of 5.65%. Based on the results of this research, the new M-SC delivers a sustained improvement in the PVE's dynamic operation, making it highly suitable for evaluating solar systems in ever-changing environments.
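
The abstract describes the adaptation law only qualitatively; a speculative sketch of a piecewise linear-logarithmic gain of the kind described, with all constants invented for illustration:

    import math

    def adapted_gain(load_resistance: float, r_crit: float = 50.0,
                     k_lin: float = 0.02, k_log: float = 1.5) -> float:
        """Piecewise linear-logarithmic adaptation of a control action.

        Below the critical load the gain grows linearly with resistance; beyond
        it, growth becomes logarithmic so the control action stays bounded and
        settling time remains short. Constants are illustrative, not from the
        paper; the function is continuous at r_crit.
        """
        if load_resistance <= r_crit:
            return 1.0 + k_lin * load_resistance
        base = 1.0 + k_lin * r_crit
        return base + k_log * math.log(load_resistance / r_crit)

    for r in (10, 50, 200, 1000):
        print(r, round(adapted_gain(r), 3))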

15.
Res Sq ; 2024 May 21.
Article in English | MEDLINE | ID: mdl-38826386

ABSTRACT

Detecting very minor (< 1%) subpopulations using next-generation sequencing is a critical need for multiple applications, including detection of drug-resistant pathogens and somatic variant detection in oncology. To enable these applications, wet-lab enhancements and bioinformatic error correction methods have been developed for 'sequencing by synthesis' technology to reduce its inherent sequencing error rate. A recently available sequencing approach termed 'sequencing by binding' claims to deliver higher base-calling accuracy "out of the box." This paper evaluates the utility of using 'sequencing by binding' for the detection of ultra-rare subpopulations down to 0.001%.
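
To see why 0.001% is demanding, a small Poisson-approximation calculation of the chance of observing at least a few variant reads at a given depth (illustrative arithmetic, not the paper's statistical model):

    from math import exp

    def p_at_least_k(depth: int, freq: float, k: int) -> float:
        """P(>= k variant reads), Poisson approximation with mean depth*freq."""
        lam = depth * freq
        p_lt_k, term = 0.0, exp(-lam)        # term starts at P(0 reads)
        for i in range(k):
            p_lt_k += term
            term *= lam / (i + 1)            # P(i+1) = P(i) * lam / (i+1)
        return 1.0 - p_lt_k

    # A 0.001% (1e-5) subpopulation needs enormous depth, and a per-base error
    # rate well below 1e-5, to be distinguishable from sequencing noise.
    for depth in (100_000, 1_000_000, 5_000_000):
        print(depth, round(p_at_least_k(depth, 1e-5, k=3), 3))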

16.
Data Brief ; 54: 110543, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38868385

ABSTRACT

Conifer shoots exhibit intricate geometries at an exceptionally detailed spatial scale. Describing the complete structure of a conifer shoot, which contributes to a radiation scattering pattern, has been difficult, and the corresponding components of previous radiative transfer models for conifer stands were rather coarse. This paper presents a dataset aimed at models and applications requiring detailed 3D representations of needle shoots. The data collection was conducted in the Järvselja RAdiation transfer Model Intercomparison (RAMI) pine stand in Estonia. The dataset includes 3-dimensional surface information on 10 shoots of two conifer species present in the stand (5 shoots per species) - Scots pine (Pinus sylvestris L.) and Norway spruce (Picea abies (L.) Karst.). The samples were collected on 26th July 2022, and subsequently a blue-light 3D photogrammetry scanning technique was used to obtain their high-resolution 3D point cloud representations. For each of these samples, the dataset comprises a photo of the sampled shoot and its 3-dimensional surface reconstruction. Scanned shoots may replace previous, artificially generated models and contribute to more realistic 3D forest representations and, consequently, more accurate estimates of related parameters and processes by radiative transfer models.

17.
Genome Biol ; 25(1): 159, 2024 06 17.
Article in English | MEDLINE | ID: mdl-38886757

ABSTRACT

BACKGROUND: The advent of single-cell RNA-sequencing (scRNA-seq) has driven significant computational methods development for all steps in the scRNA-seq data analysis pipeline, including filtering, normalization, and clustering. The large number of methods and their resulting parameter combinations has created a combinatorial set of possible pipelines to analyze scRNA-seq data, which leads to the obvious question: which is best? Several benchmarking studies compare methods but frequently find variable performance depending on dataset and pipeline characteristics. Alternatively, the large number of scRNA-seq datasets along with advances in supervised machine learning raise a tantalizing possibility: could the optimal pipeline be predicted for a given dataset? RESULTS: Here, we begin to answer this question by applying 288 scRNA-seq analysis pipelines to 86 datasets and quantifying pipeline success via a range of measures evaluating cluster purity and biological plausibility. We build supervised machine learning models to predict pipeline success given a range of dataset and pipeline characteristics. We find that prediction performance is significantly better than random and that in many cases pipelines predicted to perform well provide clustering outputs similar to expert-annotated cell type labels. We identify characteristics of datasets that correlate with strong prediction performance that could guide when such prediction models may be useful. CONCLUSIONS: Supervised machine learning models have utility for recommending analysis pipelines and therefore the potential to alleviate the burden of choosing from the near-infinite number of possibilities. Different aspects of datasets influence the predictive performance of such models which will further guide users.
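
A minimal sketch of the prediction task's shape (features describing the dataset and the candidate pipeline, a success score as the target); features, encodings, and the random placeholder target are hypothetical, to be replaced by real scores from the 288-pipeline x 86-dataset grid:

    import numpy as np
    import pandas as pd
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(3)
    n = 2000   # dataset x pipeline combinations

    # Hypothetical features describing the dataset and the candidate pipeline.
    X = pd.DataFrame({
        "n_cells": rng.integers(500, 50_000, n),
        "median_genes_per_cell": rng.integers(200, 6000, n),
        "norm_method": rng.integers(0, 4, n),      # encoded pipeline choices
        "cluster_method": rng.integers(0, 3, n),
    })
    y = rng.random(n)   # placeholder success score (e.g., cluster purity / ARI)

    model = RandomForestRegressor(n_estimators=200, random_state=0)
    print(cross_val_score(model, X, y, cv=5, scoring="r2").mean())
    # Random targets give R^2 near zero; real scores are needed for signal.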


Subject(s)
Benchmarking , RNA-Seq , Single-Cell Analysis , Single-Cell Analysis/methods , RNA-Seq/methods , Humans , Supervised Machine Learning , Sequence Analysis, RNA/methods , Cluster Analysis , Computational Biology/methods , Machine Learning , Animals , Single-Cell Gene Expression Analysis
18.
Cell Genom ; 4(7): 100592, 2024 Jul 10.
Article in English | MEDLINE | ID: mdl-38925122

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) datasets contain true single cells, or singlets, in addition to cells that coalesce during the protocol, or doublets. Identifying singlets with high fidelity in scRNA-seq is necessary to avoid false negative and false positive discoveries. Although several methodologies have been proposed, they are typically tested on highly heterogeneous datasets and lack a priori knowledge of true singlets. Here, we leveraged datasets with synthetically introduced DNA barcodes for a hitherto unexplored application: to extract ground-truth singlets. We demonstrated the feasibility of our framework, "singletCode," to evaluate existing doublet detection methods across a range of contexts. We also leveraged our ground-truth singlets to train a proof-of-concept machine learning classifier, which outperformed other doublet detection algorithms. Our integrative framework can identify ground-truth singlets and enable robust doublet detection in non-barcoded datasets.
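
A toy sketch of the core idea, assuming one lineage barcode per true singlet; singletCode's actual extraction logic (e.g., read-count thresholds and filtering) is more involved:

    import pandas as pd

    # Hypothetical cell -> lineage-barcode assignments from a barcoded experiment.
    calls = pd.DataFrame({
        "cell":    ["c1", "c2", "c2", "c3", "c4", "c4"],
        "barcode": ["b7", "b1", "b9", "b7", "b2", "b2"],
    })

    # One unique barcode per cell = ground-truth singlet; several = likely doublet.
    n_barcodes = calls.groupby("cell")["barcode"].nunique()
    singlets = n_barcodes[n_barcodes == 1].index.tolist()
    doublets = n_barcodes[n_barcodes > 1].index.tolist()
    print(singlets, doublets)   # ['c1', 'c3', 'c4'] ['c2']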


Subject(s)
Algorithms , DNA Barcoding, Taxonomic , Single-Cell Analysis , Single-Cell Analysis/methods , DNA Barcoding, Taxonomic/methods , Humans , Machine Learning , Sequence Analysis, RNA/methods , Animals , Single-Cell Gene Expression Analysis
19.
Cancers (Basel) ; 16(12)2024 Jun 13.
Article in English | MEDLINE | ID: mdl-38927915

ABSTRACT

BACKGROUND: Sarcomas present a unique challenge within healthcare systems due to their rarity and complex treatment requirements. This study explores the economic impact of sarcoma surgeries across three Swiss tertiary healthcare institutions, utilizing a consistent surgical approach by a single surgeon to eliminate variability in surgical expertise as a confounding factor. METHODS: By analyzing data from 356 surgeries recorded in a real-world-time data warehouse, this study assesses surgical and hospital costs relative to institutional characteristics and surgical complexity. RESULTS: Our findings reveal significant cost variations driven more by institutional resource management and pricing strategies than by surgical techniques. Surgical and total hospitalization costs were analyzed in relation to tumor dignity and complexity scores, showing that higher complexity and malignancy significantly increase costs. Interestingly, it was found that surgical costs accounted for only one-third of the total hospitalization costs, highlighting the substantial impact of non-surgical factors on the overall cost of care. CONCLUSIONS: The study underscores the need for standardized cost assessment practices and highlights the potential of predictive models in enhancing resource allocation and surgical planning. By advocating for value-based healthcare models and standardized treatment guidelines, this research contributes to more equitable and sustainable healthcare delivery for sarcoma patients. These insights affirm the necessity of including a full spectrum of care costs in value-based models to truly optimize healthcare delivery. They also prompt a reevaluation of current policies and encourage further research across diverse geographical settings to refine cost management strategies in sarcoma treatment.

20.
Sensors (Basel) ; 24(12)2024 Jun 20.
Article in English | MEDLINE | ID: mdl-38931791

ABSTRACT

The IoT has become an integral part of the technological ecosystem that we all depend on. The increase in the number of IoT devices has also brought with it security concerns. Lightweight cryptography (LWC) has evolved to be a promising solution to improve the privacy and confidentiality of IoT devices. The challenge is to choose the right algorithm from a plethora of choices. This work compares three LWC algorithms: AES-128, SPECK, and ASCON. The comparison is made by measuring criteria such as execution time, memory utilization, latency, throughput, and security robustness of the algorithms on IoT boards with constrained computational capabilities and power. These metrics are crucial for determining suitability and help in making informed decisions when choosing cryptographic algorithms that strike a balance between security and performance. The evaluation shows that SPECK exhibits the best performance of the three on resource-constrained IoT devices.
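
A hedged sketch of a throughput-measurement harness of the kind described; AES-128 uses pycryptodome, while SPECK and ASCON implementations would be plugged into the same harness (and results on a desktop will differ from those on a constrained IoT board):

    import os
    import time
    from Crypto.Cipher import AES   # pycryptodome; SPECK/ASCON need other packages

    def throughput_mbps(encrypt, payload: bytes, rounds: int = 200) -> float:
        """Rough throughput of an encrypt(bytes) -> bytes callable, in MB/s."""
        start = time.perf_counter()
        for _ in range(rounds):
            encrypt(payload)
        elapsed = time.perf_counter() - start
        return rounds * len(payload) / elapsed / 1e6

    key = os.urandom(16)                       # 128-bit key
    payload = os.urandom(1 << 16)              # 64 KiB of random data
    aes = AES.new(key, AES.MODE_CTR, nonce=os.urandom(8))
    print("AES-128-CTR:", round(throughput_mbps(aes.encrypt, payload), 1), "MB/s")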
