1.
Gigascience ; 13, 2024 Jan 02.
Article in English | MEDLINE | ID: mdl-38837943

ABSTRACT

Genomic information is increasingly used to inform medical treatments and manage future disease risks. However, any personal and societal gains must be carefully balanced against the risk to individuals contributing their genomic data. Expanding our understanding of actionable genomic insights requires researchers to access large global datasets to capture the complexity of genomic contribution to diseases. Similarly, clinicians need efficient access to a patient's genome as well as population-representative historical records for evidence-based decisions. Both researchers and clinicians hence rely on participants to consent to the use of their genomic data, which in turn requires trust in the professional and ethical handling of this information. Here, we review existing and emerging solutions for secure and effective genomic information management, including storage, encryption, consent, and authorization that are needed to build participant trust. We discuss recent innovations in cloud computing, quantum-computing-proof encryption, and self-sovereign identity. These innovations can augment key developments from within the genomics community, notably GA4GH Passports and the Crypt4GH file container standard. We also explore how decentralized storage as well as the digital consenting process can offer culturally acceptable processes to encourage data contributions from ethnic minorities. We conclude that the individual and their right to self-determination need to be put at the center of any genomics framework, because only on an individual level can the received benefits be accurately balanced against the risk of exposing private information.


Subject(s)
Genomics , Humans , Genomics/methods , Genomics/ethics , Computer Security , Cloud Computing , Informed Consent
2.
Stud Health Technol Inform ; 310: 810-814, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38269921

ABSTRACT

Genetic data is limited and generating new datasets is often an expensive, time-consuming process, involving countless moving parts to genotype and phenotype individuals. While sharing data is beneficial for quality control and software development, privacy and security are of utmost importance. Generating synthetic data is a practical solution to mitigate the cost, time and sensitivities that hamper developers and researchers in producing and validating novel biotechnological solutions to data intensive problems. Existing methods focus on mutation frequencies at specific loci while ignoring epistatic interactions. Alternatively, programs that do consider epistasis are limited to two-way interactions or apply genomic constraints that make synthetic data generation arduous or computationally intensive. To solve this, we developed Polygenic Epistatic Phenotype Simulator (PEPS). Our tool is a probabilistic model that can generate synthetic phenotypes with a controllable level of complexity.


Subject(s)
Biotechnology , Models, Statistical , Humans , Computer Simulation , Phenotype , Genotype
3.
Stud Health Technol Inform ; 310: 820-824, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38269923

ABSTRACT

Healthcare data is a scarce resource and access is often cumbersome. While medical software development would benefit from real datasets, the privacy of the patients is held at a higher priority. Realistic synthetic healthcare data can fill this gap by providing a dataset for quality control while at the same time preserving the patient's anonymity and privacy. Existing methods focus on American or European patient healthcare data but none is exclusively focused on the Australian population. Australia is a highly diverse country that has a unique healthcare system. To overcome this problem, we used a popular publicly available tool, Synthea, to generate disease progressions based on the Australian population. With this approach, we were able to generate 100,000 patients following Queensland (Australia) demographics.


Subject(s)
Health Facilities , Privacy , Humans , Australia , Queensland , Disease Progression
4.
Comput Struct Biotechnol J ; 21: 4354-4360, 2023.
Article in English | MEDLINE | ID: mdl-37711185

ABSTRACT

Random forests (RFs) are a widely used modelling tool capable of feature selection via a variable importance measure (VIM); however, a threshold is needed to control for false positives. In the absence of a good understanding of the characteristics of VIMs, many current approaches attempt to select features associated with the response by training multiple RFs to generate statistical power via a permutation null, by employing recursive feature elimination, or through a combination of both. However, for high-dimensional datasets these approaches become computationally infeasible. In this paper, we present RFlocalfdr, a statistical approach, built on the empirical Bayes argument of Efron, for thresholding mean decrease in impurity (MDI) importances. It identifies features significantly associated with the response while controlling the false positive rate. Using synthetic data and real-world data in health, we demonstrate that RFlocalfdr has equivalent accuracy to currently published approaches, while being orders of magnitude faster. We show that RFlocalfdr can successfully threshold a dataset of 10^6 datapoints, establishing its usability for large-scale datasets, like genomics. Furthermore, RFlocalfdr is compatible with any RF implementation that returns a VIM and counts, making it a versatile feature selection tool that reduces false discoveries.
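
For orientation, the empirical Bayes argument of Efron mentioned in this abstract is conventionally written as a two-group mixture. The symbols below (z for an observed statistic, \pi_0 for the null proportion, f_0 and f_1 for the null and non-null densities) follow the usual local false discovery rate notation and are an assumption here; the abstract does not state how RFlocalfdr estimates these quantities for MDI importances.

```latex
f(z) = \pi_0 f_0(z) + (1 - \pi_0) f_1(z), \qquad
\mathrm{fdr}(z) = \frac{\pi_0 f_0(z)}{f(z)}
```

Under this reading, a feature would be reported when fdr(z) at its importance value falls below a chosen cutoff.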

5.
Eur J Epidemiol ; 38(10): 1043-1052, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37555907

ABSTRACT

Periodic revisions of the international classification of diseases (ICD) ensure that the classification reflects new practices and knowledge; however, this complicates retrospective research as diagnoses are coded in different versions. For longitudinal disease trajectory studies, a crosswalk is an essential tool and a comprehensive mapping between ICD-8 and ICD-10 has until now been lacking. In this study, we map all ICD-8 morbidity codes to ICD-10 in the expanded Danish ICD version. We mapped ICD-8 codes to ICD-10, using a many-to-one system inspired by general equivalence mappings such that each ICD-8 code maps to a single ICD-10 code. Each ICD-8 code was manually and unidirectionally mapped to a single ICD-10 code based on medical setting and context. Each match was assigned a score (1 of 4 levels) reflecting the quality of the match and, if applicable, a "flag" signalling choices made in the mapping. We provide the first complete mapping of the 8596 ICD-8 morbidity codes to ICD-10 codes. All Danish ICD-8 codes representing diseases were mapped and 5106 (59.4%) achieved the highest consistency score. Only 334 (3.9%) of the ICD-8 codes received the lowest mapping consistency score. The mapping provides a scaffold for translation of ICD-8 to ICD-10, which enables longitudinal disease studies back to 1969 in Denmark and to 1965 internationally with further adaptation.
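
A many-to-one crosswalk of the kind described in this abstract can be pictured as a lookup table in which each source code carries one target code, a quality score, and an optional flag. The sketch below is illustrative only: the code labels are invented placeholders (not real ICD-8 or ICD-10 codes), the names `CrosswalkEntry`, `CROSSWALK` and `translate` are hypothetical, and the convention that a larger score means a better match is an assumption, since the abstract does not say how the four levels are numbered.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CrosswalkEntry:
    icd10: str                  # the single ICD-10 target for this ICD-8 code
    score: int                  # match quality, 1..4 (larger = better is an assumption)
    flag: Optional[str] = None  # optional note on a choice made in the mapping


# Placeholder labels only: these are NOT real ICD-8/ICD-10 pairs.
CROSSWALK = {
    "ICD8_A": CrosswalkEntry("ICD10_X", 1),
    "ICD8_B": CrosswalkEntry("ICD10_X", 2, flag="example flag"),
    "ICD8_C": CrosswalkEntry("ICD10_Y", 4),
}


def translate(icd8_code: str, min_score: int = 1) -> Optional[str]:
    """Return the ICD-10 target, or None if unmapped or below min_score."""
    entry = CROSSWALK.get(icd8_code)
    if entry is None or entry.score < min_score:
        return None
    return entry.icd10
```

Because the mapping is unidirectional, two source codes may share one target (as `ICD8_A` and `ICD8_B` do here), so the table cannot be inverted without losing information.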

6.
PLOS Digit Health ; 2(6): e0000116, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37294826

ABSTRACT

Frequent assessment of the severity of illness for hospitalized patients is essential in clinical settings to prevent outcomes such as in-hospital mortality and unplanned admission to the intensive care unit (ICU). Classical severity scores have been developed typically using relatively few patient features. Recently, deep learning-based models demonstrated better individualized risk assessments compared to classic risk scores, thanks to the use of aggregated and more heterogeneous data sources for dynamic risk prediction. We investigated to what extent deep learning methods can capture patterns of longitudinal change in health status using time-stamped data from electronic health records. We developed a deep learning model based on embedded text from multiple data sources and recurrent neural networks to predict the risk of the composite outcome of unplanned ICU transfer and in-hospital death. The risk was assessed at regular intervals during the admission for different prediction windows. Input data included medical history, biochemical measurements, and clinical notes from a total of 852,620 patients admitted to non-intensive care units in 12 hospitals in Denmark's Capital Region and Region Zealand during 2011-2016 (with a total of 2,241,849 admissions). We subsequently explained the model using the Shapley algorithm, which provides the contribution of each feature to the model outcome. The best model used all data modalities with an assessment rate of 6 hours, a prediction window of 14 days and an area under the receiver operating characteristic curve of 0.898. The discrimination and calibration obtained with this model make it a viable clinical support tool to detect patients at higher risk of clinical deterioration, providing clinicians insights into both actionable and non-actionable patient features.

7.
Sci Rep ; 11(1): 9704, 2021 05 06.
Article in English | MEDLINE | ID: mdl-33958686

ABSTRACT

Diabetic retinopathy (DR) is a leading cause of blindness and affects millions of people throughout the world. Early detection and timely checkups are key to reducing the risk of blindness. Automated grading of DR is a cost-effective way to ensure early detection and timely checkups. Deep learning, or more specifically convolutional neural network (CNN)-based methods, produce state-of-the-art performance in DR detection. Whilst CNN-based methods have been proposed, no comparisons have been done between the extracted image features and their clinical relevance. Here we first adopt a CNN visualization strategy to discover the inherent image features involved in the CNN's decision-making process. Then, we critically analyze those features with respect to commonly known pathologies, namely microaneurysms, hemorrhages and exudates, and other ocular components. We also critically analyze different CNNs by considering what image features they pick up during learning to predict and justify their clinical relevance. The experiments are executed on publicly available fundus datasets (EyePACS and DIARETDB1) achieving an accuracy of 89 ~ 95% with AUC, sensitivity and specificity of respectively 95 ~ 98%, 74 ~ 86%, and 93 ~ 97%, for disease level grading of DR. Whilst different CNNs produce consistent classification results, the rate of disagreement in picked-up image features between models could be as high as 70%.


Subject(s)
Diabetic Retinopathy/diagnostic imaging , Neural Networks, Computer , Algorithms , Datasets as Topic , Deep Learning , Diabetic Retinopathy/physiopathology , Humans , Sensitivity and Specificity
8.
F1000Res ; 9, 2020.
Article in English | MEDLINE | ID: mdl-33123346

ABSTRACT

AlignmentViewer is a web-based tool to view and analyze multiple sequence alignments of protein families. The particular strengths of AlignmentViewer include flexible visualization at different scales as well as analysis of conservation patterns and of the distribution of proteins in sequence space. The tool is directly accessible in web browsers without the need for software installation. It can handle protein families with tens of thousands of sequences and is particularly suitable for evolutionary coupling analysis, e.g. via EVcouplings.org.


Subject(s)
Proteins , Sequence Alignment , Software , Humans , Proteins/genetics , Sequence Analysis, Protein , Web Browser
9.
Nat Commun ; 11(1): 4952, 2020 10 02.
Article in English | MEDLINE | ID: mdl-33009368

ABSTRACT

We present the Danish Disease Trajectory Browser (DTB), a tool for exploring almost 25 years of data from the Danish National Patient Register. In the dataset comprising 7.2 million patients and 122 million admissions, users can identify diagnosis pairs with statistically significant directionality and combine them into linear disease trajectories. Users can search for one or more disease codes (ICD-10 classification) and explore disease progression patterns via an array of functionalities. For example, a set of linear trajectories can be merged into a disease trajectory network displaying the entire multimorbidity spectrum of a disease in a single connected graph. Using data from the Danish Register for Causes of Death, mortality is also included. The tool is disease-agnostic across both rare and common diseases and is showcased by exploring multimorbidity in Down syndrome (ICD-10 code Q90) and hypertension (ICD-10 code I10). Finally, we show how search results can be customized and exported from the browser in a format of choice (i.e. JSON, PNG, JPEG and CSV).


Subject(s)
Disease Progression , Software , Algorithms , Denmark , Humans , Time Factors
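
The merging of linear trajectories into a single network, as described in the abstract above, can be sketched as collecting the consecutive steps of each trajectory into a set of directed edges. This is a minimal illustration under stated assumptions, not the browser's actual implementation: the function name `merge_trajectories` is hypothetical, the diagnosis labels are placeholders rather than real ICD-10 codes, and weighting an edge by the number of trajectories that contain it is an assumption.

```python
from collections import defaultdict


def merge_trajectories(trajectories):
    """Merge linear trajectories into a directed network.

    Returns a dict mapping (source, target) to the number of input
    trajectories that contain that consecutive step.
    """
    edges = defaultdict(int)
    for trajectory in trajectories:
        # Pair each diagnosis with the one that follows it.
        for source, target in zip(trajectory, trajectory[1:]):
            edges[(source, target)] += 1
    return dict(edges)


# Placeholder diagnosis labels, not real codes.
trajectories = [
    ["A", "B", "C"],
    ["A", "B", "D"],
    ["E", "B", "C"],
]
network = merge_trajectories(trajectories)
```

Shared steps such as A to B collapse into one edge, which is what lets several linear paths appear as one connected graph.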
10.
JAMA Dermatol ; 156(7): 780-786, 2020 07 01.
Article in English | MEDLINE | ID: mdl-32432647

ABSTRACT

Importance: Hidradenitis suppurativa (HS) is a chronic skin disease characterized by recurrent inflamed nodular lesions and is associated with multiple comorbidities; previous studies have been of cross-sectional design, and the temporal association of HS with multiple comorbidities remains undetermined. Objective: To evaluate and characterize disease trajectories in patients with HS using population-wide disease registry data. Design, Setting, and Participants: This retrospective registry-based cohort study included the entire Danish population alive between January 1, 1994, and April 10, 2018 (7 191 519 unique individuals). Among these, 14 488 Danish inhabitants were diagnosed with HS or fulfilled diagnostic criteria identified through surgical procedure codes. Exposures: Citizens of Denmark with a diagnosis code of HS as defined by International Statistical Classification of Diseases and Related Health Problems, Tenth Revision (ICD-10) or as identified through surgical procedures. Main Outcomes and Measures: Disease trajectories experienced more frequently by patients with HS than by the overall Danish population. Strength of associations between disease co-occurrences was evaluated using relative risk (RR). All significant disease pairs were tested for directionality using a binomial test, and pairs with directionality were merged into disease trajectories of 3 consecutive diseases. Numerous disease trajectories were combined into a disease progression network showing the most frequent disease paths over time for patients with HS. Results: A total of 11 929 individuals were identified by ICD-10 diagnosis codes (8392 [70.3%] female; mean [SD] age, 37.72 [13.01] years), and 2791 were identified by procedural codes (1686 [60.4%] female; mean [SD] age, 37.38 [15.83]). The set of most common temporal disease trajectories included 25 diagnoses and had a characteristic appearance in which genitourinary, respiratory, or mental and behavioral disorders preceded the diagnosis of HS and chronic obstructive pulmonary disease (604 cases [4.2%]; RR, 1.57; 95% CI, 1.55-1.59; P < .001), pneumonia (827 [5.7%]; RR, 1.18; 95% CI, 1.15-1.20; P < .001), and acute myocardial infarction (293 [2.0%]; RR, 1.37; 95% CI, 1.35-1.39; P < .001) developed after the diagnosis. Conclusions and Relevance: The findings suggest that patients with newly diagnosed HS may have a high frequency of manifest type 1 diabetes and subsequent high risk of acute myocardial infarction, pneumonia, and chronic obstructive pulmonary disease.


Subject(s)
Diabetes Mellitus, Type 1/epidemiology , Hidradenitis Suppurativa/epidemiology , Myocardial Infarction/epidemiology , Pneumonia/epidemiology , Pulmonary Disease, Chronic Obstructive/epidemiology , Adult , Comorbidity , Denmark/epidemiology , Female , Female Urogenital Diseases/epidemiology , Humans , Male , Male Urogenital Diseases/epidemiology , Mental Disorders/epidemiology , Middle Aged , Registries , Retrospective Studies , Young Adult
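
The two statistics named in the abstract above, relative risk for co-occurrence and a binomial test for directionality, can be sketched with the standard library alone. The abstract gives no implementation details, so several choices here are assumptions: the function names are hypothetical, the null hypothesis is taken to be that either diagnosis order is equally likely (p = 0.5), and the test is computed as a one-sided exact tail probability. The study's actual cohort sampling and significance handling are not reproduced.

```python
from math import comb


def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk of the outcome among the exposed divided by risk among the unexposed."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)


def direction_p_value(a_then_b, b_then_a):
    """One-sided exact binomial p-value for 'A precedes B'.

    Among patients with both diagnoses, a_then_b had A first and
    b_then_a had B first. Under the assumed null (p = 0.5), this is
    the probability of seeing at least a_then_b 'A first' orderings.
    """
    n = a_then_b + b_then_a
    tail = sum(comb(n, i) for i in range(a_then_b, n + 1))
    return tail / 2 ** n
```

For instance, with invented counts of 8 patients having A first and 2 having B first, `direction_p_value(8, 2)` sums the three upper-tail terms (45 + 10 + 1) over 1024 outcomes.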