Results 1 - 20 of 63
1.
Stud Health Technol Inform ; 318: 150-155, 2024 Sep 24.
Article in English | MEDLINE | ID: mdl-39320197

ABSTRACT

Antimicrobial resistance (AMR) poses a significant global health threat, resulting in 4.96 million deaths in 2019, with projections reaching 10 million by 2050. This resistance, primarily due to the overuse of antibiotics, complicates the treatment of infections caused by various microorganisms, including the Gram-negative bacterium Escherichia coli. Traditional culture-based methods for detecting AMR are slow and imprecise, hindering timely clinical decision-making. In contrast, whole-genome sequencing offers a faster, more accurate alternative for AMR detection. We present a machine learning study that uses whole-genome sequencing data to predict the phenotypic susceptibility of Escherichia coli to ciprofloxacin. Using a novel dataset of 256 bacterial genomes and related susceptibility data, features were generated based on AMRFinderPlus findings and k-mer frequencies. Two machine learning models, Random Forest and XGBoost, were evaluated using five-fold cross-validation. Combining AMRFinderPlus and k-mer frequency features achieved more than 90% accuracy with the XGBoost gradient boosting model. These findings suggest that the best results may be achieved using reference-free features combined with known gene markers.
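As an illustration of the k-mer frequency features mentioned in the abstract above, the following is a minimal sketch, not the authors' pipeline. The function name `kmer_frequencies`, the choice of k, and the toy sequence are assumptions made for this example; the abstract does not state the k-mer length or the normalisation that the study used.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence: str, k: int = 3) -> dict[str, float]:
    """Return the relative frequency of every possible DNA k-mer in `sequence`.

    Hypothetical helper for illustration; windows containing characters
    other than A, C, G, T (for example N) are skipped.
    """
    sequence = sequence.upper()
    counts = Counter(
        sequence[i:i + k]
        for i in range(len(sequence) - k + 1)
        if set(sequence[i:i + k]) <= set("ACGT")
    )
    total = sum(counts.values())
    # Enumerate all 4**k k-mers in a fixed order so that every genome
    # produces a feature vector with the same layout.
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in all_kmers}


# Toy sequence (made up); a real input would be an assembled genome.
features = kmer_frequencies("ACGTACGTNACGT", k=2)
```

A vector like `features` could then be concatenated with presence/absence indicators for known resistance markers before being passed to a classifier; that combination step is likewise an assumption about how such features might be joined.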


Subjects
Anti-Bacterial Agents , Escherichia coli , Machine Learning , Escherichia coli/drug effects , Escherichia coli/genetics , Anti-Bacterial Agents/pharmacology , Microbial Sensitivity Tests , Drug Resistance, Bacterial/genetics , Ciprofloxacin/pharmacology , Ciprofloxacin/therapeutic use , Whole Genome Sequencing , Humans
2.
Mutat Res Rev Mutat Res ; 794: 108509, 2024 Jul 06.
Article in English | MEDLINE | ID: mdl-38977176

ABSTRACT

Autism spectrum disorder (ASD) is a complex neurodevelopmental disorder (NDD) influenced by genetic, epigenetic, and environmental factors. Recent advancements in genomic analysis have shed light on numerous genes associated with ASD, highlighting the significant role of both common and rare genetic mutations, as well as copy number variations (CNVs), single nucleotide polymorphisms (SNPs) and unique de novo variants. These genetic variations disrupt neurodevelopmental pathways, contributing to the disorder's complexity. Notably, CNVs are present in 10%-20% of individuals with autism, with 3%-7% detectable through cytogenetic methods. While the role of submicroscopic CNVs in ASD has been recently studied, their association with genomic loci and genes has not been thoroughly explored. In this review, we focus on 47 CNV regions linked to ASD, encompassing 1632 genes, including protein-coding genes and long non-coding RNAs (lncRNAs), of which 659 show significant brain expression. Using a list of ASD-associated genes from SFARI, we detect 17 regions harboring at least one known ASD-related protein-coding gene. Of the remaining 30 regions, we identify 24 regions containing at least one protein-coding gene with brain-enriched expression and a nervous system phenotype in mouse mutants, and one lncRNA with both brain-enriched expression and upregulation in iPSC-to-neuron differentiation. This review not only expands our understanding of the genetic diversity associated with ASD but also underscores the potential of lncRNAs in contributing to its etiology. Additionally, the discovered CNVs will be a valuable resource for future diagnostic, therapeutic, and research endeavors aimed at prioritizing genetic variations in ASD.

3.
Front Bioeng Biotechnol ; 12: 1375626, 2024.
Article in English | MEDLINE | ID: mdl-39070163

ABSTRACT

DNA sequences of nearly any desired composition, length, and function can be synthesized to alter the biology of an organism for purposes ranging from the bioproduction of therapeutic compounds to invasive pest control. Yet despite offering many great benefits, engineered DNA poses a risk due to its possible misuse or abuse by malicious actors, or its unintentional introduction into the environment. Monitoring the presence of engineered DNA in biological or environmental systems is therefore crucial for routine and timely detection of emerging biological threats, and for improving public acceptance of genetic technologies. To address this, we developed Synsor, a tool for identifying engineered DNA sequences in high-throughput sequencing data. Synsor leverages the k-mer signature differences between naturally occurring and engineered DNA sequences and uses an artificial neural network to classify whether a DNA sequence is natural or engineered. By querying suspected sequences against the model, Synsor can identify sequences that are likely to have been engineered. Using natural plasmid and engineered vector sequences, we showed that Synsor identifies engineered DNA with >99% accuracy. We demonstrate how Synsor can be used to detect potential genetically engineered organisms and locate where engineered DNA is being introduced into the environment by analysing genomic and metagenomic data from yeast and wastewater samples, respectively. Synsor is therefore a powerful tool that will streamline the process of identifying engineered DNA in poorly characterized biological or environmental systems, thereby allowing for enhanced monitoring of emerging biological threats.
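The idea of classifying a sequence from its k-mer signature can be sketched as follows. This is not Synsor's code or model: a nearest-centroid rule is swapped in for the artificial neural network to keep the example short, and the value of `K`, the function names, and the toy training sequences are all assumptions invented for illustration.

```python
from collections import Counter
from itertools import product
from math import sqrt

K = 2  # assumed k-mer length for the toy example
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]


def signature(seq: str) -> list[float]:
    """Normalised k-mer count vector in a fixed k-mer order."""
    seq = seq.upper()
    counts = Counter(seq[i:i + K] for i in range(len(seq) - K + 1))
    total = sum(counts[kmer] for kmer in KMERS) or 1
    return [counts[kmer] / total for kmer in KMERS]


def centroid(seqs: list[str]) -> list[float]:
    """Mean signature of a group of sequences."""
    sigs = [signature(s) for s in seqs]
    return [sum(column) / len(sigs) for column in zip(*sigs)]


def classify(seq: str, centroids: dict[str, list[float]]) -> str:
    """Return the label whose centroid is closest in Euclidean distance."""
    sig = signature(seq)

    def distance(label: str) -> float:
        return sqrt(sum((a - b) ** 2 for a, b in zip(sig, centroids[label])))

    return min(centroids, key=distance)


# Made-up training sequences; they carry no biological meaning.
centroids = {
    "natural": centroid(["ATATATATTA", "TATAATATAT"]),
    "engineered": centroid(["GCGCGGCCGC", "CGCGCCGGCG"]),
}
label = classify("GGCCGCGCGC", centroids)
```

A neural network replaces the distance rule with a learned, non-linear decision boundary over the same kind of signature vector.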

4.
Gigascience ; 13, 2024 Jan 02.
Article in English | MEDLINE | ID: mdl-38837943

ABSTRACT

Genomic information is increasingly used to inform medical treatments and manage future disease risks. However, any personal and societal gains must be carefully balanced against the risk to individuals contributing their genomic data. Expanding our understanding of actionable genomic insights requires researchers to access large global datasets to capture the complexity of genomic contribution to diseases. Similarly, clinicians need efficient access to a patient's genome as well as population-representative historical records for evidence-based decisions. Both researchers and clinicians hence rely on participants to consent to the use of their genomic data, which in turn requires trust in the professional and ethical handling of this information. Here, we review existing and emerging solutions for secure and effective genomic information management, including storage, encryption, consent, and authorization, that are needed to build participant trust. We discuss recent innovations in cloud computing, quantum-computing-proof encryption, and self-sovereign identity. These innovations can augment key developments from within the genomics community, notably GA4GH Passports and the Crypt4GH file container standard. We also explore how decentralized storage as well as the digital consenting process can offer culturally acceptable processes to encourage data contributions from ethnic minorities. We conclude that the individual and their right to self-determination need to be put at the center of any genomics framework, because only on an individual level can the received benefits be accurately balanced against the risk of exposing private information.


Subjects
Genomics , Humans , Genomics/methods , Genomics/ethics , Computer Security , Cloud Computing , Informed Consent
5.
Stud Health Technol Inform ; 310: 770-774, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38269913

ABSTRACT

With the advancement of genomic engineering and genetic modification techniques, the uptake of computational tools to design guide RNA increased drastically. Searching for genomic targets to design guides with maximum on-target activity (efficiency) and minimum off-target activity (specificity) is now an essential part of genome editing experiments. Today, a variety of tools exist that allow the search of genomic targets and let users customize their search parameters to better suit their experiments. Here we present an overview of different ways to visualize these searched CRISPR target sites along with specific downstream information like primer design, restriction enzyme activity and mutational outcome prediction after a double-stranded break. We discuss the importance of a good visualization summary to interpret information along with different ways to represent similar information effectively.
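To make the notion of searching a genome for CRISPR target sites concrete, here is a minimal sketch that is not taken from any of the tools the overview covers. It assumes the commonly used SpCas9 convention of a 20-nucleotide protospacer immediately followed by an NGG PAM; the function names and the toy sequence are hypothetical.

```python
import re


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_targets(seq: str, protospacer_len: int = 20) -> list[tuple[str, int, str, str]]:
    """List (strand, start, protospacer, pam) for every protospacer + NGG site.

    `start` is 0-based on the scanned strand, so for "-" hits it refers to
    the reverse-complemented sequence, not to forward coordinates.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{protospacer_len}}})([ACGT]GG))")
    hits = []
    for strand, scanned in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(scanned):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits


# Toy sequence (made up): one protospacer followed by a TGG PAM.
toy = "AAAAA" + "GATTACAGATTACAGATTAC" + "TGG" + "AAAAA"
hits = find_targets(toy)
```

Each hit is a candidate only; the efficiency and specificity scores that design tools report, and the visual summaries discussed above, are computed downstream of a search step like this one.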


Subjects
CRISPR-Cas Systems , Data Visualization , RNA, Guide, CRISPR-Cas Systems , Engineering , Genomics
6.
Stud Health Technol Inform ; 310: 810-814, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38269921

ABSTRACT

Genetic data is limited and generating new datasets is often an expensive, time-consuming process, involving countless moving parts to genotype and phenotype individuals. While sharing data is beneficial for quality control and software development, privacy and security are of utmost importance. Generating synthetic data is a practical solution to mitigate the cost, time and sensitivities that hamper developers and researchers in producing and validating novel biotechnological solutions to data intensive problems. Existing methods focus on mutation frequencies at specific loci while ignoring epistatic interactions. Alternatively, programs that do consider epistasis are limited to two-way interactions or apply genomic constraints that make synthetic data generation arduous or computationally intensive. To solve this, we developed Polygenic Epistatic Phenotype Simulator (PEPS). Our tool is a probabilistic model that can generate synthetic phenotypes with a controllable level of complexity.
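The general idea of a synthetic phenotype driven by a higher-order epistatic interaction can be sketched as below. This is not the PEPS model, whose details the abstract does not give; the function `simulate`, its parameters, and the rule "risk is raised only when every interacting SNP carries a minor allele" are assumptions chosen to keep the example small.

```python
import random


def simulate(n_individuals, n_snps, interacting_snps,
             base_risk=0.1, epistatic_risk=0.8, maf=0.3, seed=0):
    """Generate toy genotypes and binary phenotypes with one n-way interaction.

    Genotypes count minor alleles (0, 1 or 2) drawn independently per SNP.
    An individual gets `epistatic_risk` only if every SNP listed in
    `interacting_snps` carries at least one minor allele, else `base_risk`.
    """
    rng = random.Random(seed)
    genotypes, phenotypes = [], []
    for _ in range(n_individuals):
        # Two independent allele draws per SNP give a 0/1/2 genotype.
        g = [(rng.random() < maf) + (rng.random() < maf) for _ in range(n_snps)]
        carries_all = all(g[i] > 0 for i in interacting_snps)
        risk = epistatic_risk if carries_all else base_risk
        genotypes.append(g)
        phenotypes.append(int(rng.random() < risk))
    return genotypes, phenotypes


# Three-way interaction among SNPs 0, 2 and 4 in a small made-up cohort.
genotypes, phenotypes = simulate(200, 5, interacting_snps=(0, 2, 4))
```

The length of `interacting_snps` sets the order of the interaction, which is one simple way to make the complexity of the simulated phenotype controllable.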


Subjects
Biotechnology , Models, Statistical , Humans , Computer Simulation , Phenotype , Genotype
7.
Stud Health Technol Inform ; 310: 820-824, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38269923

ABSTRACT

Healthcare data is a scarce resource and access is often cumbersome. While medical software development would benefit from real datasets, the privacy of the patients is held at a higher priority. Realistic synthetic healthcare data can fill this gap by providing a dataset for quality control while at the same time preserving the patient's anonymity and privacy. Existing methods focus on American or European patient healthcare data but none is exclusively focused on the Australian population. Australia is a highly diverse country that has a unique healthcare system. To overcome this problem, we used a popular publicly available tool, Synthea, to generate disease progressions based on the Australian population. With this approach, we were able to generate 100,000 patients following Queensland (Australia) demographics.


Subjects
Health Facilities , Privacy , Humans , Australia , Queensland , Disease Progression
8.
Stud Health Technol Inform ; 310: 1021-1025, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38269969

ABSTRACT

Coronary artery disease (CAD) has the highest disease burden worldwide. To manage this burden, predictive models are required to screen patients for preventative treatment. A range of variables have been explored for their capacity to predict disease, including phenotypic (age, sex, BMI and smoking status), medical imaging (carotid artery thickness) and genotypic variables. We use machine learning models and the UK Biobank cohort to measure the prediction capacity of these three variable categories, both in combination and in isolation. We demonstrate that phenotypic variables from the Framingham risk score have the best prediction capacity, although a combination of phenotypic, medical imaging and genotypic variables delivers the most specific models. Furthermore, we demonstrate that VariantSpark, a random forest-based GWAS platform, performs effective feature selection for SNP-based genotype variables, identifying 115 SNPs significantly associated with the CAD phenotype.


Subjects
Coronary Artery Disease , Humans , Coronary Artery Disease/diagnostic imaging , Coronary Artery Disease/genetics , Carotid Intima-Media Thickness , Phenotype , Genotype , Machine Learning
9.
PLoS One ; 18(10): e0292924, 2023.
Article in English | MEDLINE | ID: mdl-37847697

ABSTRACT

Genome editing through the development of CRISPR (Clustered Regularly Interspaced Short Palindromic Repeat)-Cas technology has revolutionized many fields in biology. Beyond Cas9 nucleases, Cas12a (formerly Cpf1) has emerged as a promising alternative to Cas9 for editing AT-rich genomes. Despite this promise, guide RNA efficiency prediction with computational search tools still lacks accuracy. Through a computational meta-analysis, here we report that Cas12a target and off-target cleavage behavior depends on nucleotide bias combined with nucleotide mismatches relative to the protospacer adjacent motif (PAM) site. These features helped to train a Random Forest machine learning model that improves accuracy by at least 15% over existing algorithms for predicting guide RNA efficiency for the Cas12a enzyme. Despite this progress, our report underscores the need for more representative datasets and further benchmarking to reliably and accurately predict guide RNA efficiency and off-target effects for Cas12a enzymes.
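The two feature families named in the abstract above, nucleotide bias and mismatches relative to the PAM, could be extracted along the following lines. This is a sketch under stated assumptions rather than the study's feature set: the function names and both sequences are invented, and position 1 is taken to be the base adjacent to the PAM, which for Cas12a lies 5' of the protospacer.

```python
def mismatch_positions(guide: str, target: str) -> list[int]:
    """1-based positions where guide and target differ.

    Position 1 is the PAM-proximal base, assuming both strings are written
    5' to 3' starting right after a Cas12a-style 5' PAM.
    """
    if len(guide) != len(target):
        raise ValueError("guide and target must have the same length")
    pairs = zip(guide.upper(), target.upper())
    return [i + 1 for i, (g, t) in enumerate(pairs) if g != t]


def nucleotide_fractions(guide: str) -> dict[str, float]:
    """Fraction of each base in the guide, a simple nucleotide-bias feature."""
    guide = guide.upper()
    return {base: guide.count(base) / len(guide) for base in "ACGT"}


# Made-up guide and off-target site of equal length.
guide = "ATTGCCATGACCTGAATTCA"
target = "ATTGCCTTGACCTGAATTGA"
positions = mismatch_positions(guide, target)
bias = nucleotide_fractions(guide)
```

Values such as `positions` and `bias` would then be encoded as numeric columns for a model like the Random Forest mentioned above; how the study actually encoded them is not stated in the abstract.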


Subjects
CRISPR-Cas Systems , Gene Editing , CRISPR-Cas Systems/genetics , Endonucleases/genetics , RNA , Nucleotides
10.
Sci Rep ; 13(1): 17662, 2023 Oct 17.
Article in English | MEDLINE | ID: mdl-37848535

ABSTRACT

Alzheimer's disease (AD) is a complex genetic disease, and variants identified through genome-wide association studies (GWAS) explain only part of its heritability. Epistasis has been proposed as a major contributor to this 'missing heritability'; however, many current methods are limited to modelling additive effects. We use VariantSpark, a machine learning approach to GWAS, and BitEpi, a tool for epistasis detection, to identify AD-associated variants and interactions across two independent cohorts, ADNI and UK Biobank. By incorporating significant epistatic interactions, we captured 10.41% more phenotypic variance than logistic regression (LR). We validate the well-established AD locus, APOE, and identify two novel genome-wide significant AD-associated loci in both cohorts, SH3BP4 and SASH1, which are also in significant epistatic interactions with APOE. We show that the SH3BP4 SNP has a modulating effect on the known pathogenic APOE SNP, demonstrating a possible protective mechanism against AD. SASH1 is involved in a triplet interaction with the pathogenic APOE SNP and ACOT11, where the SASH1 SNP lowered the pathogenic interaction effect between ACOT11 and APOE. Finally, we demonstrate that VariantSpark detects disease associations with 80% fewer controls than LR, unlocking discoveries in well-annotated but smaller cohorts.


Subjects
Alzheimer Disease , Humans , Alzheimer Disease/genetics , Genome-Wide Association Study , Epistasis, Genetic , Machine Learning , Polymorphism, Single Nucleotide , Apolipoproteins E/genetics , Genetic Predisposition to Disease , Adaptor Proteins, Signal Transducing/genetics