1.
Res Sq ; 2024 Jul 02.
Article in English | MEDLINE | ID: mdl-39011112

ABSTRACT

Critical evaluation of computational tools for predicting variant effects is important considering their increased use in disease diagnosis and driving molecular discoveries. In the sixth edition of the Critical Assessment of Genome Interpretation (CAGI) challenge, a dataset of 28 STK11 rare variants (27 missense, 1 single amino acid deletion), identified in primary non-small cell lung cancer biopsies, was experimentally assayed to characterize computational methods from four participating teams and five publicly available tools. Predictors demonstrated a high level of performance on key evaluation metrics, measuring correlation with the assay outputs and separating loss-of-function (LoF) variants from wildtype-like (WT-like) variants. The best participant model, 3Cnet, performed competitively with well-known tools. Unique to this challenge was that the functional data were generated with both biological and technical replicates, thus allowing the assessors to realistically establish maximum predictive performance based on experimental variability. Three out of the five publicly available tools and 3Cnet approached the performance of the assay replicates in separating LoF variants from WT-like variants. Surprisingly, REVEL, an often-used model, achieved a correlation with the real-valued assay output comparable to that seen for the experimental replicates. Performing variant interpretation by combining the new functional evidence with computational and population data evidence led to 16 new variants receiving a clinically actionable classification of likely pathogenic (LP) or likely benign (LB). Overall, the STK11 challenge highlights the utility of variant effect predictors in biomedical sciences and provides encouraging results for driving research in the field of computational genome interpretation.
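The two kinds of evaluation metric named in the abstract above (correlation with a real-valued assay output, and separation of LoF from WT-like variants) can be illustrated with a minimal sketch. The abstract does not say which correlation coefficient or separation statistic the assessors used; Spearman rank correlation and AUROC are assumed here as common choices, and all function names and numbers below are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def auroc(scores, is_lof):
    """Probability that a random LoF variant outscores a random
    WT-like variant, counting ties as one half."""
    pos = [s for s, flag in zip(scores, is_lof) if flag]
    neg = [s for s, flag in zip(scores, is_lof) if not flag]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: predictor scores, assay-derived damage scores,
# and LoF labels for five variants (not taken from the study).
predicted = [0.91, 0.78, 0.35, 0.20, 0.66]
assay_damage = [0.95, 0.70, 0.30, 0.10, 0.40]
lof = [True, True, False, False, False]

print("Spearman:", spearman(predicted, assay_damage))
print("AUROC:", auroc(predicted, lof))
```

The same two functions can be applied to pairs of assay replicates, which is the spirit of the replicate-based performance ceiling described in the abstract.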

2.
bioRxiv ; 2024 Jun 08.
Article in English | MEDLINE | ID: mdl-38895200

ABSTRACT

Regular, systematic, and independent assessment of computational tools used to predict the pathogenicity of missense variants is necessary to evaluate their clinical and research utility and suggest directions for future improvement. Here, as part of the sixth edition of the Critical Assessment of Genome Interpretation (CAGI) challenge, we assess missense variant effect predictors (or variant impact predictors) on an evaluation dataset of rare missense variants from disease-relevant databases. Our assessment evaluates predictors submitted to the CAGI6 Annotate-All-Missense challenge, predictors commonly used by the clinical genetics community, and recently developed deep learning methods for variant effect prediction. To explore a variety of settings that are relevant for different clinical and research applications, we assess performance within different subsets of the evaluation data and within high-specificity and high-sensitivity regimes. We find strong performance of many predictors across multiple settings. Meta-predictors tend to outperform their constituent individual predictors; however, several individual predictors have performance similar to that of commonly used meta-predictors. The relative performance of predictors differs in high-specificity and high-sensitivity regimes, suggesting that different methods may be best suited to different use cases. We also characterize two potential sources of bias. Predictors that incorporate allele frequency as a predictive feature tend to have reduced performance when distinguishing pathogenic variants from very rare benign variants, and predictors supervised on pathogenicity labels from curated variant databases often learn label imbalances within genes. Overall, we find notable advances over the oldest and most cited missense variant effect predictors and continued improvements among the most recently developed tools, and the CAGI Annotate-All-Missense challenge (also termed the Missense Marathon) will continue to assess state-of-the-art methods as the field progresses. Together, our results help illuminate the current clinical and research utility of missense variant effect predictors and identify potential areas for future development.
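To make the idea of a high-specificity regime concrete, the sketch below computes the best sensitivity a predictor can reach while keeping specificity at or above a chosen floor. This is only one plausible way to summarize such a regime and is an assumption; the abstract does not state the exact statistic used. The function name and the toy scores are hypothetical.

```python
def sensitivity_at_specificity(scores, is_pathogenic, min_specificity):
    """Highest sensitivity over all score thresholds whose specificity
    is at least min_specificity. A variant is called pathogenic when
    its score is >= the threshold."""
    pos = [s for s, flag in zip(scores, is_pathogenic) if flag]
    neg = [s for s, flag in zip(scores, is_pathogenic) if not flag]
    best = 0.0  # a threshold above every score gives sensitivity 0
    for threshold in sorted(set(scores)):
        specificity = sum(s < threshold for s in neg) / len(neg)
        sensitivity = sum(s >= threshold for s in pos) / len(pos)
        if specificity >= min_specificity:
            best = max(best, sensitivity)
    return best


# Hypothetical scores: four pathogenic variants, then four benign ones.
scores = [0.95, 0.90, 0.70, 0.40, 0.80, 0.30, 0.20, 0.10]
labels = [True, True, True, True, False, False, False, False]

print(sensitivity_at_specificity(scores, labels, 1.00))
print(sensitivity_at_specificity(scores, labels, 0.75))
```

Swapping the roles of the two quantities (best specificity subject to a sensitivity floor) gives the corresponding high-sensitivity summary, and two predictors can rank differently under the two summaries, as the abstract reports.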

3.
Hum Genomics ; 18(1): 28, 2024 Mar 21.
Article in English | MEDLINE | ID: mdl-38509596

ABSTRACT

BACKGROUND: In the process of finding the causative variant of rare diseases, accurate assessment and prioritization of genetic variants is essential. Previous variant prioritization tools mainly depend on the in-silico prediction of the pathogenicity of variants, which results in low sensitivity and difficulty in interpreting the prioritization result. In this study, we propose an explainable algorithm for variant prioritization, named 3ASC, with higher sensitivity and ability to annotate evidence used for prioritization. 3ASC annotates each variant with the 28 criteria defined by the ACMG/AMP genome interpretation guidelines and features related to the clinical interpretation of the variants. The system can explain the result based on annotated evidence and feature contributions. RESULTS: We trained various machine learning algorithms using in-house patient data. The performance of variant ranking was assessed using the recall rate of identifying causative variants in the top-ranked variants. The best practice model was a random forest classifier that showed top 1 recall of 85.6% and top 3 recall of 94.4%. The 3ASC annotates the ACMG/AMP criteria for each genetic variant of a patient so that clinical geneticists can interpret the result as in the CAGI6 SickKids challenge. In the challenge, 3ASC identified causal genes for 10 out of 14 patient cases, with evidence of decreased gene expression for 6 cases. Among them, two genes (HDAC8 and CASK) had decreased gene expression profiles confirmed by transcriptome data. CONCLUSIONS: 3ASC can prioritize genetic variants with higher sensitivity compared to previous methods by integrating various features related to clinical interpretation, including features related to false positive risk such as quality control and disease inheritance pattern. The system allows interpretation of each variant based on the ACMG/AMP criteria and feature contribution assessed using explainable AI techniques.


Subjects
Algorithms , Rare Diseases , Humans , Rare Diseases/diagnosis , Rare Diseases/genetics , Genetic Testing , Machine Learning , Genetic Variation/genetics , Histone Deacetylases/genetics , Repressor Proteins/genetics
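The top-k recall used in the 3ASC abstract above (the fraction of patients whose causative variant appears among the k highest-ranked variants) can be sketched as follows. The function name, variant identifiers, and rankings are hypothetical and do not come from the study.

```python
def top_k_recall(rankings, causal_variants, k):
    """Fraction of patients whose causal variant is within the first k
    entries of that patient's ranked variant list."""
    if len(rankings) != len(causal_variants):
        raise ValueError("one ranking and one causal variant per patient")
    hits = sum(
        1 for ranked, causal in zip(rankings, causal_variants)
        if causal in ranked[:k]
    )
    return hits / len(causal_variants)


# Hypothetical rankings for four patients, best candidate first.
rankings = [
    ["v1", "v2", "v3", "v4"],
    ["v7", "v5", "v6", "v8"],
    ["v9", "v10", "v11", "v12"],
    ["v13", "v14", "v15", "v16"],
]
causal = ["v1", "v6", "v9", "v16"]

print("top-1 recall:", top_k_recall(rankings, causal, 1))
print("top-3 recall:", top_k_recall(rankings, causal, 3))
```

Recall can only stay the same or rise as k grows, which matches the pattern of the reported top 1 and top 3 figures.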
4.
Front Genet ; 13: 729980, 2022.
Article in English | MEDLINE | ID: mdl-35368710

ABSTRACT

Infantile cerebellar-retinal degeneration (ICRD) is an extremely rare, infantile-onset, autosomal recessive neurodegenerative disease characterized by global developmental delay (GDD), progressive cerebellar and cortical atrophy, and retinal degeneration. In 2012, a biallelic pathogenic variant in the ACO2 gene (NM_001098.3) was found to be causative of this disease. To date, approximately 44 variants displaying various clinical features have been reported. Here, we report a case of two siblings with compound heterozygous variants in the ACO2 gene. Two siblings without perinatal problems were born to healthy non-consanguineous Korean parents. They showed GDD and seizures since infancy. Their first brain magnetic resonance imaging (MRI), electroencephalography, and metabolic workup revealed no abnormal findings. As they grew, they developed symptoms including ataxia, dysmetria, poor sitting balance, and myopia. Follow-up brain MRI findings revealed atrophy of the cerebellum and optic nerve. Through exome sequencing of both siblings and their parents, we identified the following compound heterozygous variants in ACO2: c.85C>T (p.Arg29Trp) and c.2303C>A (p.Ala768Asp). These two variants were categorized as likely pathogenic based on ACMG/AMP guidelines. In conclusion, this case helps to broaden the genetic and clinical spectrum of the ACO2 variants associated with ICRD. We have also documented the long-term clinical course and serial brain MRI findings for two patients with this extremely rare disease.

5.
Bioinformatics ; 37(24): 4626-4634, 2021 Dec 11.
Article in English | MEDLINE | ID: mdl-34270679

ABSTRACT

MOTIVATION: Improvements in next-generation sequencing have enabled genome-based diagnosis for patients with genetic diseases. However, accurate interpretation of human variants requires knowledge from a number of clinical cases. In addition, manual analysis of each variant detected in a patient's genome requires enormous time and effort. To reduce the cost of diagnosis, various computational tools have been developed to predict the pathogenicity of human variants, but the shortage and bias of available clinical data can lead to overfitting of algorithms. RESULTS: We developed a pathogenicity predictor, 3Cnet, that uses recurrent neural networks to analyze the amino acid context of human variants. As 3Cnet is trained on simulated variants reflecting evolutionary conservation and clinical data, it can find disease-causing variants in patient genomes with 2.2 times greater sensitivity than currently available tools, more effectively discovering pathogenic variants and thereby improving diagnosis rates. AVAILABILITY AND IMPLEMENTATION: Codes (https://github.com/KyoungYeulLee/3Cnet/) and data (https://zenodo.org/record/4716879#.YIO-xqkzZH1) are freely available to non-commercial users. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms , Software , Humans , Virulence , Neural Networks, Computer , Genome, Human
6.
Genes (Basel) ; 10(11), 2019 Nov 07.
Article in English | MEDLINE | ID: mdl-31703452

ABSTRACT

In the in-silico prediction of molecular binding for human genomes, deep neural multi-task learning has shown promising results owing to its strength in training tasks with imbalanced data and its ability to avoid over-fitting. Although the interrelation between tasks is known to be important for successful multi-task learning, its adverse effect has been underestimated. In this study, we used molecular interaction data of human targets from ChEMBL to train and test various multi-task and single-task networks and examined the effectiveness of multi-task learning for different compositions of targets. Targets were clustered based on sequence similarity in their binding domains and various target sets from clusters were chosen. By comparing the performance of deep neural architectures for each target set, we found that similarity within a target set is highly important for reliable multi-task learning. For a diverse target set or for human targets overall, multi-task learning performed worse than single-task learning, but it outperformed single-task learning for target sets containing similar targets. From this insight, we developed Multiple Partial Multi-Task learning, which is suitable for binding prediction for human drug targets.


Subjects
Deep Learning , Drug Discovery/methods , Small Molecule Libraries/pharmacology , Databases, Chemical , Humans , Molecular Docking Simulation/methods , Protein Binding , Small Molecule Libraries/chemistry
7.
Am J Hum Genet ; 104(3): 439-453, 2019 Mar 07.
Article in English | MEDLINE | ID: mdl-30773278

ABSTRACT

SPONASTRIME dysplasia is a rare, recessive skeletal dysplasia characterized by short stature, facial dysmorphism, and aberrant radiographic findings of the spine and long bone metaphysis. No causative genetic alterations for SPONASTRIME dysplasia have yet been determined. Using whole-exome sequencing (WES), we identified bi-allelic TONSL mutations in 10 of 13 individuals with SPONASTRIME dysplasia. TONSL is a multi-domain scaffold protein that interacts with DNA replication and repair factors and which plays critical roles in resistance to replication stress and the maintenance of genome integrity. We show here that cellular defects in dermal fibroblasts from affected individuals are complemented by the expression of wild-type TONSL. In addition, in vitro cell-based assays and in silico analyses of TONSL structure support the pathogenicity of those TONSL variants. Intriguingly, a knock-in (KI) Tonsl mouse model leads to embryonic lethality, implying the physiological importance of TONSL. Overall, these findings indicate that genetic variants resulting in reduced function of TONSL cause SPONASTRIME dysplasia and highlight the importance of TONSL in embryonic development and postnatal growth.


Subjects
Fibroblasts/pathology , Genes, Lethal , Mutation , NF-kappa B/genetics , Osteochondrodysplasias/pathology , Adolescent , Adult , Animals , Cells, Cultured , Child , Child, Preschool , DNA Damage , Dermis/metabolism , Dermis/pathology , Female , Fibroblasts/metabolism , Humans , Infant , Infant, Newborn , Mice , Mice, Inbred C57BL , Osteochondrodysplasias/genetics , Exome Sequencing/methods , Young Adult
8.
BMC Bioinformatics ; 18(Suppl 16): 567, 2017 Dec 28.
Article in English | MEDLINE | ID: mdl-29297315

ABSTRACT

BACKGROUND: The identification of target molecules is important for understanding the mechanism of "target deconvolution" in phenotypic screening and "polypharmacology" of drugs. Because conventional methods of identifying targets are time-consuming and costly, in-silico target identification has been considered an alternative solution. One of the well-known in-silico methods of identifying targets involves structure-activity relationships (SARs). SARs have advantages such as low computational cost and high feasibility; however, the data dependency in the SAR approach causes imbalance of active data and ambiguity of inactive data throughout targets. RESULTS: We developed a ligand-based virtual screening model comprising 1121 target SAR models built using a random forest algorithm. The performance of each target model was tested by employing the ROC curve and the mean score using an internal five-fold cross validation. Moreover, recall rates for top-k targets were calculated to assess the performance of target ranking. A benchmark model using an optimized sampling method and parameters was examined via an external validation set. The result shows recall rates of 67.6% and 73.9% for top-11 (1% of the total targets) and top-33, respectively. We provide a publicly available website where users can search the top-k targets for query ligands at http://rfqsar.kaist.ac.kr. CONCLUSIONS: The target models that we built can be used for both predicting the activity of ligands toward each target and ranking candidate targets for a query ligand using a unified scoring scheme. The scores are additionally fitted to the probability so that users can estimate how likely a ligand-target interaction is active. The user interface of our website is user friendly and intuitive, offering useful information and cross references.


Subjects
Algorithms , Drug Delivery Systems , Models, Theoretical , Quantitative Structure-Activity Relationship , Computer Simulation , Ligands , Probability , ROC Curve , Reproducibility of Results
9.
Expert Opin Drug Discov ; 11(7): 707-15, 2016 Jul.
Article in English | MEDLINE | ID: mdl-27186904

ABSTRACT

INTRODUCTION: In contrast to traditional molecular docking, inverse or reverse docking is used for identifying receptors for a given ligand among a large number of receptors. Reverse docking can be used to discover new targets for existing drugs and natural compounds, explain polypharmacology and the molecular mechanism of a substance, find alternative indications of drugs through drug repositioning, and detect adverse drug reactions and drug toxicity. AREAS COVERED: In this review, the authors examine how reverse docking methods have evolved over the past fifteen years and how they have been used for target identification and related applications for drug discovery. They discuss various aspects of target databases, reverse docking tools and servers. EXPERT OPINION: There are several issues related to reverse docking methods such as target structure dataset construction, computational efficiency, how to include receptor flexibility, and most importantly, how to properly normalize the docking scores. In order for reverse docking to become a truly useful tool for drug discovery, these issues need to be adequately resolved.


Subjects
Drug Design , Drug Discovery/methods , Molecular Docking Simulation/methods , Databases, Factual , Drug Repositioning , Drug-Related Side Effects and Adverse Reactions/diagnosis , Humans , Ligands , Molecular Targeted Therapy
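The score-normalization issue raised in the reverse docking abstract above can be illustrated with a small sketch. Raw docking scores are not directly comparable across receptors, because some binding sites score well for almost any ligand. One simple remedy, assumed here purely for illustration and not presented as the review's recommendation, is to convert each raw score to a z-score against a reference distribution of scores for the same target. All names and numbers are hypothetical; lower scores are taken to mean stronger predicted binding.

```python
from statistics import mean, stdev


def rank_targets_by_zscore(query_scores, reference_scores):
    """Rank targets for one ligand by per-target z-score.

    query_scores: {target: raw docking score of the query ligand}
    reference_scores: {target: raw scores of reference ligands
                       docked to the same target}
    Returns (target, z) pairs, most favorable (lowest z) first.
    """
    z_scores = {}
    for target, score in query_scores.items():
        ref = reference_scores[target]
        z_scores[target] = (score - mean(ref)) / stdev(ref)
    return sorted(z_scores.items(), key=lambda item: item[1])


# Hypothetical scores: target B gives low scores to every ligand, so
# the query ligand's raw -10.0 on B is unremarkable there.
query = {"A": -9.0, "B": -10.0}
reference = {
    "A": [-6.0, -7.0, -8.0],
    "B": [-9.0, -10.0, -11.0],
}

print("raw best:", min(query, key=query.get))
print("normalized ranking:", rank_targets_by_zscore(query, reference))
```

In this toy case the raw scores favor target B, while the normalized ranking favors target A, which shows why the choice of normalization can change which targets a reverse docking screen proposes.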