1.
Bioinformatics ; 34(1): 80-87, 2018 01 01.
Article in English | MEDLINE | ID: mdl-28968638

ABSTRACT

Motivation: Despite significant efforts in expert curation, the clinical relevance of most of the 154 million dbSNP reference variants (RS) remains unknown. However, a wealth of knowledge about variant biological function and disease impact is buried in unstructured literature data. Previous studies have attempted to harvest and unlock such information with text-mining techniques, but they are of limited use because their mutation extraction results are not standardized or integrated with curated data.

Results: We propose an automatic method to extract and normalize variant mentions to unique identifiers (dbSNP RSIDs). In benchmarking, our method achieves a high F-measure of ∼90% and compares favorably with the state of the art. Next, we applied our approach to the entire PubMed and validated the results by verifying that each extracted variant-gene pair matched the dbSNP annotation based on mapped genomic position, and by analyzing variants curated in ClinVar. We then determined which text-mined variants and genes constituted novel discoveries. Our analysis reveals 41 889 RS numbers (associated with 9151 genes) not found in ClinVar. Moreover, we obtained a rich set worth further review: 12 462 rare variants (MAF ≤ 0.01) in 3849 genes, which are presumed to be deleterious and are not frequently found in the general population. To our knowledge, this is the first large-scale study to analyze and integrate text-mined variant data with curated knowledge in existing databases. Our results suggest that databases can be significantly enriched by text mining and that the combined information can greatly assist human efforts in evaluating and prioritizing variants in genomic research.

Availability and implementation: The tmVar 2.0 source code and corpus are freely available at https://www.ncbi.nlm.nih.gov/research/bionlp/Tools/tmvar/.

Contact: zhiyong.lu@nih.gov.


Subject(s)
Data Mining/methods , Mutation , Polymorphism, Genetic , Precision Medicine/methods , Software , Data Curation , Databases, Factual , Genetic Predisposition to Disease , Genomics/methods , Humans , Phenotype , PubMed , Publications
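
The abstract above describes normalizing variant mentions to dbSNP RSIDs and then selecting rare variants (MAF ≤ 0.01). The following is a minimal sketch of the simplest slice of that idea only: it finds explicit "rs" identifiers with a regular expression and applies the MAF cutoff. It is not the tmVar 2.0 method, which also handles variant mentions that are not written as RS numbers. The function names, the placeholder RSIDs, and the frequency table are all hypothetical and exist only for illustration.

```python
import re

# Matches explicit dbSNP-style identifiers such as "rs12345" (case-insensitive).
RSID_PATTERN = re.compile(r"\brs\d+\b", re.IGNORECASE)


def extract_rsids(text):
    """Return unique RSIDs mentioned in text, lowercased, in order of first appearance."""
    seen = []
    for match in RSID_PATTERN.finditer(text):
        rsid = match.group(0).lower()
        if rsid not in seen:
            seen.append(rsid)
    return seen


def rare_variants(rsids, maf_by_rsid, threshold=0.01):
    """Keep RSIDs whose minor allele frequency is known and at or below threshold."""
    return [r for r in rsids if r in maf_by_rsid and maf_by_rsid[r] <= threshold]


# Hypothetical sentence and made-up frequencies (placeholders, not real dbSNP data).
sentence = "We genotyped rs0000001 and RS0000002; rs0000001 was associated with risk."
example_maf = {"rs0000001": 0.005, "rs0000002": 0.20}

mentions = extract_rsids(sentence)
rare = rare_variants(mentions, example_maf)
```

In a real pipeline the frequency table would come from a curated source rather than a hand-written dictionary, and identifiers with no known frequency would need an explicit policy; here they are simply dropped.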
2.
PLoS One ; 12(12): e0187771, 2017.
Article in English | MEDLINE | ID: mdl-29194460

ABSTRACT

The quantitative relationship between the magnitude of variation in minor histocompatibility antigens (mHA) and graft versus host disease (GVHD) pathophysiology in stem cell transplant (SCT) donor-recipient pairs (DRP) is not established. To elucidate this relationship, whole exome sequencing (WES) was performed on 27 HLA-matched related (MRD) and 50 unrelated (URD) donors to identify nonsynonymous single nucleotide polymorphisms (SNPs). An average of 2,463 SNPs were identified in MRD DRP and 4,287 in URD DRP (p<0.01). The resulting peptide antigens that may be presented on HLA class I molecules in each DRP were derived in silico (NetMHCpan ver2.0), and the tissue expression of the proteins from which they were derived was determined (GTEx). MRD DRP had an average of 3,670 HLA-binding alloreactive peptides, that is, putative mHA (pmHA) with an IC50 of <500 nM, and URD DRP had 5,386 (p<0.01). To simulate an alloreactive donor cytotoxic T cell response, the array of pmHA in each patient was considered as an operator matrix modifying a hypothetical cytotoxic T cell clonal vector matrix; each responding T cell clone's proliferation was determined by the logistic equation of growth, accounting for HLA binding affinity and tissue expression of each alloreactive peptide. The resulting simulated organ-specific alloreactive T cell clonal growth revealed marked variability, with T cell count differences spanning orders of magnitude between different DRP. Despite an estimated, uniform set of constants used in the model for all DRP, and a heterogeneously treated group of patients, higher total and organ-specific T cell counts were associated with cumulative incidence of moderate to severe GVHD in recipients. In conclusion, exome-wide sequence differences and the variable alloreactive peptide binding to HLA in each DRP yield a large range of possible alloreactive donor T cell responses. Our findings also help explain the apparent randomness observed in the development of alloimmune responses.


Subject(s)
Cell Transplantation , Exome Sequencing , Models, Theoretical , Peptides/immunology , Stem Cell Transplantation , T-Lymphocytes/immunology , Humans
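
The abstract above states that each T cell clone's proliferation follows the logistic equation of growth, adjusted for HLA binding affinity and tissue expression. The sketch below illustrates one way such a model could be iterated. The abstract does not give the functional form, so the mapping used here is an assumption: lower IC50 raises the growth rate, and higher tissue expression raises the carrying capacity. All constants and the two example peptides are hypothetical and are not taken from the paper.

```python
def clone_parameters(ic50_nm, expression, base_rate=0.5, base_capacity=1000.0):
    """Map a peptide's properties to logistic parameters.

    ASSUMPTION (not from the paper): stronger binding (lower IC50) increases
    the growth rate, and higher tissue expression increases the capacity.
    """
    rate = base_rate / (1.0 + ic50_nm / 500.0)
    capacity = base_capacity * expression
    return rate, capacity


def simulate_clone(n0, rate, capacity, steps):
    """Iterate the discrete logistic update N <- N + rate * N * (1 - N / capacity)."""
    n = n0
    trajectory = [n]
    for _ in range(steps):
        n = n + rate * n * (1.0 - n / capacity)
        trajectory.append(n)
    return trajectory


# Two hypothetical peptides with equal expression but different binding strength.
strong_rate, strong_cap = clone_parameters(ic50_nm=50.0, expression=1.0)
weak_rate, weak_cap = clone_parameters(ic50_nm=450.0, expression=1.0)

strong = simulate_clone(n0=1.0, rate=strong_rate, capacity=strong_cap, steps=20)
weak = simulate_clone(n0=1.0, rate=weak_rate, capacity=weak_cap, steps=20)

# Total simulated count across clones at the final step.
total_final = strong[-1] + weak[-1]
```

Under these assumptions the stronger binder expands faster from the same starting count, which is one simple way that differences in peptide binding could translate into large differences in simulated clone sizes.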