Results 1 - 5 of 5
1.
BMC Bioinformatics ; 20(1): 429, 2019 Aug 16.
Article in English | MEDLINE | ID: mdl-31419935

ABSTRACT

BACKGROUND: Diagnosis and treatment decisions in cancer increasingly depend on a detailed analysis of the mutational status of a patient's genome. This analysis relies on previously published information regarding the association of variations with disease progression and possible interventions. Clinicians rely to a large degree on biomedical search engines to obtain such information; however, the vast majority of scientific publications focus on basic science and have no direct clinical impact. We develop the Variant-Information Search Tool (VIST), a search engine designed for the targeted search of clinically relevant publications given an oncological mutation profile. RESULTS: VIST indexes all PubMed abstracts and content from ClinicalTrials.gov. It applies advanced text mining to identify mentions of genes, variants and drugs and uses machine-learning-based scoring to judge the clinical relevance of indexed abstracts. Its functionality is available through a fast and intuitive web interface. We perform several evaluations, showing that VIST's ranking is superior to that of PubMed or a pure vector space model with regard to the clinical relevance of a document's content. CONCLUSION: Different user groups search repositories of scientific publications with different intentions. This diversity is not adequately reflected in the standard search engines, often leading to poor performance in specialized settings. We develop a search engine for the specific case of finding documents that are clinically relevant in the course of cancer treatment. We believe that the architecture of our engine, which relies heavily on machine learning algorithms, can also act as a blueprint for search engines in other, equally specific domains. VIST is freely available at https://vist.informatik.hu-berlin.de/.
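
The abstract above compares VIST's ranking against "a pure vector space model". As a rough illustration of what such a baseline might look like, the following sketch ranks documents by TF-IDF cosine similarity to a query. It is not VIST's code: the function names, the whitespace tokenizer, the smoothed IDF formula, and the three toy documents are all assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict) per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF (an assumed variant; other formulas are equally common).
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [
        {t: tf * idf[t] for t, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs):
    """Return document indices ordered by decreasing similarity to the query."""
    vectors, idf = tfidf_vectors(docs)
    # Query terms that never occur in the collection are ignored.
    q = {t: tf * idf[t] for t, tf in Counter(tokenize(query)).items() if t in idf}
    scored = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    # Sort by score descending; ties are broken by original position.
    return [i for _, i in sorted(scored, key=lambda p: (-p[0], p[1]))]


# Toy documents invented for illustration only.
docs = [
    "braf v600e melanoma vemurafenib response",
    "yeast cell cycle regulation",
    "braf signalling in zebrafish development",
]
ranking = rank("braf v600e vemurafenib", docs)
```

A model of this kind scores only term overlap, so the zebrafish document still outranks the yeast one merely because it mentions the gene; it has no notion of clinical relevance, which is the gap the abstract says VIST's learned scoring is meant to address.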


Subjects
Neoplasms/pathology , Precision Medicine , Search Engine , Algorithms , Databases as Topic , Documentation , Humans , Internet , User-Computer Interface
2.
Article in English | MEDLINE | ID: mdl-32914021

ABSTRACT

PURPOSE: Precision oncology depends on the availability of up-to-date, comprehensive, and accurate information about associations between genetic variants and therapeutic options. Recently, a number of knowledge bases (KBs) have been developed that gather such information on the basis of expert curation of the scientific literature. We performed a quantitative and qualitative comparison of Clinical Interpretations of Variants in Cancer, OncoKB, Cancer Gene Census, Database of Curated Mutations, CGI Biomarkers (the Cancer Genome Interpreter biomarker database), Tumor Alterations Relevant for Genomics-Driven Therapy, and the Precision Medicine Knowledge Base. METHODS: We downloaded each KB and restructured its content to describe variants, genes, drugs, and gene-drug associations in a common format. We normalized gene names to Entrez Gene IDs and drug names to ChEMBL and DrugBank IDs. For the analysis of clinically relevant gene-drug associations, we obtained lists of genes affected by genetic alterations and putative drug therapies for 113 patients with cancer whose cases were presented at the Molecular Tumor Board (MTB) of the Charité Comprehensive Cancer Center. RESULTS: Our analysis revealed that the KBs are largely overlapping but also that each source harbors a notable amount of unique information. Although some KBs cover more genes, others contain more data about gene-drug associations. Retrospective comparisons with findings of the Charité MTB at the gene level showed that use of multiple KBs may considerably improve retrieval results. The relative importance of a KB in terms of cancer genes was assessed in more detail by logistic regression, which revealed that all but one source had a notable impact on result quality. We confirmed these findings using a second data set obtained from an independent MTB.
CONCLUSION: To date, none of the existing publicly available KBs on gene-drug associations in precision oncology fully subsumes the others, but all of them exhibit specific strengths and weaknesses. Consideration of multiple KBs, therefore, is essential to obtain comprehensive results.
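
Once every knowledge base has been reduced to a common format, questions such as "how much do two KBs overlap?", "what is unique to each?", and "does one subsume another?" become set operations. The sketch below illustrates that idea under stated assumptions: each KB is modelled as a set of (gene ID, drug ID) pairs, the KB names and identifiers are placeholders invented for this example, and Jaccard similarity is one possible overlap measure, not necessarily the one used in the study.

```python
from itertools import combinations


def unique_entries(kbs):
    """For each KB, return the associations that appear in no other KB."""
    result = {}
    for name, entries in kbs.items():
        others = set().union(*(e for n, e in kbs.items() if n != name))
        result[name] = entries - others
    return result


def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_overlap(kbs):
    """Jaccard similarity for every unordered pair of KBs."""
    return {
        (x, y): jaccard(kbs[x], kbs[y])
        for x, y in combinations(sorted(kbs), 2)
    }


def subsumed_by(kbs):
    """List (smaller, larger) pairs where one KB is fully contained in another."""
    return [
        (x, y)
        for x in sorted(kbs)
        for y in sorted(kbs)
        if x != y and kbs[x] <= kbs[y]
    ]


# Placeholder data: KB names, gene IDs, and drug IDs are all hypothetical.
kbs = {
    "kb_a": {(1, "DRUG_X"), (2, "DRUG_Y")},
    "kb_b": {(2, "DRUG_Y"), (3, "DRUG_Z")},
    "kb_c": {(2, "DRUG_Y")},
}
unique = unique_entries(kbs)
overlap = pairwise_overlap(kbs)
contained = subsumed_by(kbs)
```

In this toy data `kb_c` is contained in both other sources, whereas `kb_a` and `kb_b` each hold one association found nowhere else. The abstract reports the second pattern for the real KBs: none fully subsumes the others.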

3.
Bioinformatics ; 33(14): i37-i48, 2017 Jul 15.
Article in English | MEDLINE | ID: mdl-28881963

ABSTRACT

MOTIVATION: Text mining has become an important tool for biomedical research. The most fundamental text-mining task is the recognition of biomedical named entities (NER), such as genes, chemicals and diseases. Current NER methods rely on pre-defined features which try to capture the specific surface properties of entity types, properties of the typical local context, background knowledge, and linguistic information. State-of-the-art tools are entity-specific, as dictionaries and empirically optimal feature sets differ between entity types, which makes their development costly. Furthermore, features are often optimized for a specific gold standard corpus, which makes extrapolation of quality measures difficult. RESULTS: We show that a completely generic method based on deep learning and statistical word embeddings [called long short-term memory network-conditional random field (LSTM-CRF)] outperforms state-of-the-art entity-specific NER tools, often by a large margin. To this end, we compared the performance of LSTM-CRF on 33 data sets covering five different entity classes with that of best-of-class NER tools and an entity-agnostic CRF implementation. On average, the F1-score of LSTM-CRF is 5% above that of the baselines, mostly due to a sharp increase in recall. AVAILABILITY AND IMPLEMENTATION: The source code for LSTM-CRF is available at https://github.com/glample/tagger and the links to the corpora are available at https://corposaurus.github.io/corpora/. CONTACT: habibima@informatik.hu-berlin.de.
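
The F1-score and recall figures quoted above are entity-level measures. The following sketch shows how such scores are commonly computed, assuming exact-match evaluation in which a prediction counts only if its start offset, end offset, and entity type all agree with a gold annotation. The abstract does not state the matching criterion, so that choice, like the example spans, is an assumption.

```python
def precision_recall_f1(gold, predicted):
    """Entity-level scores for two sets of (start, end, entity_type) tuples."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical character-offset annotations for a single document.
gold = {(0, 4, "Gene"), (10, 19, "Chemical"), (25, 31, "Disease"), (40, 44, "Gene")}
predicted = {(0, 4, "Gene"), (10, 19, "Chemical"), (50, 55, "Gene")}

precision, recall, f1 = precision_recall_f1(gold, predicted)
```

Because F1 is a harmonic mean, a tagger that finds more of the gold entities raises recall and, as long as precision does not drop much, F1 with it. That is consistent with the abstract attributing the gain mostly to recall.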


Subjects
Data Mining/methods , Machine Learning , Animals , Humans , Mice , Software
4.
J Cheminform ; 8: 59, 2016.
Article in English | MEDLINE | ID: mdl-27843493

ABSTRACT

Recently, methods for Chemical Named Entity Recognition (NER) have gained substantial interest, driven by the need to automatically analyze today's ever-growing collections of biomedical text. Chemical NER for patents is particularly essential due to the high economic importance of pharmaceutical findings. However, NER on patents has long been neglected by the research community, mostly because of the lack of sufficient annotated corpora. A recent international competition specifically targeted this task, but evaluated tools only on gold standard patent abstracts instead of full patents; furthermore, results from such competitions are often difficult to extrapolate to real-life settings due to the relatively high homogeneity of training and test data. Here, we evaluate the two state-of-the-art chemical NER tools, tmChem and ChemSpot, on four different annotated patent corpora, two of which consist of full texts. We study the overall performance of the tools, compare their results at the instance level, report on high-recall and high-precision ensembles, and perform cross-corpus and intra-corpus evaluations. Our findings indicate that full patents are considerably harder to analyze than patent abstracts and clearly confirm the common wisdom that using the same text genre (patent vs. scientific) and text type (abstract vs. full text) for training and testing is a prerequisite for achieving high-quality text mining results.
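
The abstract mentions high-recall and high-precision ensembles without saying how they are built. One common construction, assumed here purely for illustration, takes the union of two taggers' predictions to favour recall and their intersection to favour precision. The spans and the names `tool_a` and `tool_b` are hypothetical and do not represent actual output of tmChem or ChemSpot.

```python
def high_recall_ensemble(a, b):
    """Keep every span found by either tool (set union)."""
    return a | b


def high_precision_ensemble(a, b):
    """Keep only spans on which both tools agree (set intersection)."""
    return a & b


def precision_recall(gold, predicted):
    """Exact-match precision and recall over sets of (start, end) spans."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical chemical mentions as (start, end) character offsets.
gold = {(0, 7), (12, 20), (30, 38)}
tool_a = {(0, 7), (12, 20), (50, 55)}
tool_b = {(0, 7), (30, 38), (60, 66)}

union_scores = precision_recall(gold, high_recall_ensemble(tool_a, tool_b))
intersection_scores = precision_recall(gold, high_precision_ensemble(tool_a, tool_b))
```

In this toy case the union recovers every gold span at the cost of two false positives, while the intersection is fully precise but finds only one of the three mentions.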

5.
Nat Commun ; 6: 7773, 2015 Jul 16.
Article in English | MEDLINE | ID: mdl-26178622

ABSTRACT

GTPases act as key regulators of many cellular processes by switching between active (GTP-bound) and inactive (GDP-bound) states. In many cases, understanding their mode of action has been aided by artificially stabilizing one of these states, either by designing mutant proteins or by complexation with non-hydrolysable GTP analogues. Because of inherent disadvantages in these approaches, we have developed acryl-bearing GTP and GDP derivatives that can be covalently linked with strategically placed cysteines within the GTPase of interest. Binding studies with GTPase-interacting proteins and X-ray crystallography analysis demonstrate that the molecular properties of the covalent GTPase-acryl-nucleotide adducts are a faithful reflection of those of the corresponding native states and that the adducts are advantageously permanently locked in a defined nucleotide (that is, active or inactive) state. In a first application, in vivo experiments using covalently locked Rab5 variants provide new insights into the mechanism of correct intracellular localization of Rab proteins.


Subjects
Escherichia coli Proteins/metabolism , Fungal Proteins/metabolism , Guanosine Diphosphate/metabolism , Guanosine Triphosphate/metabolism , rab GTP-Binding Proteins/metabolism , Binding Sites , X-Ray Crystallography , Escherichia coli , Escherichia coli Proteins/chemistry , Fungal Proteins/chemistry , GTP Phosphohydrolases/chemistry , GTP Phosphohydrolases/metabolism , Guanosine Diphosphate/chemistry , Guanosine Triphosphate/chemistry , Protein Binding , rab GTP-Binding Proteins/chemistry