1.
Methods Inf Med ; 60(5-06): 147-161, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34719010

ABSTRACT

BACKGROUND: Patient safety event reports provide valuable insight into systemic safety issues but deriving insights from these reports requires computational tools to efficiently parse through large volumes of qualitative data. Natural language processing (NLP) combined with predictive learning provides an automated approach to evaluating these data and supporting the work of patient safety analysts. OBJECTIVES: The objective of this study was to use NLP and machine learning techniques to develop a generalizable, scalable, and reliable approach to classifying event reports for the purpose of driving improvements in the safety and quality of patient care. METHODS: Datasets for 14 different labels (themes) were vectorized using a bag-of-words, tf-idf, or document embeddings approach and then applied to a series of classification algorithms via a hyperparameter grid search to derive an optimized model. Reports were also analyzed for terms strongly associated with each theme using an adjusted F-score calculation. RESULTS: F1 score for each optimized model ranged from 0.951 ("Fall") to 0.544 ("Environment"). The bag-of-words approach proved optimal for 12 of 14 labels, and the naïve Bayes algorithm performed best for nine labels. Linear support vector machine was demonstrated as optimal for three labels and XGBoost for four of the 14 labels. Labels with more distinctly associated terms performed better than less distinct themes, as shown by a Pearson's correlation coefficient of 0.634. CONCLUSIONS: We were able to demonstrate an analytical pipeline that broadly applies NLP and predictive modeling to categorize patient safety reports from multiple facilities. This pipeline allows analysts to more rapidly identify and structure information contained in patient safety data, which can enhance the evaluation and the use of this information over time.
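
The classification step described in METHODS can be illustrated with a minimal sketch: a bag-of-words representation fed to a multinomial naïve Bayes classifier, scored with F1. This is not the authors' pipeline (no grid search, tf-idf, or embeddings), and the example reports, label names, and helper names are all invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Crude bag-of-words tokenizer: lowercase alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes over word counts with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        # Words never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for t in tokens:  # add log likelihood of each token
                score += math.log((counts[t] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def f1_score(y_true, y_pred, positive):
    """F1 for one label: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented toy reports; real event reports are longer and noisier.
TRAIN_TEXTS = [
    "patient fell from bed while reaching for call light",
    "patient slipped on wet floor and fell in bathroom",
    "found patient on floor after unwitnessed fall",
    "wrong dose of medication administered to patient",
    "pharmacy dispensed incorrect medication strength",
    "medication given late due to order entry delay",
]
TRAIN_LABELS = ["fall", "fall", "fall", "other", "other", "other"]

model = NaiveBayes().fit(TRAIN_TEXTS, TRAIN_LABELS)
print(model.predict("patient fell on the floor near bed"))
```

In a fuller version, the per-label F1 would be computed on held-out reports for each candidate vectorizer and classifier, and the best-scoring combination kept for that label.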


Subjects
Natural Language Processing, Patient Safety, Algorithms, Bayes Theorem, Humans, Machine Learning
2.
Mol Syst Biol ; 7: 519, 2011 Aug 02.
Article in English | MEDLINE | ID: mdl-21811230

ABSTRACT

Natural and synthetic biological networks must function reliably in the face of fluctuating stoichiometry of their molecular components. These fluctuations are caused in part by changes in relative expression efficiency and the DNA template amount of the network-coding genes. Gene product levels could potentially be decoupled from these changes via built-in adaptation mechanisms, thereby boosting network reliability. Here, we show that a mechanism based on an incoherent feedforward motif enables adaptive gene expression in mammalian cells. We modeled, synthesized, and tested transcriptional and post-transcriptional incoherent loops and found that in all cases the gene product adapts to changes in DNA template abundance. We also observed that the post-transcriptional form results in superior adaptation behavior, higher absolute expression levels, and lower intrinsic fluctuations. Our results support a previously hypothesized endogenous role in gene dosage compensation for such motifs and suggest that their incorporation in synthetic networks will improve their robustness and reliability.
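
The adaptation idea can be illustrated with a toy steady-state model. The abstract gives no equations, so everything here is an assumption: both the output gene and its repressor are produced in proportion to the DNA template copy number, and the repressor inhibits the output through a simple hyperbolic term. The function and parameter names (`k_out`, `k_rep`, `K`) are hypothetical.

```python
def unregulated_output(copies, k_out=1.0):
    """Output with no feedforward repression: scales linearly with copy number."""
    return k_out * copies


def iffl_output(copies, k_out=1.0, k_rep=1.0, K=1.0):
    """Toy incoherent feedforward loop at steady state (assumed model).

    The template drives the output directly (activating arm) and also
    drives a repressor of the output (repressing arm).
    """
    repressor = k_rep * copies
    return k_out * copies / (1.0 + repressor / K)


def fold_change(model, low, high):
    """Ratio of output at a high copy number to output at a low copy number."""
    return model(high) / model(low)


if __name__ == "__main__":
    for copies in (1, 10, 100, 1000):
        print(copies, unregulated_output(copies), round(iffl_output(copies), 3))
```

Under these assumptions the regulated output approaches the ceiling `k_out * K / k_rep` as copy number grows, so a tenfold change in template abundance produces only a small change in output, whereas the unregulated output changes tenfold.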


Subjects
Gene Dosage, Gene Regulatory Networks, Genetic Templates, Transcription Factors/genetics, Genetic Databases, Gene Expression Regulation, HEK293 Cells, Humans, MicroRNAs/genetics, MicroRNAs/metabolism, Biological Models, Plasmids, RNA Interference, Synthetic Biology, Transfection/methods
3.
Bioinformatics ; 27(3): 408-15, 2011 Feb 01.
Article in English | MEDLINE | ID: mdl-21138947

ABSTRACT

MOTIVATION: A major goal of biomedical research in personalized medicine is to find relationships between mutations and their corresponding disease phenotypes. However, most of the disease-related mutational data are currently buried in the biomedical literature in textual form and lack the necessary structure to allow easy retrieval and visualization. We introduce a high-throughput computational method for the identification of relevant disease mutations in PubMed abstracts applied to prostate (PCa) and breast cancer (BCa) mutations. RESULTS: We developed the extractor of mutations (EMU) tool to identify mutations and their associated genes. We benchmarked EMU against MutationFinder, a tool that extracts point mutations from text. Our results show that both methods achieve comparable performance on two manually curated datasets. We also benchmarked EMU's performance for extracting the complete mutational information and phenotype. Remarkably, we show that one of the steps in our approach, a filter based on sequence analysis, increases the precision for that task from 0.34 to 0.59 (PCa) and from 0.39 to 0.61 (BCa). We also show that this high-throughput approach can be extended to other diseases. DISCUSSION: Our method improves the current status of disease-mutation databases by significantly increasing the number of annotated mutations. We found 51 and 128 mutations manually verified to be related to PCa and BCa, respectively, that are not currently annotated for these cancer types in the OMIM or Swiss-Prot databases. EMU's retrieval performance represents a 2-fold improvement in the number of annotated mutations for PCa and BCa. We further show that our method can benefit from full-text analysis once there is an increase in Open Access availability of full-text articles. AVAILABILITY: Freely available at: http://bioinf.umbc.edu/EMU/ftp.
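
A rough sketch of how pattern-based mutation extraction and a sequence-based filter might fit together is shown below. It does not reproduce the patterns used by EMU or MutationFinder. The regular expressions, the assumption that the filter checks the wild-type residue against a reference protein sequence, and the toy sequence are all illustrative.

```python
import re

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
AA = "".join(sorted(THREE_TO_ONE.values()))
NAMES = "|".join(THREE_TO_ONE)

# Substitutions written as A4T or Ala4Thr (wild type, position, mutant).
ONE_LETTER_RE = re.compile(rf"\b([{AA}])(\d+)([{AA}])\b")
THREE_LETTER_RE = re.compile(rf"\b({NAMES})(\d+)({NAMES})\b")


def extract_mutations(text):
    """Return unique (wild_type, position, mutant) tuples in order of appearance."""
    hits = []
    for m in ONE_LETTER_RE.finditer(text):
        hits.append((m.start(), (m.group(1), int(m.group(2)), m.group(3))))
    for m in THREE_LETTER_RE.finditer(text):
        mutation = (THREE_TO_ONE[m.group(1)], int(m.group(2)), THREE_TO_ONE[m.group(3)])
        hits.append((m.start(), mutation))
    seen, ordered = set(), []
    for _, mutation in sorted(hits):
        if mutation not in seen:
            seen.add(mutation)
            ordered.append(mutation)
    return ordered


def passes_sequence_filter(mutation, protein_seq):
    """Assumed filter: keep a candidate only if the reference sequence
    has the stated wild-type residue at the stated (1-based) position."""
    wild_type, position, _ = mutation
    return 1 <= position <= len(protein_seq) and protein_seq[position - 1] == wild_type


if __name__ == "__main__":
    toy_sequence = "MKTAYIAKQR"  # made-up reference protein
    text = "The A4T substitution (also written Ala4Thr) was studied in T47D cells."
    candidates = extract_mutations(text)
    kept = [m for m in candidates if passes_sequence_filter(m, toy_sequence)]
    print(candidates)
    print(kept)
```

The cell-line name T47D matches the one-letter pattern even though it is not a mutation, which is the kind of false positive a sequence check can discard, consistent with the precision gain the abstract reports for its filtering step.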


Subjects
Algorithms, Computational Biology/methods, Factual Databases, Information Storage and Retrieval/methods, Point Mutation/genetics, Publications, Humans, Neoplasms/genetics, Reproducibility of Results, Software
4.
Bioinformatics ; 26(19): 2458-9, 2010 Oct 01.
Article in English | MEDLINE | ID: mdl-20685956

ABSTRACT

Domain mapping of disease mutations (DMDM) is a database in which each disease mutation can be displayed by its gene, protein or domain location. DMDM provides a unique domain-level view where all human coding mutations are mapped on the protein domain. To build DMDM, all human proteins were aligned to a database of conserved protein domains using a Hidden Markov Model-based sequence alignment tool (HMMer). The resulting protein-domain alignments were used to provide a domain location for all available human disease mutations and polymorphisms. The number of disease mutations and polymorphisms at each domain position is displayed alongside other relevant functional information (e.g. the binding and catalytic activity of the site and the conservation of that domain location). DMDM's protein domain view highlights molecular relationships among mutations from different diseases that might not be clearly observed with traditional gene-centric visualization tools. AVAILABILITY: Freely available at http://bioinf.umbc.edu/dmdm.
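
The mapping from protein positions to domain positions can be sketched as follows. This is an illustration only: it does not parse HMMER output, and the gapped-string alignment representation, the function names, and the example proteins are hypothetical.

```python
from collections import Counter


def protein_to_domain_map(aligned_protein, aligned_domain,
                          protein_start=1, domain_start=1):
    """Map 1-based protein positions to domain positions.

    Both arguments are equal-length gapped strings ('-' marks a gap).
    Only columns where both rows hold a residue produce a mapping.
    """
    if len(aligned_protein) != len(aligned_domain):
        raise ValueError("alignment rows must have equal length")
    mapping = {}
    p, d = protein_start, domain_start
    for a, b in zip(aligned_protein, aligned_domain):
        if a != "-" and b != "-":
            mapping[p] = d
        if a != "-":
            p += 1  # protein residue consumed
        if b != "-":
            d += 1  # domain column consumed
    return mapping


def count_by_domain_position(mutations, maps):
    """Count mutations per domain position across proteins.

    mutations: iterable of (protein_id, protein_position)
    maps: protein_id -> mapping from protein_to_domain_map
    Mutations outside the aligned region are skipped.
    """
    counts = Counter()
    for protein_id, position in mutations:
        domain_pos = maps.get(protein_id, {}).get(position)
        if domain_pos is not None:
            counts[domain_pos] += 1
    return counts


if __name__ == "__main__":
    maps = {
        "protA": protein_to_domain_map("MK-TAY", "MKLT-Y", protein_start=5),
        "protB": protein_to_domain_map("MKLTY", "MKLTY"),
    }
    mutations = [("protA", 7), ("protB", 4), ("protA", 8), ("protB", 1)]
    print(count_by_domain_position(mutations, maps))
```

Pooling counts by domain position rather than by gene is what lets mutations from different proteins, and potentially different diseases, line up at the same site in a shared domain.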


Subjects
Computational Biology/methods, Protein Databases, Disease/genetics, Mutation, Protein Tertiary Structure/genetics, Proteins/genetics, Humans, Genetic Polymorphism, Sequence Alignment
...