1.
AMIA Jt Summits Transl Sci Proc ; 2020: 561-568, 2020.
Article in English | MEDLINE | ID: mdl-32477678

ABSTRACT

Chemical entity recognition is essential for indexing scientific literature in the MEDLINE database at the National Library of Medicine. However, the tool currently used to suggest terms for indexing, the Medical Text Indexer, was not originally conceived as a chemical recognition tool. It has instead been adapted to the task via its use of MetaMap and the addition of in-house patterns and rules. In order to develop a tool more suitable for chemical recognition, we have created a collection of 200 MEDLINE titles and abstracts annotated with genes, proteins, inorganic and organic chemicals, as well as other biological molecules. We use this collection to evaluate eleven chemical entity recognition systems, where we seek to identify a tool that effectively recognizes chemical entities for indexing and also performs well on chemical recognition beyond the indexing task. We observe the highest performance with a SciBERT ensemble.

2.
Life Sci Alliance ; 1(6): e201800146, 2018 Dec.
Article in English | MEDLINE | ID: mdl-30519677

ABSTRACT

The study of carnivorous plants can afford insight into their unique evolutionary adaptations and their interactions with prokaryotic and eukaryotic species. For Sarracenia (pitcher plants), we identified 64 quantitative trait loci (QTL) for insect-capture traits of the pitchers, providing the genetic basis for differences between the pitfall and lobster-trap strategies of insect capture. The linkage map developed here is based upon the F2 of a cross between Sarracenia rosea and Sarracenia psittacina; we mapped 437 single nucleotide polymorphism and simple sequence repeat markers. We measured pitcher traits which differ between S. rosea and S. psittacina, mapping 64 QTL for 17 pitcher traits; there are hot-spot locations where multiple QTL map near each other. There are epistatic interactions in many cases where there are multiple loci for a trait. The QTL map uncovered the genetic basis for the differences between pitfall- and lobster-traps, and the changes that occurred during the divergence of these species. The longevity and clonability of Sarracenia plants make the F2 mapping population a resource for mapping more traits and for phenotype-to-genotype studies.

3.
Sci Data ; 5: 180001, 2018 01 30.
Article in English | MEDLINE | ID: mdl-29381145

ABSTRACT

Adverse drug reactions (ADRs), unintended and sometimes dangerous effects that a drug may have, are one of the leading causes of morbidity and mortality during medical care. To date, there is no structured machine-readable authoritative source of known ADRs. The United States Food and Drug Administration (FDA) partnered with the National Library of Medicine to create a pilot dataset containing standardised information about known adverse reactions for 200 FDA-approved drugs. The Structured Product Labels (SPLs), the documents FDA uses to exchange information about drugs and other products, were manually annotated for adverse reactions at the mention level to facilitate development and evaluation of text mining tools for extraction of ADRs from all SPLs. The ADRs were then normalised to the Unified Medical Language System (UMLS) and to the Medical Dictionary for Regulatory Activities (MedDRA). We present the curation process and the structure of the publicly available database SPL-ADR-200db containing 5,098 distinct ADRs. The database is available at https://bionlp.nlm.nih.gov/tac2017adversereactions/; the code for preparing and validating the data is available at https://github.com/lhncbc/fda-ars.


Subjects
Drug Labeling; Drug-Related Side Effects and Adverse Reactions; Databases, Factual; United States; United States Food and Drug Administration
4.
AMIA Annu Symp Proc ; 2018: 368-376, 2018.
Article in English | MEDLINE | ID: mdl-30815076

ABSTRACT

Medication doses, one of the determining factors in medication safety and effectiveness, are present in the literature, but only in free-text form. We set out to determine if the systems developed for extracting drug prescription information from clinical text would yield comparable results on scientific literature and if sequence-to-sequence learning with neural networks could improve over the current state-of-the-art. We developed a collection of 694 PubMed Central documents annotated with drug dose information using the i2b2 schema. We found that less than half of the drug doses are present in the MEDLINE/PubMed abstracts, and full-text is needed to identify the other half. We identified the differences in the scope and formatting of drug dose information in the literature and clinical text, which require developing new dose extraction approaches. Finally, we achieved 83.9% recall, 87.2% precision and 85.5% F1 score in extracting complete drug prescription information from the literature.


Subjects
Deep Learning; Information Storage and Retrieval/methods; Neural Networks, Computer; Pharmaceutical Preparations/administration & dosage; PubMed; Drug Administration Routes; Drug Administration Schedule; Drug Dosage Calculations; Humans
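The three scores reported in the abstract above are related: F1 is the harmonic mean of precision and recall. The following is a minimal sketch of that relationship, not the authors' evaluation code; the function name `f1_score` is chosen here for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract, converted from percentages to fractions.
reported_precision = 0.872
reported_recall = 0.839

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.3f}")  # prints F1 = 0.855
```

The result agrees with the 85.5% F1 score stated in the abstract.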
5.
J Am Med Inform Assoc ; 24(4): 841-844, 2017 Jul 01.
Article in English | MEDLINE | ID: mdl-28130331

ABSTRACT

MetaMap is a widely used named entity recognition tool that identifies concepts from the Unified Medical Language System Metathesaurus in text. This study presents MetaMap Lite, an implementation of some of the basic MetaMap functions in Java. On several collections of biomedical literature and clinical text, MetaMap Lite demonstrated real-time speed, along with precision, recall, and F1 scores comparable to or exceeding those of MetaMap and two other popular biomedical text processing tools, the clinical Text Analysis and Knowledge Extraction System (cTAKES) and DNorm.


Subjects
Information Storage and Retrieval/methods; Natural Language Processing; Software; Unified Medical Language System; Algorithms
6.
PLoS One ; 10(8): e0134855, 2015.
Article in English | MEDLINE | ID: mdl-26241739

ABSTRACT

We describe restriction site associated RNA sequencing (RARseq), an RNAseq-based genotype by sequencing (GBS) method. It includes the construction of RNAseq libraries from double-stranded cDNA digested with selected restriction enzymes. To test this, we constructed six single-digested and six dual-digested RARseq libraries from six F2 pitcher plant individuals and sequenced them on half of a MiSeq run. On average, the de novo approach of population genome analysis detected 544 and 570 RNA SNPs, whereas the reference transcriptome-based approach revealed an average of 1907 and 1876 RNA SNPs per individual, from single- and dual-digested RARseq data, respectively. The average numbers of RNA SNPs and alleles per locus are 1.89 and 2.17, respectively. Our results suggest that the RARseq protocol allows good depth of coverage per locus for detecting RNA SNPs and polymorphic loci for population genomics and mapping analyses. In non-model systems where complete genome sequences are not always available, RARseq data can be analyzed in reference to the transcriptome. In addition to enriching for functional markers, this method may prove particularly useful in organisms where the genomes are not favorable for DNA GBS.


Subjects
Genetic Markers; Genotyping Techniques; Metagenomics/methods; Sequence Analysis, RNA/methods; Transcriptome; DNA, Complementary/genetics; DNA, Plant/genetics; Gene Library; Genetic Variation; Haplotypes/genetics; Hybridization, Genetic; Molecular Sequence Data; Polymorphism, Single Nucleotide; RNA, Messenger/genetics; RNA, Plant/genetics; Restriction Mapping; Sarraceniaceae/genetics
7.
Am J Bot ; 102(6): 910-20, 2015 Jun.
Article in English | MEDLINE | ID: mdl-26101417

ABSTRACT

PREMISE OF THE STUDY: The sunflower genus Helianthus has long been recognized as economically significant, containing species of agricultural and horticultural importance. Additionally, this genus displays a large range of phenotypic and genetic variation, making Helianthus a useful system for studying evolutionary and ecological processes. Here we present the most robust Helianthus phylogeny to date, laying the foundation for future studies of this genus. METHODS: We used a target enrichment approach across 37 diploid Helianthus species/subspecies with a total of 103 accessions. This technique garnered 170 genes used for both coalescent and concatenation analyses. The resulting phylogeny was additionally used to examine the evolution of life history and growth form across the genus. KEY RESULTS: Coalescent and concatenation approaches were largely congruent, resolving a large annual clade and two large perennial clades. However, several relationships deeper within the phylogeny were more weakly supported and incongruent among analyses including the placement of H. agrestis, H. cusickii, H. gracilentus, H. mollis, and H. occidentalis. CONCLUSIONS: The current phylogeny supports three major clades including a large annual clade, a southeastern perennial clade, and another clade of primarily large-statured perennials. Relationships among taxa are more consistent with early phylogenies of the genus using morphological and crossing data than recent efforts using single genes, which highlight the difficulties of phylogenetic estimation in genera known for reticulate evolution. Additionally, conflict and low support at the base of the perennial clades may suggest a rapid radiation and/or ancient introgression within the genus.


Subjects
Diploidy; Helianthus/classification; Helianthus/genetics; Phylogeny; Chloroplasts/genetics; Expressed Sequence Tags; Genes, Plant; Likelihood Functions; Species Specificity
8.
Mol Phylogenet Evol ; 85: 76-87, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25689607

ABSTRACT

The North American carnivorous pitcher plant genus Sarracenia (Sarraceniaceae) is a relatively young clade (<3 million years old) displaying a wide range of morphological diversity in complex trapping structures. This recently radiated group is a promising system to examine the structural evolution and diversification of carnivorous plants; however, little is known regarding evolutionary relationships within the genus. Previous attempts at resolving the phylogeny have been unsuccessful, most likely due to few parsimony-informative sites compounded by incomplete lineage sorting. Here, we applied a target enrichment approach using multiple accessions to assess the relationships of Sarracenia species. This resulted in 199 nuclear genes from 75 accessions covering the putative 8-11 species and 8 subspecies/varieties. In addition, we recovered 42 kb of plastome sequence from each accession to estimate a cpDNA-derived phylogeny. Unsurprisingly, the cpDNA had few parsimony-informative sites (0.5%) and provided little information on species relationships. In contrast, use of the targeted nuclear loci in concatenation and coalescent frameworks elucidated many relationships within Sarracenia even with high heterogeneity among gene trees. Results were largely consistent for both concatenation and coalescent approaches. The only major disagreement was with the placement of the purpurea complex. Moreover, results suggest an Appalachian massif biogeographic origin of the genus. Overall, this study highlights the utility of target enrichment using multiple accessions to resolve relationships in recently radiated taxa.


Subjects
Biological Evolution; Phylogeny; Sarraceniaceae/classification; Cell Nucleus/genetics; DNA, Chloroplast/genetics; DNA, Plant/genetics; Genes, Plant; Likelihood Functions; Models, Genetic; Sequence Analysis, DNA
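The abstract above reports the share of parsimony-informative sites in the cpDNA. Under the standard definition, an alignment column is parsimony-informative when at least two different character states each occur in at least two sequences. The sketch below illustrates that count on a toy alignment; the function name and the sequences are made up for illustration, and this is not the pipeline used in the study.

```python
from collections import Counter
from typing import List


def parsimony_informative_fraction(alignment: List[str]) -> float:
    """Fraction of columns in which at least two states each appear in >= 2 sequences."""
    if not alignment or len(alignment[0]) == 0:
        raise ValueError("alignment must contain at least one non-empty sequence")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    informative = 0
    for column in zip(*alignment):
        # Count only unambiguous nucleotides; gaps and ambiguity codes are ignored.
        counts = Counter(base.upper() for base in column if base.upper() in "ACGT")
        states_seen_twice = sum(1 for n in counts.values() if n >= 2)
        if states_seen_twice >= 2:
            informative += 1
    return informative / length


# Toy alignment (hypothetical data): columns 2 and 4 are informative,
# column 5 is not because the state C occurs only once.
toy_alignment = ["ACGTA", "ACGTA", "ATGAA", "ATGAC"]
print(parsimony_informative_fraction(toy_alignment))  # prints 0.4
```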
9.
DNA Res ; 18(4): 253-61, 2011 Aug.
Article in English | MEDLINE | ID: mdl-21676972

ABSTRACT

Sarracenia species (pitcher plants) are carnivorous plants which obtain a portion of their nutrients from insects captured in the pitchers. To investigate these plants, we sequenced the transcriptome of two species, Sarracenia psittacina and Sarracenia purpurea, using Roche 454 pyrosequencing technology. We obtained 46 275 and 36 681 contigs by de novo assembly methods for S. psittacina and S. purpurea, respectively, and further identified 16 163 orthologous contigs between them. Estimation of synonymous substitution rates between orthologous and paralogous contigs indicates that genome duplication and speciation within the Sarracenia genus both occurred ∼2 million years ago. The ratios of non-synonymous to synonymous substitution rates indicated that 491 contigs have been under positive selection (Ka/Ks > 1). Significant proportions of these contigs were involved in functions related to binding activity. We also found that the greatest sequence similarity for both of these species was to Vitis vinifera, which is most consistent with a non-current classification of the order Ericales as an asterid. This study has provided new insights into pitcher plants and will contribute greatly to future research on this genus and its distinctive ecological adaptations.


Subjects
Gene Expression Profiling; Sarraceniaceae/genetics; Computational Biology; Ecosystem; Gene Duplication; Genome, Plant; Molecular Sequence Annotation
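The positive-selection criterion in the abstract above compares the non-synonymous substitution rate (Ka) with the synonymous rate (Ks) and flags contigs whose ratio exceeds 1. The sketch below shows only that final filtering step, assuming Ka and Ks have already been estimated by some other tool; the contig IDs and rate values are invented for illustration and do not come from the study.

```python
from typing import Dict, List, Tuple


def positively_selected(rates: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return IDs of contigs with Ka/Ks > 1, given {contig_id: (ka, ks)}."""
    selected = []
    for contig_id, (ka, ks) in rates.items():
        if ks <= 0:
            # The ratio is undefined without synonymous substitutions; skip the contig.
            continue
        if ka / ks > 1:
            selected.append(contig_id)
    return selected


# Hypothetical rate estimates (made-up numbers).
example_rates = {
    "contig_0001": (0.020, 0.050),  # ratio 0.4: purifying selection
    "contig_0002": (0.090, 0.060),  # ratio 1.5: candidate for positive selection
    "contig_0003": (0.010, 0.000),  # Ks is zero: ratio undefined, skipped
}
print(positively_selected(example_rates))  # prints ['contig_0002']
```

Skipping contigs with Ks equal to zero is a design choice made for this sketch; a real analysis might handle them differently.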
10.
Conserv Genet Resour ; 2(1): 75-79, 2010 Dec 01.
Article in English | MEDLINE | ID: mdl-21170168

ABSTRACT

Sarracenia species (pitcher plants) are carnivorous plants which obtain a portion of their nutrients from insects captured in the pitchers. Sarracenia species naturally hybridize with each other, and hybrid swarms have been identified. A number of the taxa within the genus are considered endangered. In order to facilitate evolutionary, ecological and conservation genetic analyses within the genus, we developed 25 microsatellite loci which show variability either within species or between species. Three S. purpurea populations were examined with 10 primer sets which showed within population variability.

11.
AMIA Annu Symp Proc ; : 960, 2006.
Article in English | MEDLINE | ID: mdl-17238579

ABSTRACT

A JDI (Journal Descriptor Indexing) tool has been developed at NLM that automatically categorizes biomedical text given as input, returning a ranked list, with scores between 0 and 1, of either JDs (Journal Descriptors, corresponding to biomedical disciplines) or STs (UMLS Semantic Types). Possible applications include WSD (Word Sense Disambiguation) and retrieval according to discipline. The Lexical Systems Group plans to distribute an open source Java version of this tool.


Subjects
Abstracting and Indexing/methods; Natural Language Processing; Medical Subject Headings; Periodicals as Topic; Semantics; Unified Medical Language System
12.
J Am Soc Inf Sci Technol ; 57(1): 96-113, 2006 Jan 01.
Article in English | MEDLINE | ID: mdl-19890434

ABSTRACT

An experiment was performed at the National Library of Medicine® (NLM®) in word sense disambiguation (WSD) using the Journal Descriptor Indexing (JDI) methodology. The motivation is the need to solve the ambiguity problem confronting NLM's MetaMap system, which maps free text to terms corresponding to concepts in NLM's Unified Medical Language System® (UMLS®) Metathesaurus®. If the text maps to more than one Metathesaurus concept at the same high confidence score, MetaMap has no way of knowing which concept is the correct mapping. We describe the JDI methodology, which is ultimately based on statistical associations between words in a training set of MEDLINE® citations and a small set of journal descriptors (assigned by humans to journals per se) assumed to be inherited by the citations. JDI is the basis for selecting the best meaning that is correlated to UMLS semantic types (STs) assigned to ambiguous concepts in the Metathesaurus. For example, the ambiguity transport has two meanings: "Biological Transport" assigned the ST Cell Function and "Patient transport" assigned the ST Health Care Activity. A JDI-based methodology can analyze text containing transport and determine which ST receives a higher score for that text, which then returns the associated meaning, presumed to apply to the ambiguity itself. We then present an experiment in which a baseline disambiguation method was compared to four versions of JDI in disambiguating 45 ambiguous strings from NLM's WSD Test Collection. Overall average precision for the highest-scoring JDI version was 0.7873 compared to 0.2492 for the baseline method, and average precision for individual ambiguities was greater than 0.90 for 23 of them (51%), greater than 0.85 for 24 (53%), and greater than 0.65 for 35 (79%). On the basis of these results, we hope to improve performance of JDI and test its use in applications.
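The selection step described in the abstract above (score each candidate semantic type for the text, then return the meaning attached to the highest-scoring one) can be sketched as follows. This is only an illustration of that final step: the function name `pick_sense` and the score values are hypothetical, and the sketch does not reproduce how JDI actually computes ST scores.

```python
from typing import Dict


def pick_sense(st_scores: Dict[str, float], sense_by_st: Dict[str, str]) -> str:
    """Return the meaning whose semantic type (ST) scored highest for the text."""
    # Keep only the STs that are candidate senses of the ambiguous term.
    candidates = {st: st_scores[st] for st in sense_by_st if st in st_scores}
    if not candidates:
        raise ValueError("no candidate semantic type was scored")
    best_st = max(candidates, key=candidates.get)
    return sense_by_st[best_st]


# The two senses of "transport" and their STs, as given in the abstract.
transport_senses = {
    "Cell Function": "Biological Transport",
    "Health Care Activity": "Patient transport",
}

# Hypothetical ST scores for a text about ambulance transfers (made-up numbers).
hypothetical_scores = {"Cell Function": 0.21, "Health Care Activity": 0.64}

print(pick_sense(hypothetical_scores, transport_senses))  # prints Patient transport
```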

13.
Stud Health Technol Inform ; 107(Pt 1): 268-72, 2004.
Article in English | MEDLINE | ID: mdl-15360816

ABSTRACT

The Medical Text Indexer (MTI) is a program for producing MeSH indexing recommendations. It is the major product of NLM's Indexing Initiative and has been used in both semi-automated and fully automated indexing environments at the Library since mid-2002. We report here on an experiment conducted with MEDLINE indexers to evaluate MTI's performance and to generate ideas for its improvement as a tool for user-assisted indexing. We also discuss some filtering techniques developed to improve MTI's accuracy, primarily for automatically producing the indexing for several collections of abstracts.


Subjects
Abstracting and Indexing/methods; Medical Subject Headings; Natural Language Processing; MEDLINE; National Library of Medicine (U.S.); Unified Medical Language System; United States