1.
J Biomed Semantics ; 9(1): 13, 2018 04 12.
Article in English | MEDLINE | ID: mdl-29650041

ABSTRACT

BACKGROUND: Automatic identification of term variants or acceptable alternative free-text terms for gene and protein names from the millions of biomedical publications is a challenging task. Ontologies, such as the Cardiovascular Disease Ontology (CVDO), capture domain knowledge in a computational form and can provide context for gene/protein names as written in the literature. This study investigates: 1) if word embeddings from Deep Learning algorithms can provide a list of term variants for a given gene/protein of interest; and 2) if biological knowledge from the CVDO can improve such a list without modifying the word embeddings created. METHODS: We have manually annotated 105 gene/protein names from 25 PubMed titles/abstracts and mapped them to 79 unique UniProtKB entries corresponding to gene and protein classes from the CVDO. Using more than 14 M PubMed articles (titles and available abstracts), word embeddings were generated with CBOW and Skip-gram. We set up two experiments for a synonym detection task, each with four raters, and 3672 pairs of terms (target term and candidate term) from the word embeddings created. For Experiment I, the target terms for 64 UniProtKB entries were those that appear in the titles/abstracts; Experiment II involves 63 UniProtKB entries and the target terms are a combination of terms from PubMed titles/abstracts with terms (i.e. increased context) from the CVDO protein class expressions and labels. RESULTS: In Experiment I, Skip-gram finds term variants (full and/or partial) for 89% of the 64 UniProtKB entries, while CBOW finds term variants for 67%. In Experiment II (with the aid of the CVDO), Skip-gram finds term variants for 95% of the 63 UniProtKB entries, while CBOW finds term variants for 78%. Combining the results of both experiments, Skip-gram finds term variants for 97% of the 79 UniProtKB entries, while CBOW finds term variants for 81%.
CONCLUSIONS: This study shows performance improvements for both CBOW and Skip-gram on a gene/protein synonym detection task by adding knowledge formalised in the CVDO and without modifying the word embeddings created. Hence, the CVDO supplies context that is effective in inducing term variability for both CBOW and Skip-gram while reducing ambiguity. Skip-gram outperforms CBOW and finds more pertinent term variants for gene/protein names annotated from the scientific literature.
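
The synonym detection task above pairs a target term with candidate terms drawn from the word embeddings. A minimal sketch of how such candidates could be ranked follows. It assumes cosine similarity as the measure, which is a common choice for word embeddings but is not stated in the abstract, and it uses invented three-dimensional toy vectors rather than anything from the study. The names `cosine_similarity`, `rank_candidates` and `toy_embeddings` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_candidates(target, embeddings, top_n=3):
    """Return up to top_n (term, score) pairs closest to `target`, best first."""
    target_vec = embeddings[target]
    scored = [
        (term, cosine_similarity(target_vec, vec))
        for term, vec in embeddings.items()
        if term != target
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Toy vectors invented for illustration only; real embeddings trained on
# PubMed text would have hundreds of dimensions.
toy_embeddings = {
    "il-6": [0.9, 0.1, 0.0],
    "interleukin-6": [0.8, 0.2, 0.1],
    "tnf-alpha": [0.1, 0.9, 0.2],
    "aspirin": [0.0, 0.1, 0.9],
}

ranked = rank_candidates("il-6", toy_embeddings)
for term, score in ranked:
    print(f"{term}\t{score:.3f}")
```

In a setup like the one described, human raters would then judge whether each highly ranked candidate is a genuine term variant of the target.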


Subjects
Biological Ontologies , Cardiovascular Diseases , Deep Learning , Cardiovascular Diseases/genetics , Cardiovascular Diseases/metabolism , Humans , Molecular Sequence Annotation , PubMed , ROC Curve
2.
Stud Health Technol Inform ; 235: 516-520, 2017.
Article in English | MEDLINE | ID: mdl-28423846

ABSTRACT

We investigate the application of distributional semantics models for facilitating unsupervised extraction of biomedical terms from unannotated corpora. Term extraction is used as the first step of an ontology learning process that aims at (semi-)automatic annotation of biomedical concepts and relations from more than 300K PubMed titles and abstracts. We experimented with both traditional distributional semantics methods such as Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA) as well as the neural language models CBOW and Skip-gram from Deep Learning. The evaluation conducted concentrates on sepsis, a major life-threatening condition, and shows that Deep Learning models outperform LSA and LDA with much higher precision.
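
The two neural models named above differ in what they predict: Skip-gram predicts each context word from the centre word, whereas CBOW predicts the centre word from its surrounding context. The sketch below only builds the training examples each model would see for one toy sentence. It is not the pipeline used in the study, and the function names and the window size are assumptions made for illustration.

```python
def context_window(tokens, index, window):
    """Tokens within `window` positions of tokens[index], excluding it."""
    start = max(0, index - window)
    left = tokens[start:index]
    right = tokens[index + 1:index + 1 + window]
    return left + right


def skipgram_pairs(tokens, window=2):
    """(centre, context) pairs: one training pair per context word."""
    return [
        (centre, ctx)
        for i, centre in enumerate(tokens)
        for ctx in context_window(tokens, i, window)
    ]


def cbow_examples(tokens, window=2):
    """(context_words, centre) examples: one training example per position."""
    return [
        (context_window(tokens, i, window), centre)
        for i, centre in enumerate(tokens)
    ]


# Toy sentence, whitespace-tokenised for simplicity.
tokens = "sepsis is a life-threatening condition".split()

pairs = skipgram_pairs(tokens, window=1)
examples = cbow_examples(tokens, window=1)

print(pairs[:3])
print(examples[2])
```

Skip-gram yields one pair per (centre, context) combination, so each occurrence of a rare term contributes several training signals, while CBOW pools the context into a single example per position.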


Subjects
Machine Learning , PubMed , Semantics , Sepsis , Humans , Information Storage and Retrieval , Natural Language Processing
3.
J Biomed Semantics ; 7(1): 30, 2016 06 01.
Article in English | MEDLINE | ID: mdl-27246819

ABSTRACT

BACKGROUND: The motivation for the BioHub project is to create an Integrated Knowledge Management System (IKMS) that will enable chemists to source ingredients from bio-renewables, rather than from non-sustainable sources such as fossil oil and its derivatives. METHOD: The BioHubKB is the data repository of the IKMS; it employs Semantic Web technologies, especially OWL, to host data about chemical transformations, bio-renewable feedstocks, co-product streams and their chemical components. Access to this knowledge base is provided to other modules within the IKMS through a set of RESTful web services, driven by SPARQL queries to a Sesame back-end. The BioHubKB re-uses several bio-ontologies and bespoke extensions, primarily for chemical feedstocks and products, to form its knowledge organisation schema. RESULTS: Parts of plants form feedstocks, while various processes generate co-product streams that contain certain chemicals. Both chemicals and transformations are associated with certain qualities, which the BioHubKB also attempts to capture. Of immediate commercial and industrial importance is to estimate the cost of particular sets of chemical transformations (leading to candidate surfactants) performed in sequence, and these costs too are captured. Data are sourced from companies' internal knowledge and document stores, and from the publicly available literature. Both text analytics and manual curation play their part in populating the ontology. We describe the prototype IKMS, the BioHubKB and the services that it supports for the IKMS. AVAILABILITY: The BioHubKB can be found via http://biohub.cs.manchester.ac.uk/ontology/biohub-kb.owl .
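
The access pattern described above, web services driven by SPARQL queries, can be sketched as building an HTTP GET request whose `query` parameter carries the SPARQL text, as in the standard SPARQL protocol. The endpoint URL below is a placeholder, since the abstract does not give the service addresses, and the query is a generic listing of OWL classes and labels rather than one known to be used by the BioHubKB services. No network request is made.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint: the abstract does not state the real service URL.
ENDPOINT = "http://example.org/sparql"

# Generic illustrative query: list up to ten OWL classes with their labels.
QUERY = """
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?cls ?label WHERE {
  ?cls a owl:Class ;
       rdfs:label ?label .
}
LIMIT 10
"""


def build_query_url(endpoint, query):
    """Return a GET URL carrying the SPARQL text in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_query_url(ENDPOINT, QUERY)
print(url)

# Round-trip check: decoding the URL recovers the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == QUERY.strip())
```

A RESTful wrapper in this style would keep the SPARQL text on the server side and expose only simple resource URLs to the other modules.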


Subjects
Conservation of Natural Resources , Internet , Knowledge Bases , Semantics , Chemical Engineering
4.
Int J Methods Psychiatr Res ; 19(4): 233-42, 2010 Dec.
Article in English | MEDLINE | ID: mdl-20799262

ABSTRACT

The Manchester Child Attachment Story Task (MCAST) is a representational technique for assessing attachment patterns of young school-age children. We have developed a computerised version (the CMCAST) in which story stems are represented on the computer by the movement of simple screen 'dolls'. This paper reports on a preliminary validation study of the CMCAST method against the MCAST. Fifty-five children completed the MCAST and CMCAST six weeks apart in random order. It proved possible to rate the CMCAST if a simplified form of the MCAST coding system was used. Inter-rater reliability was achieved for both versions (kappa = 0.93 for MCAST and kappa = 0.91 for CMCAST). Agreement between the MCAST and CMCAST ratings of attachment security was kappa = 0.67. Costs for the MCAST and CMCAST were comparable. A school-based feasibility study of 86 children suggested that the CMCAST was acceptable and could be administered with up to five children simultaneously. This preliminary study suggests that the CMCAST can reliably reproduce a simplified form of MCAST coding. The computer format may be well adapted to some uses such as screening for large-scale epidemiological research.
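
The agreement figures above are reported as kappa values. As a rough illustration, the sketch below computes Cohen's kappa for two raters, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. It assumes the two-rater Cohen form, which the abstract does not specify, and the ratings are invented toy data, not values from the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items.

    Undefined (division by zero) when chance agreement equals 1, i.e. when
    both raters assign one identical label to every item.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Invented toy ratings for four children (illustration only).
rater_a = ["secure", "secure", "insecure", "insecure"]
rater_b = ["secure", "secure", "insecure", "secure"]

kappa = cohens_kappa(rater_a, rater_b)
print(kappa)
```

Here the raters agree on three of four items (p_o = 0.75) and chance agreement is p_e = 0.5, so kappa corrects the raw 75% agreement down to 0.5.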


Subjects
Object Attachment , Personality Assessment/standards , Personality Development , Social Adjustment , Temperament , Adolescent , Child , Female , Humans , Male , Play and Playthings/psychology , Projective Techniques/standards , Projective Techniques/statistics & numerical data , Psychometrics , Reproducibility of Results
5.
Int J Methods Psychiatr Res ; 15(4): 207-14, 2006.
Article in English | MEDLINE | ID: mdl-17266016

ABSTRACT

Story stem measures allow the assessment of children's representations of relationship functioning, but are expensive and time-consuming to administer. We developed a computerized story stem measure which does not require specific training for administrators and which allows the child to produce their own animated, narrated story completion. This paper describes, firstly, the reliability of the Computerized MacArthur Story Stem Battery (CMSSB) and, secondly, a preliminary comparison of children in foster care and school controls on narrative coherence, intentionality and avoidance. The CMSSB showed good inter-rater reliability. A group of children in foster care showed significantly poorer coherence of narrative, less intentionality and greater avoidance on the CMSSB compared to a school comparison group.


Subjects
Child Behavior/psychology , Computer-Aided Design , Narration , Personality Development , Social Perception , Child , Child, Preschool , Comprehension/physiology , Female , Humans , Male , Pilot Projects , Reproducibility of Results , Verbal Behavior
...