1.
Rev Socionetwork Strateg ; 18(1): 27-47, 2024.
Article in English | MEDLINE | ID: mdl-38646588

ABSTRACT

We summarize the 10th Competition on Legal Information Extraction and Entailment. In this tenth edition, the competition included four tasks on case law and statute law. The case law component includes an information retrieval task (Task 1) and the confirmation of an entailment relation between an existing case and a selected unseen case (Task 2). The statute law component includes an information retrieval task (Task 3) and an entailment/question-answering task based on retrieved civil code statutes (Task 4). Participation was open to any group based on any approach. Ten different teams participated in the case law competition tasks, most of them in more than one task. We received results from eight teams for Task 1 (22 runs) and seven teams for Task 2 (18 runs). In the statute law tasks, nine different teams participated, most in more than one task. Six teams submitted a total of 16 runs for Task 3, and nine teams submitted a total of 26 runs for Task 4. We describe the variety of approaches, our official evaluation, and analysis of our data and submission results.

2.
J Chem Inf Model ; 63(21): 6619-6628, 2023 11 13.
Article in English | MEDLINE | ID: mdl-37859303

ABSTRACT

There is a pressing need for the automated extraction of chemical reaction information because of the rapid growth of scientific documents. Previously reported work on procedure extraction either (a) did not consider the semantic relations between the action and argument or (b) defined a detailed schema for the extraction. The former methods were insufficient for reproducing the reaction, while the latter methods were too specific to their own schema and did not consider the general semantic relation between the verb and argument. In addition, they did not provide an annotated text that aligned with the structured procedure. In this work, we propose a corpus named organic synthesis procedures with argument roles (OSPAR) that is annotated with rolesets to capture the semantic relation between the verb and argument. We also provide rolesets for chemical reactions, especially for organic synthesis, which represent the argument roles of actions in the corpus. More specifically, we annotated 112 organic synthesis procedures in journal articles from Organic Syntheses and defined 19 new rolesets in addition to 29 rolesets from an existing language resource (Proposition Bank). We then constructed a simple deep learning system trained on OSPAR and discussed the usefulness of the corpus by comparing the system's output with chemical description language (XDL) generated by a natural language processing tool, namely, SynthReader. While our system's output required more detailed parsing, it covered information comparable to XDL. Moreover, we confirmed that validating the output action sequence was easy because it was aligned with the original text.


Subject(s)
Environmental Monitoring, Language, Semantics, Natural Language Processing
3.
Beilstein J Nanotechnol ; 6: 1872-82, 2015.
Article in English | MEDLINE | ID: mdl-26665057

ABSTRACT

To support nanocrystal device development, we have been working on a computational framework to utilize information in research papers on nanocrystal devices. We developed an annotated corpus called "NaDev" (Nanocrystal Device Development) for this purpose. We also proposed an automatic information extraction system called "NaDevEx" (Nanocrystal Device Automatic Information Extraction Framework). NaDevEx aims at extracting information from research papers on nanocrystal devices using the NaDev corpus and machine-learning techniques. However, the characteristics of NaDevEx were not examined in detail. In this paper, we conduct system evaluation experiments for NaDevEx using the NaDev corpus. We discuss three main issues: system performance, compared with human annotators; the effect of paper type (synthesis or characterization) on system performance; and the effects of domain knowledge features (e.g., a chemical named entity recognition system and list of names of physical quantities) on system performance. We found that overall system performance was 89% in precision and 69% in recall. If we count the identification of terms that intersect with correct terms for the same information category as correct, i.e., loose agreement (in many cases, appropriate head nouns such as temperature or pressure loosely match between two terms), the overall performance is 95% in precision and 74% in recall. The system performance is almost comparable with results of human annotators for information categories with rich domain knowledge information (source material). For other information categories, given the relatively large number of terms that exist only in one paper, recall of individual information categories is not high (39-73%), although precision is better (75-97%). The average performance for synthesis papers is better than that for characterization papers because of the lack of training examples for characterization papers. Based on these results, we discuss future research plans for improving the performance of the system.

4.
J Cheminform ; 7(Suppl 1 Text mining for chemistry and the CHEMDNER track): S2, 2015.
Article in English | MEDLINE | ID: mdl-25810773

ABSTRACT

The automatic extraction of chemical information from text requires the recognition of chemical entity mentions as one of its key steps. When developing supervised named entity recognition (NER) systems, the availability of a large, manually annotated text corpus is desirable. Furthermore, large corpora permit the robust evaluation and comparison of different approaches that detect chemicals in documents. We present the CHEMDNER corpus, a collection of 10,000 PubMed abstracts that contain a total of 84,355 chemical entity mentions labeled manually by expert chemistry literature curators, following annotation guidelines specifically defined for this task. The abstracts of the CHEMDNER corpus were selected to be representative of all major chemical disciplines. Each of the chemical entity mentions was manually labeled according to its structure-associated chemical entity mention (SACEM) class: abbreviation, family, formula, identifier, multiple, systematic and trivial. The difficulty and consistency of tagging chemicals in text were measured using an agreement study between annotators, obtaining a percentage agreement of 91%. For a subset of the CHEMDNER corpus (the test set of 3,000 abstracts) we provide not only the Gold Standard manual annotations, but also mentions automatically detected by the 26 teams that participated in the BioCreative IV CHEMDNER chemical mention recognition task. In addition, we release the CHEMDNER silver standard corpus of automatically extracted mentions from 17,000 randomly selected PubMed abstracts. A version of the CHEMDNER corpus in the BioC format has been generated as well. We propose a standard for required minimum information about entity annotations for the construction of domain specific corpora on chemical and drug entities. The CHEMDNER corpus and annotation guidelines are available at: http://www.biocreative.org/resources/biocreative-iv/chemdner-corpus/.

5.
Pest Manag Sci ; 58(11): 1132-6, 2002 Nov.
Article in English | MEDLINE | ID: mdl-12449532

ABSTRACT

Many pathogenic plant viruses are RNA viruses, which initiate production of double-stranded RNA intermediates when they replicate in host plant cells. Introduction of double-stranded RNA-specific ribonucleases such as the Schizosaccharomyces pombe derived pac I protein and animal cell derived interferon-induced 2',5'-oligoadenylate synthetase (2-5 Aase)/ribonuclease L (RNase L) system into various plants may make plants resistant to various pathogenic viruses and viroids. We have demonstrated that pac I and 2-5 Aase/RNase L transgenic tobacco plants are resistant to various viruses including tobacco mosaic virus, cucumber mosaic virus and potato virus Y. In addition, pac I transgenic potato plants are resistant to potato spindle tuber viroid. Using Agrobacterium-mediated transformation, we have established a transformation system for chrysanthemum plants and have recently developed pac I transgenic chrysanthemum (Dendranthema grandiflora cv Reagan) resistant to chrysanthemum stunt viroid and have grown them in isolated fields for an evaluation of their effects.


Subject(s)
Plant Diseases/genetics, Plant Viruses/genetics, Plants/genetics, RNA Viruses/genetics, Viroids/genetics, Chrysanthemum/genetics, Chrysanthemum/virology, Cucumovirus/genetics, Cucumovirus/growth & development, Immunity, Innate/genetics, Plant Diseases/virology, Plant Viruses/growth & development, Plants/virology, Plants, Genetically Modified, Potyvirus/genetics, Potyvirus/growth & development, RNA Viruses/growth & development, Solanum tuberosum/genetics, Solanum tuberosum/virology, Nicotiana/genetics, Nicotiana/virology, Tobacco Mosaic Virus/genetics, Tobacco Mosaic Virus/growth & development, Viral Regulatory and Accessory Proteins/genetics, Viroids/growth & development