Results 1 - 9 of 9
1.
China Journal of Chinese Materia Medica ; (24): 4073-4077, 2019.
Article in Chinese | WPRIM | ID: wpr-1008259

Abstract

Taking Xiushui township of Baisha county in Hainan province as the research area, the random forest algorithm, which has clear advantages in feature selection and classification, was used to extract information on Callicarpa nudiflora plantings in the study area. First, four kinds of feature variables were generated from WorldView-3 data: spectral features, principal component features, vegetation indices, and texture features. Second, the spatial distribution of C. nudiflora in the study area was extracted with the random forest classification algorithm. Finally, the feature space of the random forest classifier was optimized based on feature importance to obtain the best classification result, and this result was compared with the result of the random forest algorithm without feature-space optimization. The results showed that: ① the overall accuracy of C. nudiflora extraction from the WorldView-3 image was 89.97%, with a Kappa coefficient of 0.84, indicating that the random forest algorithm has high classification accuracy and good applicability for C. nudiflora recognition in Hainan; ② the overall accuracy of extraction with the reduced feature set was 90.4%, with a Kappa coefficient of 0.85, indicating that the random forest algorithm can select features effectively: while mining the feature variables, it preserved the precision of C. nudiflora information extraction and improved computational efficiency. This study provides a new idea, method, and technical means, in terms of feature selection and method selection, for information extraction of cultivated medicinal plant resources.
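
As a rough illustration of the workflow this abstract describes, the sketch below trains a random forest on a synthetic stand-in for the WorldView-3 feature stack, reduces the feature space by importance, retrains, and reports overall accuracy and Kappa with scikit-learn. The data, feature count, and importance cutoff are invented assumptions, not the authors' setup.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, cohen_kappa_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 24))                 # 24 stacked per-pixel features (assumed count)
y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int)   # 1 = C. nudiflora, 0 = other land cover

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Full feature space.
rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_tr, y_tr)

# Feature-space optimization: keep the most important variables and retrain,
# mirroring the paper's importance-based dimension reduction.
top = np.argsort(rf.feature_importances_)[::-1][:10]   # cutoff of 10 is an assumption
rf_opt = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_tr[:, top], y_tr)

for name, model, X_eval in [("full", rf, X_te), ("reduced", rf_opt, X_te[:, top])]:
    pred = model.predict(X_eval)
    print(name, "overall accuracy:", accuracy_score(y_te, pred),
          "Kappa:", cohen_kappa_score(y_te, pred))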


Subjects
Algorithms, Callicarpa, Medicinal plants
2.
Healthcare Informatics Research ; : 376-380, 2018.
Article in English | WPRIM | ID: wpr-717652

Abstract

OBJECTIVES: This research presents the design and development of a software architecture that uses natural language processing tools and an ontology as its knowledge base. METHODS: The software extracts, manages, and represents the knowledge contained in natural language text. It was validated on a corpus of more than 200 medical-domain documents from the general medicine and palliative care areas, demonstrating knowledge elements relevant to physicians. RESULTS: Precision, recall, and F-measure indicators were applied. An ontology called "knowledge elements of the medical domain" was created to manipulate patient information, which can be read or accessed from any other software platform. CONCLUSIONS: The developed software architecture extracts medical knowledge from the clinical histories of patients in two different corpora. The architecture was validated using the standard metrics of information extraction systems.
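
A minimal sketch of the evaluation the abstract mentions: precision, recall, and F-measure computed by comparing extracted knowledge elements against a gold standard. The two sets are invented placeholders, not the paper's corpus.

def precision_recall_f1(extracted, gold):
    tp = len(extracted & gold)                       # true positives
    precision = tp / len(extracted) if extracted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

extracted = {"chronic pain", "morphine", "palliative care", "fever"}
gold = {"chronic pain", "morphine", "palliative care", "dyspnea"}
print(precision_recall_f1(extracted, gold))          # (0.75, 0.75, 0.75)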


Subjects
Humans, Information storage and retrieval, Knowledge bases, Knowledge management, Natural language processing, Palliative care
3.
Chinese Journal of Medical Library and Information Science ; (12): 1-5, 2017.
Article in Chinese | WPRIM | ID: wpr-511115

Abstract

The steps of text mining in the biomedical field and the methods used in each step were described, with emphasis on the tools available for each step, in order to promote text mining in the biomedical field.
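
As a schematic of the pipeline such a survey typically covers, the sketch below chains retrieval, preprocessing, and entity recognition with stub implementations; real systems would plug in tools such as PubMed E-utilities, spaCy, or MetaMap at the corresponding steps. All names and the toy lexicon are assumptions.

def retrieve(query):
    # Stand-in for document retrieval (e.g. via PubMed E-utilities).
    return ["Aspirin inhibits COX-1.", "Metformin lowers blood glucose."]

def preprocess(doc):
    # Stand-in for tokenization and normalization.
    return doc.lower().rstrip(".").split()

def recognize_entities(tokens):
    # Dictionary lookup as a stand-in for biomedical named entity recognition.
    lexicon = {"aspirin", "cox-1", "metformin", "glucose"}
    return [t for t in tokens if t in lexicon]

for doc in retrieve("drug targets"):
    print(recognize_entities(preprocess(doc)))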

4.
Journal of Medical Informatics ; (12): 7-12, 2015.
Article in Chinese | WPRIM | ID: wpr-476379

Abstract

The paper briefly introduces the concept and development of Electronic Medical Records (EMR), and elaborates on information extraction from EMR as well as the methods used to assist clinical decision-making, including machine learning, statistical learning, and rule induction. It describes the application of EMR-assisted clinical decision-making in diagnostic criteria identification and clinical diagnosis activities, and reflects on its evidence-based significance.
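
Of the method families named here, rule induction is the easiest to illustrate; the sketch below applies hand-written extraction rules to a toy clinical note. The patterns and field names are invented for illustration and are not from the reviewed systems.

import re

# Hand-written extraction rules; field names and patterns are illustrative.
RULES = {
    "diagnosis": re.compile(r"diagnosed with ([a-z0-9 ]+?)(?:\.|,| and)"),
    "medication": re.compile(r"prescribed ([a-z]+) (\d+) ?mg"),
}

note = "Patient diagnosed with type 2 diabetes, prescribed metformin 500 mg."

for field, pattern in RULES.items():
    match = pattern.search(note.lower())
    if match:
        print(field, "->", match.groups())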

5.
Subj. procesos cogn ; 14(2): 260-274, Dec. 2010.
Article in Spanish | LILACS | ID: lil-576374

Abstract



Natural language processing provides effective tools to help researchers cope with the growing body of scientific literature. One of the most successful and well-established applications is information extraction, i.e., the extraction of named entities and facts. This application, however, is not well suited to the social sciences, since the main messages of the publications are not facts but arguments. In this article we propose a natural language processing methodology to detect sentences that convey salient messages in social science research papers. We consider two sentence types that bear salient messages: sentences that sum up the entire article or parts of it, and sentences that convey research issues. Such sentences are detected using a dependency parser and special "concept-matching" rules. In a proof-of-concept experiment we showed the effectiveness of our proposition: searching for articles in the educational science document base built by the EERQI project, we found that the presence of the query word(s) in the salient sentences detected by our tool is an important indicator of the relevance of the article. We compared the relevance of the articles retrieved with our method against those retrieved by the Lucene search engine as configured for the EERQI content base, with the default relevance ranking based on word-frequency measures. The results are complementary, which points to the utility of integrating our tool into Lucene.
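
A minimal sketch of the described approach, assuming spaCy with the en_core_web_sm model as the dependency parser: simple concept-matching rules over the parse flag sentences that sum up the work or state research issues. The trigger lexicons and the two rules are invented stand-ins for the authors' rule set.

import spacy

nlp = spacy.load("en_core_web_sm")   # parser model must be installed

SUMMING_VERBS = {"summarize", "conclude", "argue", "propose", "show"}
ISSUE_NOUNS = {"question", "issue", "problem", "hypothesis"}

def is_salient(sent):
    for tok in sent:
        # Rule 1: a summing-up verb whose subject is the paper or the authors.
        if tok.lemma_ in SUMMING_VERBS and any(
                child.dep_ == "nsubj" and child.lemma_ in {"we", "paper", "study", "article"}
                for child in tok.children):
            return True
        # Rule 2: a research-issue noun governed by a verb like "address".
        if tok.lemma_ in ISSUE_NOUNS and tok.head.lemma_ in {"address", "raise", "examine"}:
            return True
    return False

doc = nlp("We argue that assessment practices shape learning. "
          "The study addresses the question of teacher autonomy.")
print([sent.text for sent in doc.sents if is_salient(sent)])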


Subjects
Social sciences, Information-seeking behavior, Psychology
6.
Journal of Korean Society of Medical Informatics ; : 247-254, 2009.
Article in English | WPRIM | ID: wpr-174587

Abstract

OBJECTIVE: An automatic detection tool was created for examining the quality of health-related webpages; we went further by examining its feasibility and performance. METHODS: We developed an automatic detection system to auto-assess the authorship quality indicator of health-related information webpages on governmental websites in Taiwan. The system integrates the Chinese word segmentation system developed by the Academia Sinica in Taiwan with SVMlight, which serves as an SVM (Support Vector Machine) classifier and a means of information extraction and identification. The system was coded in Visual Basic 6.0, using SQL 2000. RESULTS: We developed the first Chinese automatic webpage classifier and information identifier for evaluating the quality of web information. The sensitivity and specificity of the classifier on the training set of webpages were both as high as 100%, and only one health webpage in the test set was misclassified, because it contained both health and non-health content. The sensitivity of our authorship identifier is 75.3%, with a specificity of 87.9%. CONCLUSION: The technical feasibility of auto-assessing the quality of health information on the web is acceptable. Although this alone is not sufficient to assure the total quality of web content, it is good enough to support an entire quality assurance program.
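
A minimal sketch of the classification step, with scikit-learn standing in for SVMlight and plain whitespace tokens standing in for Academia Sinica's Chinese segmentation; sensitivity and specificity are computed from the confusion matrix. The toy pages and labels are invented.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import confusion_matrix
from sklearn.svm import LinearSVC

pages = ["flu vaccine schedule and clinic hours",
         "diabetes diet and blood sugar control",
         "city budget meeting minutes",
         "road construction traffic notice"]
labels = [1, 1, 0, 0]                     # 1 = health content, 0 = non-health

X = TfidfVectorizer().fit_transform(pages)
clf = LinearSVC().fit(X, labels)

# Sensitivity = TP/(TP+FN), specificity = TN/(TN+FP), here on the training set.
tn, fp, fn, tp = confusion_matrix(labels, clf.predict(X)).ravel()
print("sensitivity:", tp / (tp + fn), "specificity:", tn / (tn + fp))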


Subjects
Humans, Asians, Authorship, Health care quality indicators, Sensitivity and specificity, Taiwan
7.
The Korean Journal of Laboratory Medicine ; : 79-87, 2008.
Article in Korean | WPRIM | ID: wpr-219025

Abstract

BACKGROUND: Since the human genome project was completed in 2003, there have been numerous reports on cancers and related markers. This study aimed to develop a system that automatically extracts information on the relationships between cancers and tumor markers from the biomedical literature. METHODS: Named entities of tumor markers were recognized both by a dictionary-based method and by machine learning with a support vector machine. Named entities of cancers were recognized with the MeSH dictionary. RESULTS: Relational and filtering keywords were selected after annotating 160 abstracts from PubMed. Relational information was extracted only when one of the relational keywords occupied an appropriate position along the parse tree of a sentence containing both a tumor marker and a disease entity. The performance of the system was evaluated on another set of 77 abstracts. With the relational and filtering keywords used in the system, precision was 94.38% and recall was 66.14%; without the expert knowledge, precision was 49.16% and recall was 69.29%. CONCLUSIONS: We developed a system that can extract relational information between a tumor and its markers by incorporating expert knowledge. A system exploiting expert knowledge can serve as a reference when developing information extraction systems in other medical fields.
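
A minimal surface-level sketch of the extraction rule described above: a (marker, cancer) pair is emitted only when a relational keyword links the two entities in a sentence. The dictionaries and keywords are tiny invented samples, and the parse-tree position check the paper applies is omitted here.

MARKERS = {"CA 19-9", "CEA", "AFP"}
CANCERS = {"pancreatic cancer", "colorectal cancer", "hepatocellular carcinoma"}
RELATIONAL = {"elevated in", "marker for", "associated with"}

def extract(sentence):
    s = sentence.lower()
    for marker in MARKERS:
        for cancer in CANCERS:
            # Keep the pair only if a relational keyword also appears.
            if marker.lower() in s and cancer in s and any(k in s for k in RELATIONAL):
                yield (marker, cancer)

print(list(extract("Serum CA 19-9 is elevated in pancreatic cancer.")))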


Subjects
Humans, Algorithms, Database management systems, Medical informatics computing, Neoplasms/metabolism, Programming languages, PubMed, Software, Tumor biomarkers
8.
Journal of Korean Society of Medical Informatics ; : 267-274, 2008.
Article in English | WPRIM | ID: wpr-168683

Abstract

OBJECTIVE: Applications that extract medical information from electronic medical records (EMRs) confront serious obstacles such as spelling errors, ambiguous abbreviations, and unrecognizable words. These obstacles hinder the process of finding medical entities, relations, and events. We present an efficient EMR refinement system designed for medical information extraction from EMRs, not just for traditional text error correction. METHODS: The EMR refinement system was designed and implemented in the following steps: 1) build a domain-constrained dictionary database, 2) correct spelling errors in Korean-English EMR documents, 3) resolve ambiguous abbreviations in the bilingual documents. The resulting EMR documents are machine readable and can be used in various applications, including information extraction. RESULTS: The precision of the refinement system is 80.4% for spelling error correction and 94.7% for disambiguating abbreviations/acronyms. CONCLUSION: We developed an EMR refinement system that corrects spelling errors and resolves ambiguous abbreviations as well as unrecognizable words. Our system can enhance the reliability of medical records and contribute to further applications in text mining and information extraction.
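
A minimal sketch of the two refinement steps, assuming an English toy vocabulary in place of the bilingual Korean-English dictionary: edit-distance-style spelling correction against a domain dictionary, and abbreviation expansion chosen by context-word overlap. The dictionary, senses, and cue words are invented.

import difflib

VOCAB = ["hypertension", "metformin", "radiograph", "appendicitis"]
ABBREV = {"MS": {"mitral stenosis": {"valve", "murmur", "echo"},
                 "multiple sclerosis": {"lesion", "mri", "neurologic"}}}

def correct(word):
    # Closest dictionary entry by string similarity.
    match = difflib.get_close_matches(word.lower(), VOCAB, n=1, cutoff=0.8)
    return match[0] if match else word

def expand(abbr, context_words):
    # Pick the sense whose cue words overlap the surrounding context most.
    senses = ABBREV.get(abbr, {})
    return max(senses, key=lambda s: len(senses[s] & context_words), default=abbr)

print(correct("hypertension"))                    # -> hypertension
print(expand("MS", {"echo", "murmur", "noted"}))  # -> mitral stenosis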


Subjects
Data mining, Electronic health records, Electronics, Electrons, Medical records
9.
Journal of Korean Society of Medical Informatics ; : 57-70, 2005.
Article in Korean | WPRIM | ID: wpr-128499

Abstract

OBJECTIVE: Electronic Medical Records contain the majority of clinical data as unstructured text. The information in these textual documents can be stored in a conceptual format and used to support clinical care through text summarization. In this study, we present information extraction (IE) using Concept Nodes (CN), extraction rules expressed as case frames, applied to brain radiology reports from SNUH (Seoul National University Hospital) for summarization. METHOD: The following steps were performed: design of a conceptual model defining semantic entities as extraction templates for brain radiology reports, construction of a CN dictionary based on statistical syntactic patterns, and development of a parser to extract the relevant information according to the defined templates. RESULTS: Evaluation showed a 19% improvement in precision after post-processing to handle complex verb constructions, and 19.24~21.25% better semantic effectiveness when additional Korean nouns were extracted. Average precision was 85.18%, average recall was 93.71%, and the F-measure was 0.89. CONCLUSION: Our approach handles different languages within the same sentence. We expect this IE technology to summarize vast amounts of radiology text for clinical decision support systems effectively, and we hope this study helps the evolution of clinical data representation in Korean medical records and its integration into the EMR in the future.
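
A minimal sketch of case-frame extraction in the spirit of the Concept Node approach: a trigger word ("mass"/"lesion") fills a finding frame with optional size and location slots. The pattern and the English sample sentence are invented stand-ins for the Korean SNUH reports.

import re

# One concept node: the trigger "mass"/"lesion" opens a finding frame with
# optional size and location slots.
PATTERN = re.compile(
    r"(?P<size>\d+(?:\.\d+)?\s?(?:cm|mm))?\s*(?P<finding>mass|lesion)"
    r"(?:\s+in the\s+(?P<location>[a-z ]+?))?(?:\.|,|$)")

report = "A 2.3 cm mass in the left frontal lobe."
match = PATTERN.search(report.lower())
if match:
    frame = {slot: value for slot, value in match.groupdict().items() if value}
    print(frame)   # {'size': '2.3 cm', 'finding': 'mass', 'location': 'left frontal lobe'}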


Subjects
Brain, Electronic health records, Hope, Information storage and retrieval, Medical records, Semantics