Advanced Sampling Technique in Radiology Free-Text Data for Efficiently Building Text Mining Models by Deep Learning in Vertebral Fracture.
Hung, Wei-Chieh; Lin, Yih-Lon; Lin, Chi-Wei; Chin, Wei-Leng; Wu, Chih-Hsing.
Affiliation
  • Hung WC; Department of Family and Community Medicine, E-Da Hospital, I-Shou University, Kaohsiung 82445, Taiwan.
  • Lin YL; School of Medicine, I-Shou University, Kaohsiung 84001, Taiwan.
  • Lin CW; Institute of Biotechnology and Chemical Engineering, I-Shou University, Kaohsiung 84001, Taiwan.
  • Chin WL; Department of Computer Science and Information Engineering, National Yunlin University of Science and Technology, Douliu 64002, Taiwan.
  • Wu CH; Department of Family and Community Medicine, E-Da Hospital, I-Shou University, Kaohsiung 82445, Taiwan.
Diagnostics (Basel); 14(2), 2024 Jan 08.
Article in English | MEDLINE | ID: mdl-38248014
ABSTRACT
This study aims to establish advanced sampling methods for free-text data so that semantic text mining models can be built efficiently with deep learning, for example to identify vertebral compression fracture (VCF) in radiology reports. We enrolled a total of 27,401 free-text radiology reports of X-ray examinations of the spine. Text mining models were built using supervised long short-term memory networks, each derived independently from one of four sampling methods (vector sum minimization, vector sum maximization, stratified sampling, and simple random sampling) at four fixed sampling ratios, and their predictive performance was compared. For each combination of sampling method and ratio, the drawn samples formed the training set and the remaining samples were used for validation. Predictive accuracy in identifying VCF was measured using the area under the receiver operating characteristic curve (AUROC). At sampling ratios of 1/10, 1/20, 1/30, and 1/40, vector sum minimization yielded the highest AUROC: 0.981 (95% CI 0.980-0.983), 0.963 (95% CI 0.961-0.965), 0.907 (95% CI 0.904-0.911), and 0.895 (95% CI 0.891-0.899), respectively. Vector sum maximization yielded the lowest AUROC. This study proposes vector sum minimization as an advanced sampling method for free-text data that allows text mining models to be built efficiently by drawing a small number of critical, representative samples.
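The two baseline sampling schemes and the evaluation metric named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the seed handling, and the rounding of sample sizes are assumptions made for illustration, and vector sum minimization and maximization are deliberately left out because the abstract does not define them. The sketch only shows how a fixed ratio of reports could be drawn for training, with the remainder held out for validation, and how AUROC can be computed as the probability that a positive report outscores a negative one.

```python
import random
from collections import defaultdict


def simple_random_split(n_items, ratio, seed=0):
    """Draw about n_items * ratio indices for training; the rest validate."""
    rng = random.Random(seed)
    k = max(1, round(n_items * ratio))
    train = set(rng.sample(range(n_items), k))
    valid = [i for i in range(n_items) if i not in train]
    return sorted(train), valid


def stratified_split(labels, ratio, seed=0):
    """Draw the same fraction from each label group (e.g. VCF vs. no VCF)."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for i, y in enumerate(labels):
        by_label[y].append(i)
    train = []
    for idx in by_label.values():
        k = max(1, round(len(idx) * ratio))
        train.extend(rng.sample(idx, k))
    train_set = set(train)
    valid = [i for i in range(len(labels)) if i not in train_set]
    return sorted(train), valid


def auroc(scores, labels):
    """AUROC via pairwise comparison; ties between a positive and a negative count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy labels: 1 = VCF reported, 0 = no VCF (not the study data).
    toy_labels = [1] * 100 + [0] * 900
    for ratio in (1 / 10, 1 / 20, 1 / 30, 1 / 40):
        train, valid = stratified_split(toy_labels, ratio)
        print(f"ratio {ratio:.3f}: {len(train)} training, {len(valid)} validation")
```

The pairwise AUROC is quadratic in the number of reports, which is fine for a toy example; a rank-based formulation would be a more practical choice at the scale of 27,401 reports.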
Full text: 1 Collection: 01-international Database: MEDLINE Study type: Prognostic_studies Language: En Journal: Diagnostics (Basel) Year: 2024 Document type: Article Affiliation country: Taiwan Country of publication: Switzerland
