Evaluation of Effectiveness of Pre-Training Method in Chest X-Ray Imaging Using Vision Transformer (preprint)

Kuniki Imagawa; Kohei Shiomoto

This article is a preprint

Preprints are preliminary research reports that have not been certified by peer review. They should not be relied on to guide clinical practice or health-related behavior and should not be reported in the media as established information.

Preprints posted online allow authors to receive rapid feedback, and the entire scientific community can evaluate the work independently and respond appropriately. These comments are published alongside the preprints for anyone to read and serve as a post-publication review.

ssrn; 2023.

Preprint in English | PREPRINT-SSRN | ID: ppzbmed-10.2139.ssrn.4507834

ABSTRACT

The limited availability of medical images is a major constraint on deep learning, which requires large amounts of data to perform well. To address this problem, transfer learning has become the de facto standard: convolutional neural networks (CNNs) are pre-trained on natural images, such as ImageNet, and then fine-tuned on medical images. Recently, vision transformers (ViT), which require large numbers of annotated medical images, have been studied from various perspectives. In this study, we investigated effective pre-training methods, specifically for ViT, by evaluating the binary classification of COVID-19 and normal chest X-ray images. The evaluation led to the following conclusions: (1) fine-tuning was more effective than feature extraction; (2) for fine-tuning, pre-training on natural images was more effective than pre-training on task-specific images, namely medical images; (3) models pre-trained on natural images learned more position embeddings (PEs) with long-range dependencies than models pre-trained on medical images; (4) ViT was more effective than CNNs when a large number of natural images was available for pre-training, and CNNs were more effective when that number was limited. These results suggest that, for the binary classification in this study, fine-tuning a ViT pre-trained on a large number of natural images gave the best discrimination performance.
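
The difference between the two transfer-learning strategies compared in conclusion (1) can be illustrated with a toy sketch. This is not the authors' ViT pipeline: the scalar `backbone` and `head` weights, the `train` function, and the four-point dataset are hypothetical stand-ins, assumed here only to show that feature extraction updates the classification head alone, whereas fine-tuning also updates the pre-trained weights.

```python
import math

# Hypothetical "pre-trained" backbone weight and a toy binary dataset (x, label).
PRETRAINED_BACKBONE = 0.5
DATA = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(backbone, head, data, train_backbone, lr=0.1, epochs=200):
    """Stochastic gradient descent on binary cross-entropy for
    p = sigmoid(head * backbone * x).

    train_backbone=False -> feature extraction (backbone frozen)
    train_backbone=True  -> fine-tuning (backbone and head both updated)
    """
    for _ in range(epochs):
        for x, y in data:
            feature = backbone * x           # output of the "backbone"
            p = sigmoid(head * feature)      # output of the "head"
            err = p - y                      # d(loss)/d(logit) for cross-entropy
            grad_head = err * feature
            grad_backbone = err * head * x
            head -= lr * grad_head
            if train_backbone:
                backbone -= lr * grad_backbone
    return backbone, head


# Feature extraction: only the head is trained.
fe_backbone, fe_head = train(PRETRAINED_BACKBONE, 0.0, DATA, train_backbone=False)

# Fine-tuning: the pre-trained backbone is updated together with the head.
ft_backbone, ft_head = train(PRETRAINED_BACKBONE, 0.0, DATA, train_backbone=True)

print("feature extraction:", fe_backbone, fe_head)
print("fine-tuning:       ", ft_backbone, ft_head)
```

In this sketch the frozen backbone keeps its pre-trained value exactly, while fine-tuning moves it away from that value. The toy says nothing about which strategy performs better on real chest X-ray data; that comparison is what the study itself evaluates.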

Subjects

COVID-19; Substance-Related Disorders

Full text: Available | Collections: Preprints | Database: PREPRINT-SSRN | Main subject: Substance-Related Disorders / COVID-19 | Language: English | Year of publication: 2023 | Document type: Preprint
