Search | BVS Regional Portal (test)

1.

ConcVAE: Conceptual Representation Learning.

Togo, Ren; Nakagawa, Nao; Ogawa, Takahiro; Haseyama, Miki.

IEEE Trans Neural Netw Learn Syst ; PP, 2024 Jul 03.

Article in English | MEDLINE | ID: mdl-38959142

ABSTRACT

Disentangled representation learning aims at obtaining an independent latent representation without supervisory signals. However, the independence of a representation does not guarantee interpretability that matches human intuition in unsupervised settings. In this article, we introduce conceptual representation learning, an unsupervised strategy to learn a representation and its concepts. An antonym pair forms a concept, which determines the semantically meaningful axes in the latent space. Since the connection between signifying words and signified notions is arbitrary in natural languages, the verbalization of data features makes the representation make sense to humans. We thus construct Conceptual VAE (ConcVAE), a variational autoencoder (VAE)-based generative model with an explicit process in which the semantic representation of data is generated via trainable concepts. For visual data, ConcVAE utilizes natural language arbitrariness as an inductive bias for unsupervised learning by using vision-language pretraining, which can tell an unsupervised model what makes sense to humans. Qualitative and quantitative evaluations show that the conceptual inductive bias in ConcVAE effectively disentangles the latent representation in a sense-making manner without supervision. Code is available at https://github.com/ganmodokix/concvae.

2.

Multimodal Transformer Model Using Time-Series Data to Classify Winter Road Surface Conditions.

Moroto, Yuya; Maeda, Keisuke; Togo, Ren; Ogawa, Takahiro; Haseyama, Miki.

Sensors (Basel) ; 24(11), 2024 May 27.

Article in English | MEDLINE | ID: mdl-38894233

ABSTRACT

This paper proposes a multimodal Transformer model that uses time-series data to detect and predict winter road surface conditions. For detecting or predicting road surface conditions, the previous approach focuses on the cooperative use of multiple modalities as inputs, e.g., images captured by fixed-point cameras (road surface images) and auxiliary data related to road surface conditions under simple modality integration. Although such an approach achieves performance improvement compared to the method using only images or auxiliary data, there is a demand for further consideration of the way to integrate heterogeneous modalities. The proposed method realizes a more effective modality integration using a cross-attention mechanism and time-series processing. Concretely, when integrating multiple modalities, feature compensation through mutual complementation between modalities is realized through a feature integration technique based on a cross-attention mechanism, and the representational ability of the integrated features is enhanced. In addition, by introducing time-series processing for the input data across several timesteps, it is possible to consider the temporal changes in the road surface conditions. Experiments are conducted for both detection and prediction tasks using data corresponding to the current winter condition and data corresponding to a few hours after the current winter condition, respectively. The experimental results verify the effectiveness of the proposed method for both tasks. In addition to the construction of the classification model for winter road surface conditions, we first attempt to visualize the classification results, especially the prediction results, through the image style transfer model as supplemental extended experiments on image generation at the end of the paper.
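
The cross-attention integration mentioned above can be illustrated with a minimal sketch. This is generic scaled dot-product cross-attention, not the authors' model: the function names are hypothetical, the learned projection matrices of a real Transformer are omitted, and the assumption is that one modality (for example, image features) supplies the queries while the other (for example, auxiliary data) supplies the keys and values.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention where queries come from one modality
    and keys/values come from another (assumption: no learned projections).

    queries: list of vectors of length d
    keys:    list of vectors of length d
    values:  list of vectors (one per key)
    Returns one output vector per query.
    """
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


# Hypothetical toy features: one image token attends over two auxiliary tokens.
image_tokens = [[1.0, 0.0]]
aux_keys = [[1.0, 0.0], [1.0, 0.0]]
aux_values = [[0.0, 0.0], [2.0, 2.0]]
fused = cross_attention(image_tokens, aux_keys, aux_values)
```

Because both keys are identical in this toy input, the attention weights are equal and the fused vector is the average of the two value vectors.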

3.

A Novel Frame-Selection Metric for Video Inpainting to Enhance Urban Feature Extraction.

Feng, Yuhu; Zhang, Jiahuan; Li, Guang; Togo, Ren; Maeda, Keisuke; Ogawa, Takahiro; Haseyama, Miki.

Sensors (Basel) ; 24(10), 2024 May 10.

Article in English | MEDLINE | ID: mdl-38793890

ABSTRACT

In our digitally driven society, advances in software and hardware to capture video data allow extensive gathering and analysis of large datasets. This has stimulated interest in extracting information from video data, such as buildings and urban streets, to enhance understanding of the environment. Urban buildings and streets, as essential parts of cities, carry valuable information relevant to daily life. Extracting features from these elements and integrating them with technologies such as VR and AR can contribute to more intelligent and personalized urban public services. Despite its potential benefits, collecting videos of urban environments introduces challenges because of the presence of dynamic objects. The varying shape of the target building in each frame necessitates careful frame selection to ensure the extraction of quality features. To address this problem, we propose a novel evaluation metric that considers the video-inpainting-restoration quality and the relevance of the target object, specifically by minimizing areas with cars, maximizing areas with the target building, and minimizing overlapping areas. This metric extends existing video-inpainting-evaluation metrics by considering the relevance of the target object and interconnectivity between objects. We conducted experiments to validate the proposed metric using real-world datasets from the Japanese cities of Sapporo and Yokohama. The experimental results demonstrate the feasibility of selecting video frames conducive to building feature extraction.

4.

Analysis of Continual Learning Techniques for Image Generative Models with Learned Class Information Management.

Togo, Taro; Togo, Ren; Maeda, Keisuke; Ogawa, Takahiro; Haseyama, Miki.

Sensors (Basel) ; 24(10), 2024 May 13.

Article in English | MEDLINE | ID: mdl-38793943

ABSTRACT

The advancements in deep learning have significantly enhanced the capability of image generation models to produce images aligned with human intentions. However, training and adapting these models to new data and tasks remain challenging because of their complexity and the risk of catastrophic forgetting. This study proposes a method for addressing these challenges involving the application of class-replacement techniques within a continual learning framework. This method utilizes selective amnesia (SA) to efficiently replace existing classes with new ones while retaining crucial information. This approach improves the model's adaptability to evolving data environments while preventing the loss of past information. We conducted a detailed evaluation of class-replacement techniques, examining their impact on the "class incremental learning" performance of models and exploring their applicability in various scenarios. The experimental results demonstrated that our proposed method could enhance the learning efficiency and long-term performance of image generation models. This study broadens the application scope of image generation technology and supports the continual improvement and adaptability of corresponding models.

5.

Importance-aware adaptive dataset distillation.

Li, Guang; Togo, Ren; Ogawa, Takahiro; Haseyama, Miki.

Neural Netw ; 172: 106154, 2024 Apr.

Article in English | MEDLINE | ID: mdl-38309137

ABSTRACT

Herein, we propose a novel dataset distillation method for constructing small informative datasets that preserve the information of the large original datasets. The development of deep learning models is enabled by the availability of large-scale datasets. Despite unprecedented success, large-scale datasets considerably increase the storage and transmission costs, resulting in a cumbersome model training process. Moreover, using raw data for training raises privacy and copyright concerns. To address these issues, a new task named dataset distillation has been introduced, aiming to synthesize a compact dataset that retains the essential information from the large original dataset. State-of-the-art (SOTA) dataset distillation methods have been proposed by matching gradients or network parameters obtained during training on real and synthetic datasets. The contribution of different network parameters to the distillation process varies, and uniformly treating them leads to degraded distillation performance. Based on this observation, we propose an importance-aware adaptive dataset distillation (IADD) method that can improve distillation performance by automatically assigning importance weights to different network parameters during distillation, thereby synthesizing more robust distilled datasets. IADD demonstrates superior performance over other SOTA dataset distillation methods based on parameter matching on multiple benchmark datasets and outperforms them in terms of cross-architecture generalization. In addition, the analysis of self-adaptive weights demonstrates the effectiveness of IADD. Furthermore, the effectiveness of IADD is validated in a real-world medical application such as COVID-19 detection.

Subjects

COVID-19 , Distillation , Humans , Benchmarking , Psychological Generalization , Privacy

6.

Text-Guided Image Editing Based on Post Score for Gaining Attention on Social Media.

Watanabe, Yuto; Togo, Ren; Maeda, Keisuke; Ogawa, Takahiro; Haseyama, Miki.

Sensors (Basel) ; 24(3), 2024 Jan 31.

Article in English | MEDLINE | ID: mdl-38339636

ABSTRACT

Text-guided image editing has been highlighted in the fields of computer vision and natural language processing in recent years. The approach takes an image and text prompt as input and aims to edit the image in accordance with the text prompt while preserving text-unrelated regions. The results of text-guided image editing differ depending on the way the text prompt is represented, even if it has the same meaning. It is up to the user to decide which result best matches the intended use of the edited image. This paper assumes a situation in which edited images are posted to social media and proposes a novel text-guided image editing method to help the edited images gain attention from a greater audience. In the proposed method, we apply the pre-trained text-guided image editing method and obtain multiple edited images from the multiple text prompts generated from a large language model. The proposed method leverages the novel model that predicts post scores representing engagement rates and selects one image that will gain the most attention from the audience on social media among these edited images. Subject experiments on a dataset of real Instagram posts demonstrate that the edited images of the proposed method accurately reflect the content of the text prompts and provide a positive impression to the audience on social media compared to those of previous text-guided image editing methods.

Subjects

Social Media , Humans , Language , Natural Language Processing

7.

Zero-Shot Traffic Sign Recognition Based on Midlevel Feature Matching.

Gan, Yaozong; Li, Guang; Togo, Ren; Maeda, Keisuke; Ogawa, Takahiro; Haseyama, Miki.

Sensors (Basel) ; 23(23), 2023 Dec 04.

Article in English | MEDLINE | ID: mdl-38067982

ABSTRACT

Traffic sign recognition is a complex and challenging yet popular problem that can assist drivers on the road and reduce traffic accidents. Most existing methods for traffic sign recognition use convolutional neural networks (CNNs) and can achieve high recognition accuracy. However, these methods first require a large number of carefully crafted traffic sign datasets for the training process. Moreover, since traffic signs differ in each country and there is a variety of traffic signs, these methods need to be fine-tuned when recognizing new traffic sign categories. To address these issues, we propose a traffic sign matching method for zero-shot recognition. Our proposed method can perform traffic sign recognition without training data by directly matching the similarity of target and template traffic sign images. Our method uses the midlevel features of CNNs to obtain robust feature representations of traffic signs without additional training or fine-tuning. We discovered that midlevel features improve the accuracy of zero-shot traffic sign recognition. The proposed method achieves promising recognition results on the German Traffic Sign Recognition Benchmark open dataset and a real-world dataset taken from Sapporo City, Japan.
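
The matching step described above can be sketched as a nearest-template lookup. This is a minimal illustration under stated assumptions, not the paper's implementation: the feature vectors are assumed to have been extracted already (for example, from an intermediate CNN layer, which is not shown), cosine similarity is an assumed choice of similarity measure, and `match_template` and the labels are hypothetical names.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_template(target_feature, template_features):
    """Return the label of the template whose feature is most similar to the target.

    template_features: dict mapping a class label to its feature vector.
    No training is involved; recognition reduces to a similarity search.
    """
    return max(
        template_features,
        key=lambda label: cosine_similarity(target_feature, template_features[label]),
    )


# Hypothetical, hand-made feature vectors standing in for mid-level CNN features.
templates = {"stop": [1.0, 0.0, 0.0], "yield": [0.0, 1.0, 0.0]}
predicted = match_template([0.9, 0.1, 0.0], templates)
```

Adding a new sign category in this sketch only requires adding one template entry, which is the property that makes the approach zero-shot.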

8.

Manipulation Direction: Evaluating Text-Guided Image Manipulation Based on Similarity between Changes in Image and Text Modalities.

Watanabe, Yuto; Togo, Ren; Maeda, Keisuke; Ogawa, Takahiro; Haseyama, Miki.

Sensors (Basel) ; 23(22), 2023 Nov 20.

Article in English | MEDLINE | ID: mdl-38005673

ABSTRACT

At present, text-guided image manipulation is a notable subject of study in the vision and language field. Given an image and text as inputs, these methods aim to manipulate the image according to the text, while preserving text-irrelevant regions. Although there has been extensive research to improve the versatility and performance of text-guided image manipulation, research on its performance evaluation is inadequate. This study proposes Manipulation Direction (MD), a logical and robust metric, which evaluates the performance of text-guided image manipulation by focusing on changes between image and text modalities. Specifically, we define MD as the consistency of changes between images and texts occurring before and after manipulation. By using MD to evaluate the performance of text-guided image manipulation, we can comprehensively evaluate how an image has changed before and after the image manipulation and whether this change agrees with the text. Extensive experiments on Multi-Modal-CelebA-HQ and Caltech-UCSD Birds confirmed that there was an impressive correlation between our calculated MD scores and subjective scores for the manipulated images compared to the existing metrics.
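
The abstract does not give the formula for MD, so the following is only a rough sketch of one way to measure "consistency of changes" between modalities. It assumes that images and texts are embedded in a shared vector space (the embedding model is not shown) and that consistency is measured as the cosine similarity between the image change vector and the text change vector. The function name `direction_agreement` is hypothetical and is not claimed to reproduce the paper's metric.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def change_vector(before, after):
    """Element-wise difference after - before."""
    return [a - b for a, b in zip(after, before)]


def direction_agreement(img_before, img_after, txt_before, txt_after):
    """Cosine similarity between the image change and the text change.

    Assumption: all four inputs are embeddings in the same shared space.
    A value near 1 means the image moved in the direction the text asked for.
    """
    return cosine_similarity(
        change_vector(img_before, img_after),
        change_vector(txt_before, txt_after),
    )


# Hypothetical 2-D embeddings: the image changes along the same axis as the text.
score = direction_agreement([0.0, 0.0], [1.0, 0.0], [0.2, 0.5], [0.7, 0.5])
```

A score computed this way ignores how far the image moved and looks only at the direction, so it would need to be paired with a check that text-irrelevant regions are preserved.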

9.

Off-Screen Sound Separation Based on Audio-visual Pre-training Using Binaural Audio.

Yoshida, Masaki; Togo, Ren; Ogawa, Takahiro; Haseyama, Miki.

Sensors (Basel) ; 23(9), 2023 May 07.

Article in English | MEDLINE | ID: mdl-37177744

ABSTRACT

This study proposes a novel off-screen sound separation method based on audio-visual pre-training. In the field of audio-visual analysis, researchers have leveraged visual information for audio manipulation tasks, such as sound source separation. Although such audio manipulation tasks are based on correspondences between audio and video, these correspondences are not always established. Specifically, sounds coming from outside a screen have no audio-visual correspondences and thus interfere with conventional audio-visual learning. The proposed method separates such off-screen sounds based on their arrival directions using binaural audio, which provides us with three-dimensional sensation. Furthermore, we propose a new pre-training method that can consider the off-screen space and use the obtained representation to improve off-screen sound separation. Consequently, the proposed method can separate off-screen sounds irrespective of the direction from which they arrive. We conducted our evaluation using generated video data to circumvent the problem of difficulty in collecting ground truth for off-screen sounds. We confirmed the effectiveness of our methods through off-screen sound detection and separation tasks.

10.

Self-supervised learning for gastritis detection with gastric X-ray images.

Li, Guang; Togo, Ren; Ogawa, Takahiro; Haseyama, Miki.

Int J Comput Assist Radiol Surg ; 18(10): 1841-1848, 2023 Oct.

Article in English | MEDLINE | ID: mdl-37040011

ABSTRACT

PURPOSE: Manual annotation of gastric X-ray images by doctors for gastritis detection is time-consuming and expensive. To solve this, a self-supervised learning method is developed in this study. The effectiveness of the proposed self-supervised learning method in gastritis detection is verified using a few annotated gastric X-ray images. METHODS: In this study, we develop a novel method that can perform explicit self-supervised learning and learn discriminative representations from gastric X-ray images. Models trained based on the proposed method were fine-tuned on datasets comprising a few annotated gastric X-ray images. Five self-supervised learning methods, i.e., SimSiam, BYOL, PIRL-jigsaw, PIRL-rotation, and SimCLR, were compared with the proposed method. Furthermore, three previous methods, one pretrained on ImageNet, one trained from scratch, and one semi-supervised learning method, were compared with the proposed method. RESULTS: The proposed method's harmonic mean scores of sensitivity and specificity after fine-tuning with the annotated data of 10, 20, 30, and 40 patients were 0.875, 0.911, 0.915, and 0.931, respectively. The proposed method outperformed all comparative methods, including the five self-supervised learning and three previous methods. Experimental results showed the effectiveness of the proposed method in gastritis detection using a few annotated gastric X-ray images. CONCLUSIONS: This paper proposes a novel self-supervised learning method based on a teacher-student architecture for gastritis detection using gastric X-ray images. The proposed method can perform explicit self-supervised learning and learn discriminative representations from gastric X-ray images. The proposed method exhibits potential clinical use in gastritis detection using a few annotated gastric X-ray images.
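
For readers unfamiliar with the reported score, the harmonic mean of sensitivity and specificity can be computed as in the sketch below. The formula is the standard harmonic mean of two values; the confusion-matrix counts in the example are made up for illustration and are not taken from the paper.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


def harmonic_mean(sensitivity, specificity):
    """Harmonic mean of sensitivity and specificity; 0.0 when both are 0."""
    total = sensitivity + specificity
    if total == 0.0:
        return 0.0
    return 2.0 * sensitivity * specificity / total


# Hypothetical counts (not from the paper): 50 positive and 50 negative cases.
sens, spec = sensitivity_specificity(tp=45, fn=5, tn=40, fp=10)
hm = harmonic_mean(sens, spec)  # sens = 0.9, spec = 0.8, hm is about 0.847
```

The harmonic mean is pulled toward the lower of the two values, so a model cannot score well by maximizing sensitivity at the expense of specificity or vice versa.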

Subjects

Gastritis , Humans , X-Rays , Gastritis/diagnostic imaging , Rotation , Supervised Machine Learning

11.

Boosting automatic COVID-19 detection performance with self-supervised learning and batch knowledge ensembling.

Li, Guang; Togo, Ren; Ogawa, Takahiro; Haseyama, Miki.

Comput Biol Med ; 158: 106877, 2023 May.

Article in English | MEDLINE | ID: mdl-37019015

ABSTRACT

PROBLEM: Detecting COVID-19 from chest X-ray (CXR) images has become one of the fastest and easiest methods for detecting COVID-19. However, the existing methods usually use supervised transfer learning from natural images as a pretraining process. These methods do not consider the unique features of COVID-19 and the similar features between COVID-19 and other types of pneumonia. AIM: In this paper, we aim to design a novel high-accuracy COVID-19 detection method that uses CXR images and can consider the unique features of COVID-19 and the similar features between COVID-19 and other types of pneumonia. METHODS: Our method consists of two phases. One is self-supervised learning-based pretraining; the other is batch knowledge ensembling-based fine-tuning. Self-supervised learning-based pretraining can learn distinguished representations from CXR images without manually annotated labels. On the other hand, batch knowledge ensembling-based fine-tuning can utilize category knowledge of images in a batch according to their visual feature similarities to improve detection performance. Unlike our previous implementation, we introduce batch knowledge ensembling into the fine-tuning phase, reducing the memory used in self-supervised learning and improving COVID-19 detection accuracy. RESULTS: On two public COVID-19 CXR datasets, namely, a large dataset and an unbalanced dataset, our method exhibited promising COVID-19 detection performance. Our method maintains high detection accuracy even when annotated CXR training images are reduced significantly (e.g., using only 10% of the original dataset). In addition, our method is insensitive to changes in hyperparameters. CONCLUSION: The proposed method outperforms other state-of-the-art COVID-19 detection methods in different settings. Our method can reduce the workloads of healthcare providers and radiologists.

Subjects

COVID-19 , Humans , COVID-19/diagnostic imaging , Radiologists , Thorax , Upper Extremity , Supervised Machine Learning

12.

Diversity Learning Based on Multi-Latent Space for Medical Image Visual Question Generation.

Zhu, He; Togo, Ren; Ogawa, Takahiro; Haseyama, Miki.

Sensors (Basel) ; 23(3), 2023 Jan 17.

Article in English | MEDLINE | ID: mdl-36772095

ABSTRACT

Auxiliary clinical diagnosis has been researched to solve unevenly and insufficiently distributed clinical resources. However, auxiliary diagnosis is still dominated by human physicians, and how to make intelligent systems more involved in the diagnosis process is gradually becoming a concern. An interactive automated clinical diagnosis with a question-answering system and a question generation system can capture a patient's conditions from multiple perspectives with less physician involvement by asking different questions to drive and guide the diagnosis. This clinical diagnosis process requires diverse information to evaluate a patient from different perspectives to obtain an accurate diagnosis. Recently proposed medical question generation systems have not considered diversity. Thus, we propose a diversity learning-based visual question generation model using a multi-latent space to generate informative question sets from medical images. The proposed method generates various questions by embedding visual and language information in different latent spaces, whose diversity is trained by our newly proposed loss. We have also added control over the categories of generated questions, making the generated questions directional. Furthermore, we use a new metric named similarity to accurately evaluate the proposed model's performance. The experimental results on the Slake and VQA-RAD datasets demonstrate that the proposed method can generate questions with diverse information. Our model works with an answering model for interactive automated clinical diagnosis and generates datasets to replace the process of annotation that incurs huge labor costs.

Subjects

Natural Language Processing , Semantics , Humans , Language

13.

COVID-19 detection based on self-supervised transfer learning using chest X-ray images.

Li, Guang; Togo, Ren; Ogawa, Takahiro; Haseyama, Miki.

Int J Comput Assist Radiol Surg ; 18(4): 715-722, 2023 Apr.

Article in English | MEDLINE | ID: mdl-36538184

ABSTRACT

PURPOSE: Considering the number of patients screened due to the COVID-19 pandemic, computer-aided detection has strong potential in assisting clinical workflow efficiency and reducing the incidence of infections among radiologists and healthcare providers. Since many confirmed COVID-19 cases present radiological findings of pneumonia, radiologic examinations can be useful for fast detection. Therefore, chest radiography can be used for fast screening of COVID-19 during patient triage, thereby determining the priority of patients' care to help saturated medical facilities in a pandemic situation. METHODS: In this paper, we propose a new learning scheme called self-supervised transfer learning for detecting COVID-19 from chest X-ray (CXR) images. We compared six self-supervised learning (SSL) methods (Cross, BYOL, SimSiam, SimCLR, PIRL-jigsaw, and PIRL-rotation) with the proposed method. Additionally, we compared six pretrained DCNNs (ResNet18, ResNet50, ResNet101, CheXNet, DenseNet201, and InceptionV3) with the proposed method. We provide quantitative evaluation on the largest open COVID-19 CXR dataset and qualitative results for visual inspection. RESULTS: Our method achieved a harmonic mean (HM) score of 0.985, AUC of 0.999, and four-class accuracy of 0.953. We also used the visualization technique Grad-CAM++ to generate visual explanations of different classes of CXR images with the proposed method to increase the interpretability. CONCLUSIONS: Our method shows that the knowledge learned from natural images using transfer learning is beneficial for SSL of the CXR images and boosts the performance of representation learning for COVID-19 detection. Our method promises to reduce the incidence of infections among radiologists and healthcare providers.

Subjects

COVID-19 , Humans , COVID-19/diagnostic imaging , Pandemics , X-Rays , Thorax , Machine Learning

14.

Trial Analysis of the Relationship between Taste and Biological Information Obtained While Eating Strawberries for Sensory Evaluation.

Maeda, Keisuke; Togo, Ren; Ogawa, Takahiro; Adachi, Shin-Ichi; Yoshizawa, Fumiaki; Haseyama, Miki.

Sensors (Basel) ; 22(23), 2022 Dec 05.

Article in English | MEDLINE | ID: mdl-36502199

ABSTRACT

This paper presents a trial analysis of the relationship between taste and biological information obtained while eating strawberries (for a sensory evaluation). This study used the visual analog scale (VAS); we collected questionnaires used in previous studies and human brain activity obtained while eating strawberries. In our analysis, we assumed that brain activity is highly correlated with taste. Then, the relationships between brain activity and other data, such as VAS and questionnaires, could be analyzed through a canonical correlation analysis, which is a multivariate analysis. Through an analysis of brain activity, the potential relationship with "taste" (that is not revealed by the initial simple correlation analysis) can be discovered. This is the main contribution of this study. In the experiments, we discovered the potential relationship between cultural factors (in the questionnaires) and taste. We also found a strong relationship between taste and individual information. In particular, the analysis of cross-loading between brain activity and individual information suggests that acidity and the sugar-to-acid ratio are related to taste.

Subjects

Fragaria , Humans , Fruit

15.

Distress Detection in Subway Tunnel Images via Data Augmentation Based on Selective Image Cropping and Patching.

Maeda, Keisuke; Takada, Saya; Haruyama, Tomoki; Togo, Ren; Ogawa, Takahiro; Haseyama, Miki.

Sensors (Basel) ; 22(22), 2022 Nov 18.

Article in English | MEDLINE | ID: mdl-36433529

ABSTRACT

Distresses, such as cracks, directly reflect the structural integrity of subway tunnels. Therefore, the detection of subway tunnel distress is an essential task in tunnel structure maintenance. This paper presents the performance improvement of deep learning-based distress detection to support the maintenance of subway tunnels through a new data augmentation method, selective image cropping and patching (SICAP). Specifically, we generate effective data for training the distress detection model by focusing on the distressed regions via SICAP. After the data augmentation, we train a distress detection model using the expanded training data. The new image generated based on SICAP does not change the pixel values of the original image. Thus, there is little loss of information, and the generated images are effective in constructing a robust model for various subway tunnel lines. We conducted experiments with some comparative methods. The experimental results show that the detection performance can be improved by our data augmentation.

Subjects

Railroads

16.

Compressed gastric image generation based on soft-label dataset distillation for medical data sharing.

Li, Guang; Togo, Ren; Ogawa, Takahiro; Haseyama, Miki.

Comput Methods Programs Biomed ; 227: 107189, 2022 Dec.

Article in English | MEDLINE | ID: mdl-36323177

ABSTRACT

BACKGROUND AND OBJECTIVE: Sharing of medical data is required to enable the cross-agency flow of healthcare information and construct high-accuracy computer-aided diagnosis systems. However, the large sizes of medical datasets, the massive amount of memory of saved deep convolutional neural network (DCNN) models, and patients' privacy protection are problems that can lead to inefficient medical data sharing. Therefore, this study proposes a novel soft-label dataset distillation method for medical data sharing. METHODS: The proposed method distills valid information of medical image data and generates several compressed images with different data distributions for anonymous medical data sharing. Furthermore, our method can extract essential weights of DCNN models to reduce the memory required to save trained models for efficient medical data sharing. RESULTS: The proposed method can compress tens of thousands of images into several soft-label images and reduce the size of a trained model to a few hundredths of its original size. The compressed images obtained after distillation have been visually anonymized; therefore, they do not contain the private information of the patients. Furthermore, we can realize high-detection performance with a small number of compressed images. CONCLUSIONS: The experimental results show that the proposed method can improve the efficiency and security of medical data sharing.

Subjects

Distillation , Neural Networks (Computer) , Humans , Privacy , Information Dissemination , Delivery of Health Care

17.

Controllable Music Playlist Generation Based on Knowledge Graph and Reinforcement Learning.

Sakurai, Keigo; Togo, Ren; Ogawa, Takahiro; Haseyama, Miki.

Sensors (Basel) ; 22(10), 2022 May 13.

Article in English | MEDLINE | ID: mdl-35632130

ABSTRACT

In this study, we propose a novel music playlist generation method based on a knowledge graph and reinforcement learning. The development of music streaming platforms has transformed the social dynamics of music consumption and paved a new way of accessing and listening to music. Playlist generation is one of the most important multimedia techniques; it aims to recommend music tracks by sensing the vast amount of musical data and the users' listening histories from music streaming services. Conventional playlist generation methods have difficulty capturing the target users' long-term preferences. To overcome this difficulty, we use a reinforcement learning algorithm that can consider the target users' long-term preferences. Furthermore, we introduce the following two new ideas: using informative knowledge graph data to promote efficient optimization through reinforcement learning, and setting a flexible reward function, whose parameters target users can design themselves, to guide them to new types of music tracks. We confirm the effectiveness of the proposed method by verifying the prediction performance based on listening history and the guidance performance to music tracks that can satisfy the target user's unique preference.

Subjects

Music , Auditory Perception , Knowledge , Automated Pattern Recognition , Reward

18.

Defect Detection of Subway Tunnels Using Advanced U-Net Network.

Wang, An; Togo, Ren; Ogawa, Takahiro; Haseyama, Miki.

Sensors (Basel) ; 22(6), 2022 Mar 17.

Article in English | MEDLINE | ID: mdl-35336501

ABSTRACT

In this paper, we present a novel defect detection model based on an improved U-Net architecture. As a semantic segmentation task, the defect detection task has the problems of background-foreground imbalance, multi-scale targets, and feature similarity between the background and defects in the real-world data. Conventionally, general convolutional neural network (CNN)-based networks mainly focus on natural image tasks, which are insensitive to the problems in our task. The proposed method has a network design for multi-scale segmentation based on the U-Net architecture including an atrous spatial pyramid pooling (ASPP) module and an inception module, and can detect various types of defects compared to conventional simple CNN-based methods. Through the experiments using a real-world subway tunnel image dataset, the proposed method showed a better performance than that of general semantic segmentation including state-of-the-art methods. Additionally, we showed that our method can achieve excellent detection balance among multi-scale defects.

Subjects

Computer-Assisted Image Processing , Railroads , Algorithms , Computer-Assisted Image Processing/methods , Neural Networks (Computer)

19.

Rubber Material Property Prediction Using Electron Microscope Images of Internal Structures Taken under Multiple Conditions.

Togo, Ren; Saito, Naoki; Maeda, Keisuke; Ogawa, Takahiro; Haseyama, Miki.

Sensors (Basel) ; 21(6)2021 Mar 16.

Article in English | MEDLINE | ID: mdl-33809765

ABSTRACT

This paper presents a method for predicting the properties of rubber materials from electron microscope images of their internal structures taken under multiple conditions. Electron microscope images of rubber materials are taken under several conditions, and the conditions that are effective for property prediction differ for each rubber material. The proposed method uses novel approaches for selecting and integrating reliable prediction results. It selects reliable results based on prediction intervals derived from predictors, each of which is constructed from electron microscope images taken under one condition. By monitoring the relationship between prediction results and the prediction intervals derived from the corresponding predictors, it can be determined whether the target prediction results are reliable. Furthermore, the proposed method integrates the selected reliable results based on Dempster-Shafer (DS) evidence theory, and this integration result is regarded as the final prediction result. DS evidence theory enables integration of multiple prediction results even if the results are obtained under different imaging conditions. This means that integration is possible even if the electron microscope images of each material are taken under different conditions and even if these conditions differ among target materials. This nonconventional approach is suitable for our application, i.e., property prediction. In experiments on rubber material data, the proposed method achieved a mean absolute percent error (MAPE) under 10% and outperformed conventional comparative property estimation methods. Consequently, the proposed method can realize accurate property prediction that takes into account the characteristics of electron microscope images described above.
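
The abstract of record 19 integrates prediction results with Dempster-Shafer evidence theory. As a minimal sketch of the standard Dempster rule of combination, the code below merges two mass functions over a small frame of discernment. How the paper maps predictor outputs to masses is not stated in the abstract, so the hypotheses `low` and `high` and the mass values are purely hypothetical.

```python
from typing import Dict, FrozenSet

Mass = Dict[FrozenSet[str], float]


def combine(m1: Mass, m2: Mass) -> Mass:
    """Dempster's rule: multiply masses, keep non-empty intersections, renormalize."""
    combined: Mass = {}
    conflict = 0.0
    for a, wa in m1.items():
        for b, wb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + wa * wb
            else:
                conflict += wa * wb  # mass assigned to contradictory evidence
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; combination is undefined")
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


if __name__ == "__main__":
    LOW, HIGH = frozenset({"low"}), frozenset({"high"})
    EITHER = LOW | HIGH  # mass on the whole frame expresses ignorance
    # Hypothetical evidence from predictors built under two imaging conditions.
    m_cond1 = {LOW: 0.6, EITHER: 0.4}
    m_cond2 = {LOW: 0.5, HIGH: 0.3, EITHER: 0.2}
    for hypothesis, mass in combine(m_cond1, m_cond2).items():
        print(sorted(hypothesis), round(mass, 4))
```

Because each source can place mass on the whole frame, a predictor that is uninformative under one imaging condition does not block the combination, which fits the abstract's point about integrating results from differing conditions.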

20.

Preliminary study of AI-assisted diagnosis using FDG-PET/CT for axillary lymph node metastasis in patients with breast cancer.

Li, Zongyao; Kitajima, Kazuhiro; Hirata, Kenji; Togo, Ren; Takenaka, Junki; Miyoshi, Yasuo; Kudo, Kohsuke; Ogawa, Takahiro; Haseyama, Miki.

EJNMMI Res ; 11(1): 10, 2021 Jan 25.

Article in English | MEDLINE | ID: mdl-33492478

ABSTRACT

BACKGROUND: To improve the diagnostic accuracy of axillary lymph node (LN) metastasis in breast cancer patients using 2-[18F]FDG-PET/CT, we constructed an artificial intelligence (AI)-assisted diagnosis system that uses deep-learning technologies. MATERIALS AND METHODS: Two clinicians and the new AI system retrospectively analyzed and diagnosed 414 axillae of 407 patients with biopsy-proven breast cancer who had undergone 2-[18F]FDG-PET/CT before a mastectomy or breast-conserving surgery with a sentinel lymph node (LN) biopsy and/or axillary LN dissection. We designed and trained a deep 3D convolutional neural network (CNN) as the AI model. The diagnoses from the clinicians were blended with the diagnoses from the AI model to improve the diagnostic accuracy. RESULTS: Although the AI model did not outperform the clinicians, the diagnostic accuracies of the clinicians were considerably improved by collaborating with the AI model: the two clinicians' sensitivities of 59.8% and 57.4% increased to 68.6% and 64.2%, respectively, whereas the clinicians' specificities of 99.0% and 99.5% remained unchanged. CONCLUSIONS: It is expected that AI using deep-learning technologies will be useful in diagnosing axillary LN metastasis using 2-[18F]FDG-PET/CT. Even if the diagnostic performance of AI is not better than that of clinicians, taking AI diagnoses into consideration may positively impact the overall diagnostic accuracy.
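
The results in the abstract of record 20 are reported as sensitivity and specificity before and after blending clinician and AI diagnoses. The sketch below shows how those two metrics are computed and one simple way scores could be blended. The weighted average, the 0.5 threshold, and all scores are made-up toy assumptions, not the study's data or its blending procedure.

```python
from typing import List, Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels where 1 = metastasis."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def blend(clinician: Sequence[float], ai: Sequence[float],
          weight: float = 0.5, threshold: float = 0.5) -> List[int]:
    """Hypothetical blending: weighted average of two scores, then a threshold."""
    return [1 if (1 - weight) * c + weight * a >= threshold else 0
            for c, a in zip(clinician, ai)]


if __name__ == "__main__":
    # Toy values for illustration only.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    clinician = [0.9, 0.8, 0.4, 0.3, 0.1, 0.2, 0.1, 0.3]
    ai = [0.7, 0.6, 0.8, 0.4, 0.2, 0.1, 0.3, 0.2]
    alone = [1 if c >= 0.5 else 0 for c in clinician]
    print("clinician alone:", sensitivity_specificity(y_true, alone))
    print("blended:", sensitivity_specificity(y_true, blend(clinician, ai)))
```

In this toy case the blended decision recovers one positive case the clinician score missed without adding false positives, the same kind of pattern the abstract reports (higher sensitivity, unchanged specificity).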
