Results 1 - 20 of 125
1.
Sensors (Basel) ; 24(13)2024 Jun 29.
Article in English | MEDLINE | ID: mdl-39001006

ABSTRACT

Infrared small target detection technology plays a crucial role in various fields such as military reconnaissance, power patrol, medical diagnosis, and security. The advancement of deep learning has led to the success of convolutional neural networks in target segmentation. However, due to challenges like small target scales, weak signals, and strong background interference in infrared images, convolutional neural networks often suffer from missed and false detections in small target segmentation tasks. To address this, an enhanced U-Net method called MST-UNet is proposed, which combines multi-scale feature decomposition and fusion with attention mechanisms. The method uses the Haar wavelet transform instead of max pooling for downsampling in the encoder to minimize feature loss and enhance feature utilization. Additionally, a multi-scale residual unit is introduced to extract contextual information at different scales, enlarging the receptive field and improving feature expression. The inclusion of a triple attention mechanism in the encoder structure further enhances multidimensional information utilization and feature recovery by the decoder. Experimental analysis on the NUDT-SIRST dataset demonstrates that the proposed method significantly improves target contour accuracy and segmentation precision, achieving IoU and nIoU values of 80.09% and 80.19%, respectively.
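
To illustrate why wavelet downsampling loses less information than max pooling, the following is a minimal sketch of one level of an orthonormal 2D Haar transform in plain Python. It is not the MST-UNet implementation; the function name `haar_downsample` and the sub-band names are assumptions made for illustration. Each 2x2 block is mapped to four half-resolution coefficients, so the input can be recovered exactly, whereas max pooling keeps only the largest of the four values.

```python
def haar_downsample(img):
    """Apply one level of the 2D Haar transform to a grid with even height and width.

    Returns four half-resolution sub-bands: a scaled local average and three
    detail bands. The sub-band names are illustrative, not taken from the paper.
    """
    rows, cols = len(img), len(img[0])
    if rows % 2 or cols % 2:
        raise ValueError("height and width must both be even")
    approx, col_detail, row_detail, diag_detail = [], [], [], []
    for r in range(0, rows, 2):
        a_row, c_row, r_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            # The 2x2 block [[p, q], [s, t]] maps to four coefficients.
            p, q = img[r][c], img[r][c + 1]
            s, t = img[r + 1][c], img[r + 1][c + 1]
            a_row.append((p + q + s + t) / 2)  # scaled local average
            c_row.append((p - q + s - t) / 2)  # difference between the two columns
            r_row.append((p + q - s - t) / 2)  # difference between the two rows
            d_row.append((p - q - s + t) / 2)  # diagonal difference
        approx.append(a_row)
        col_detail.append(c_row)
        row_detail.append(r_row)
        diag_detail.append(d_row)
    return approx, col_detail, row_detail, diag_detail


# A single 2x2 block: max pooling would keep only the value 4.
bands = haar_downsample([[1, 2], [3, 4]])
print(bands)  # ([[5.0]], [[-1.0]], [[-2.0]], [[0.0]])
```

The squared coefficients sum to 30, the same as the squared inputs (1 + 4 + 9 + 16), which is one way to see that the transform only rearranges information rather than discarding it.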

2.
Sensors (Basel) ; 24(14)2024 Jul 19.
Article in English | MEDLINE | ID: mdl-39066079

ABSTRACT

Gearbox fault diagnosis is crucial for ensuring the safety of mechanical equipment and the stable operation of the whole system. However, existing diagnostic methods still have limitations, such as the analysis of single-scale features and insufficient recognition of global temporal dependencies. To address these issues, this article proposes a new method for gearbox fault diagnosis based on MSCNN-LSTM-CBAM-SE. The output of the CBAM-SE module is deeply integrated with the multi-scale features from MSCNN and the temporal features from LSTM, constructing a comprehensive feature representation that provides richer and more precise information for fault diagnosis. The effectiveness of this method has been validated on two gearbox datasets and through ablation studies on the model. Experimental results show that the proposed model achieves excellent performance in terms of accuracy and F1 score, among other metrics. Finally, a comparison with other relevant fault diagnosis methods further verifies the advantages of the proposed model. This research offers a new solution for accurate fault diagnosis of gearboxes.

3.
Math Biosci Eng ; 21(4): 5007-5031, 2024 Mar 01.
Article in English | MEDLINE | ID: mdl-38872524

ABSTRACT

In demanding application scenarios such as clinical psychotherapy and criminal interrogation, the accurate recognition of micro-expressions is of utmost importance but poses significant challenges. One of the main difficulties lies in effectively capturing weak and fleeting facial features and improving recognition performance. To address this issue, this paper proposes a novel architecture based on a multi-scale 3D residual convolutional neural network. The algorithm uses a deep 3D-ResNet50 as the backbone model and takes the micro-expression optical flow feature map as the network input. Drawing upon the complex spatial and temporal features inherent in micro-expressions, the network incorporates multi-scale convolutional modules of varying sizes to integrate both global and local information. Furthermore, an attention-based feature fusion module is introduced to enhance the model's contextual awareness. Finally, to optimize the model's predictions, a discriminative network structure with multiple output channels is constructed. The algorithm's performance was evaluated on the public datasets SMIC, SAMM, and CASME Ⅱ. The experimental results demonstrate that the proposed algorithm achieves recognition accuracies of 74.6%, 84.77%, and 91.35% on these datasets, respectively. Compared with existing mainstream methods for extracting subtle micro-expression features, the proposed algorithm substantially improves micro-expression recognition performance and accuracy. This paper can therefore serve as a reference for researchers working on high-precision micro-expression recognition.


Subjects
Algorithms , Facial Expression , Neural Networks, Computer , Humans , Imaging, Three-Dimensional/methods , Face , Databases, Factual , Pattern Recognition, Automated/methods , Image Processing, Computer-Assisted/methods
4.
Ultrasound Med Biol ; 2024 Jun 22.
Article in English | MEDLINE | ID: mdl-38910034

ABSTRACT

BACKGROUND: Ultrasound image examination has become the preferred choice for diagnosing metabolic dysfunction-associated steatotic liver disease (MASLD) due to its non-invasive nature. Computer-aided diagnosis (CAD) technology can assist doctors in avoiding deviations in the detection and classification of MASLD. METHOD: We propose a hybrid model that integrates the pre-trained VGG16 network with an attention mechanism and a stacking ensemble learning model. The model performs multi-scale feature aggregation based on the self-attention mechanism and multi-classifier fusion (logistic regression, random forest, support vector machine) based on stacking ensemble learning. The proposed hybrid method classifies ultrasound images into four classes: normal, mild, moderate, and severe fatty liver. RESULT AND CONCLUSION: Our proposed hybrid model reaches an accuracy of 91.34% and exhibits superior robustness against interference, outperforming traditional neural network algorithms. Experimental results show that, compared with the pre-trained VGG16 model, adding the self-attention mechanism improves the accuracy by 3.02%. Using the stacking ensemble learning model as a classifier further increases the accuracy to 91.34%, exceeding every single classifier: LR (89.86%), SVM (90.34%), and RF (90.73%). The proposed hybrid method can effectively improve the efficiency and accuracy of MASLD ultrasound image detection.

5.
Artif Intell Med ; 154: 102917, 2024 Jun 18.
Article in English | MEDLINE | ID: mdl-38917599

ABSTRACT

Early detection of pneumoconiosis by routine health screening of workers in the mining industry is critical for preventing the progression of this incurable disease. Automated pneumoconiosis classification in chest X-ray images is challenging due to the low contrast of opacities, inter-class similarity, intra-class variation and the existence of artifacts. Compared to traditional methods, convolutional neural networks have shown significant improvement in pneumoconiosis classification tasks; however, accurate classification remains challenging, mainly due to the inability to focus on semantically meaningful lesion opacities. Most existing networks focus on high level abstract information and ignore low level detailed object information. Unlike natural images, where an object occupies a large space, the classification of pneumoconiosis depends on the density of small opacities inside the lung. To address this issue, we propose a novel two-stage adaptive multi-scale feature pyramid network called AMFP-Net for the diagnosis of pneumoconiosis from chest X-rays. The proposed model consists of 1) an adaptive multi-scale context block to extract rich contextual and discriminative information and 2) a weighted feature fusion module to effectively combine low level detailed and high level global semantic information. This two-stage network first segments the lungs to focus more on relevant regions by excluding irrelevant parts of the image, and then utilises the segmented lungs to classify pneumoconiosis into different categories. Extensive experiments on public and private datasets demonstrate that the proposed approach can outperform state-of-the-art methods for both segmentation and classification.

6.
Comput Biol Med ; 178: 108699, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38870725

ABSTRACT

Accurate prediction of drug-target binding affinity (DTA) plays a pivotal role in drug discovery and repositioning. Although deep learning methods are widely used in DTA prediction, two significant challenges persist: (i) how to effectively represent the complex structural information of proteins and drugs; (ii) how to precisely model the mutual interactions between protein binding sites and key drug substructures. To address these challenges, we propose MSFFDTA (Multi-scale feature fusion for predicting drug target affinity), a model in which multi-scale encoders are designed to effectively capture multi-level structural information of drugs and proteins. A Selective Cross Attention (SCA) mechanism is then developed to filter out the trivial interactions between drug-protein substructure pairs and retain the important ones, allowing the proposed model to focus on these key interactions and offer insights into their underlying mechanisms. Experimental results on two benchmark datasets demonstrate that MSFFDTA is superior to several state-of-the-art methods across almost all comparison metrics. Finally, we provide ablation and case studies with visualizations to verify the effectiveness and the interpretability of MSFFDTA. The source code is freely available at https://github.com/whitehat32/MSFF-DTA/.


Subjects
Proteins , Proteins/chemistry , Proteins/metabolism , Drug Discovery/methods , Deep Learning , Pharmaceutical Preparations/metabolism , Pharmaceutical Preparations/chemistry , Humans , Protein Binding , Binding Sites , Computational Biology/methods
7.
J Xray Sci Technol ; 2024 Jun 28.
Article in English | MEDLINE | ID: mdl-38943423

ABSTRACT

BACKGROUND: Coronary artery segmentation is a prerequisite in computer-aided diagnosis of Coronary Artery Disease (CAD). However, segmentation of coronary arteries in Coronary Computed Tomography Angiography (CCTA) images faces several challenges. The current segmentation approaches are unable to effectively address these challenges and existing problems such as the need for manual interaction or low segmentation accuracy. OBJECTIVE: A Multi-scale Feature Learning and Rectification (MFLR) network is proposed to tackle the challenges and achieve automatic and accurate segmentation of coronary arteries. METHODS: The MFLR network introduces a multi-scale feature extraction module in the encoder to effectively capture contextual information under different receptive fields. In the decoder, a feature correction and fusion module is proposed, which employs high-level features containing multi-scale information to correct and guide low-level features, achieving fusion between the two-level features to further improve segmentation performance. RESULTS: The MFLR network achieved the best performance on the dice similarity coefficient, Jaccard index, Recall, F1-score, and 95% Hausdorff distance, for both in-house and public datasets. CONCLUSION: Experimental results demonstrate the superiority and good generalization ability of the MFLR approach. This study contributes to the accurate diagnosis and treatment of CAD, and it also informs other segmentation applications in medicine.
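
Two of the overlap metrics named in the results, the Dice similarity coefficient and the Jaccard index, can be computed directly from binary masks. The sketch below assumes flattened 0/1 masks and a hypothetical helper name `dice_and_jaccard`; it shows the standard definitions only and is not the evaluation code used in the study.

```python
def dice_and_jaccard(pred, truth):
    """Return (Dice, Jaccard) for two equally sized flat sequences of 0/1 labels."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of elements")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_size = sum(1 for p in pred if p)
    truth_size = sum(1 for t in truth if t)
    union = pred_size + truth_size - intersection
    if union == 0:
        # Both masks are empty; treating this as perfect agreement is a convention.
        return 1.0, 1.0
    dice = 2 * intersection / (pred_size + truth_size)
    jaccard = intersection / union
    return dice, jaccard


dice, jaccard = dice_and_jaccard([1, 1, 0, 0], [1, 0, 1, 0])
print(dice, jaccard)  # 0.5 0.3333333333333333
```

For a single pair of masks the two metrics are tied by Jaccard = Dice / (2 - Dice), so they always rank segmentations in the same order.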

8.
Sci Rep ; 14(1): 11678, 2024 05 22.
Article in English | MEDLINE | ID: mdl-38778219

ABSTRACT

Polyps are abnormal tissue clumps growing primarily on the inner linings of the gastrointestinal tract. While such clumps are generally harmless, they can potentially evolve into pathological tumors, and thus require long-term observation and monitoring. Polyp segmentation in gastrointestinal endoscopy images is an important stage for polyp monitoring and subsequent treatment. However, this segmentation task faces multiple challenges: the low contrast of the polyp boundaries, the varied polyp appearance, and the co-occurrence of multiple polyps. Therefore, in this paper, an implicit edge-guided cross-layer fusion network (IECFNet) is proposed for polyp segmentation. The encoder-decoder pair is used to generate an initial saliency map, the implicit edge-enhanced context attention module aggregates the feature maps output by the encoder and decoder to generate a rough prediction, and the multi-scale feature reasoning module is used to generate the final predictions. Polyp segmentation experiments have been conducted on five popular polyp image datasets (Kvasir, CVC-ClinicDB, ETIS, CVC-ColonDB, and CVC-300), and the experimental results show that the proposed method significantly outperforms a conventional method, especially with an accuracy margin of 7.9% on the ETIS dataset.


Subjects
Colonic Polyps , Humans , Colonic Polyps/pathology , Colonic Polyps/diagnostic imaging , Algorithms , Image Processing, Computer-Assisted/methods , Neural Networks, Computer , Image Interpretation, Computer-Assisted/methods , Polyps/pathology , Polyps/diagnostic imaging , Endoscopy, Gastrointestinal/methods
9.
Sensors (Basel) ; 24(10)2024 May 15.
Article in English | MEDLINE | ID: mdl-38794007

ABSTRACT

In recent years, deep learning methods have achieved remarkable success in hyperspectral image classification (HSIC), and the utilization of convolutional neural networks (CNNs) has proven to be highly effective. However, there are still several critical issues that need to be addressed in the HSIC task, such as the lack of labeled training samples, which constrains the classification accuracy and generalization ability of CNNs. To address this problem, a deep multi-scale attention fusion network (DMAF-NET) is proposed in this paper. This network is based on multi-scale features and fully exploits the deep features of samples from multiple levels and different perspectives with an aim to enhance HSIC results using limited samples. The innovation of this article is mainly reflected in three aspects: Firstly, a novel baseline network for multi-scale feature extraction is designed with a pyramid structure and densely connected 3D octave convolutional network enabling the extraction of deep-level information from features at different granularities. Secondly, a multi-scale spatial-spectral attention module and a pyramidal multi-scale channel attention module are designed, respectively. This allows modeling of the comprehensive dependencies of coordinates and directions, local and global, in four dimensions. Finally, a multi-attention fusion module is designed to effectively combine feature mappings extracted from multiple branches. Extensive experiments on four popular datasets demonstrate that the proposed method can achieve high classification accuracy even with fewer labeled samples.

10.
Entropy (Basel) ; 26(5)2024 May 20.
Article in English | MEDLINE | ID: mdl-38785680

ABSTRACT

Traditional methods for pest recognition have certain limitations in addressing the challenges posed by diverse pest species, varying sizes, diverse morphologies, and complex field backgrounds, resulting in a lower recognition accuracy. To overcome these limitations, this paper proposes a novel pest recognition method based on attention mechanism and multi-scale feature fusion (AM-MSFF). By combining the advantages of attention mechanism and multi-scale feature fusion, this method significantly improves the accuracy of pest recognition. Firstly, we introduce the relation-aware global attention (RGA) module to adaptively adjust the feature weights of each position, thereby focusing more on the regions relevant to pests and reducing the background interference. Then, we propose the multi-scale feature fusion (MSFF) module to fuse feature maps from different scales, which better captures the subtle differences and the overall shape features in pest images. Moreover, we introduce generalized-mean pooling (GeMP) to more accurately extract feature information from pest images and better distinguish different pest categories. In terms of the loss function, this study proposes an improved focal loss (FL), known as balanced focal loss (BFL), as a replacement for cross-entropy loss. This improvement aims to address the common issue of class imbalance in pest datasets, thereby enhancing the recognition accuracy of pest identification models. To evaluate the performance of the AM-MSFF model, we conduct experiments on two publicly available pest datasets (IP102 and D0). Extensive experiments demonstrate that our proposed AM-MSFF outperforms most state-of-the-art methods. On the IP102 dataset, the accuracy reaches 72.64%, while on the D0 dataset, it reaches 99.05%.
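
Two of the building blocks named above have compact standard definitions, sketched here in plain Python. Generalized-mean pooling computes (mean of x^p)^(1/p), which interpolates between average pooling (p = 1) and max pooling (large p). The loss shown is the standard focal loss, -alpha * (1 - p)^gamma * log(p); the abstract does not specify how the balanced focal loss (BFL) modifies it, so the `alpha` weight and the default values of `p`, `gamma`, and `eps` are assumptions for illustration only.

```python
import math


def gem_pool(values, p=3.0, eps=1e-6):
    """Generalized-mean pooling over non-negative activations."""
    clamped = [max(v, eps) for v in values]  # avoid zero or negative bases
    return (sum(v ** p for v in clamped) / len(clamped)) ** (1.0 / p)


def focal_loss(prob_true_class, gamma=2.0, alpha=1.0):
    """Standard focal loss for one sample, given the predicted probability of its true class."""
    prob = min(max(prob_true_class, 1e-12), 1.0)  # keep log() finite
    return -alpha * (1.0 - prob) ** gamma * math.log(prob)


activations = [1.0, 2.0, 3.0]
print(gem_pool(activations, p=1.0))    # equals the arithmetic mean of the activations
print(gem_pool(activations, p=100.0))  # approaches the maximum activation
print(focal_loss(0.9), focal_loss(0.1))  # the hard sample dominates the loss
```

With gamma = 0 and alpha = 1 the focal loss reduces to ordinary cross-entropy, which makes the down-weighting of well-classified samples easy to see by comparison.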

11.
Comput Methods Programs Biomed ; 251: 108218, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38728828

ABSTRACT

BACKGROUND: Virtual reality motion sickness (VRMS) is a key issue hindering the development of virtual reality technology, and accurate detection of its occurrence is the first prerequisite for solving the issue. OBJECTIVE: In this paper, a convolutional neural network (CNN) EEG detection model based on multi-scale feature correlation is proposed for detecting VRMS. METHODS: The model uses multi-scale 1D convolutional layers to extract multi-scale temporal features from the multi-lead EEG data, and then calculates the correlations of the extracted multi-scale features among all the leads to form feature adjacency matrices. This converts the time-domain features into correlation-based brain network features, thereby strengthening the feature representation. The correlation features of each layer are then fused, and the fused features are fed into the channel attention module to filter the channels before classification with a fully connected network. To verify the model, we recruited subjects to experience 6 different modes of virtual roller coaster scenes and collected resting EEG data before and after the task. RESULTS: The results show that the accuracy, precision, recall and F1-score of this model for the recognition of VRMS are 98.66%, 98.65%, 98.68%, and 98.66%, respectively. The proposed model outperforms the current classic and advanced EEG recognition models. SIGNIFICANCE: These results show that the model can be used for the recognition of VRMS based on resting-state EEG.
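
The step that turns per-lead features into a correlation-based adjacency matrix can be sketched as follows. The abstract does not say which correlation measure is used, so Pearson correlation is assumed here, and the function names `pearson` and `correlation_adjacency` are hypothetical. Each row of `features` stands for the feature vector extracted from one EEG lead.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def correlation_adjacency(features):
    """Build a symmetric lead-by-lead matrix of pairwise feature correlations."""
    n = len(features)
    return [[pearson(features[i], features[j]) for j in range(n)] for i in range(n)]


# Three toy leads: lead 1 rises with lead 0, lead 2 falls as lead 0 rises.
adjacency = correlation_adjacency([[1, 2, 3], [2, 4, 6], [3, 2, 1]])
for row in adjacency:
    print([round(value, 2) for value in row])
```

The resulting matrix has ones on the diagonal and is symmetric, so it can be read as a weighted graph over the leads.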


Subjects
Electroencephalography , Motion Sickness , Neural Networks, Computer , Virtual Reality , Humans , Electroencephalography/methods , Motion Sickness/physiopathology , Algorithms , Male , Adult , Female
12.
Heliyon ; 10(10): e31228, 2024 May 30.
Article in English | MEDLINE | ID: mdl-38803883

ABSTRACT

Diabetic foot ulcer (DFU) poses a significant threat to individuals affected by diabetes, often leading to limb amputation. Early detection of DFU can greatly improve the chances of survival for diabetic patients. This work introduces FusionNet, a novel multi-scale feature fusion network designed to accurately differentiate DFU skin from healthy skin using multiple pre-trained convolutional neural network (CNN) algorithms. A dataset comprising 6963 skin images (3574 healthy and 3389 ulcer) from various patients was divided into training (6080 images), validation (672 images), and testing (211 images) sets. Initially, three image preprocessing techniques - Gaussian filter, median filter, and motion blur estimation - were applied to eliminate irrelevant, noisy, and blurry data. Subsequently, three pre-trained CNN algorithms - DenseNet201, VGG19, and NASNetMobile - were utilized to extract high-frequency features from the input images. These features were then fed into a meta-tuner module to predict DFU by selecting the most discriminative features. Statistical tests, including Friedman and analysis of variance (ANOVA), were employed to identify significant differences between FusionNet and other sub-networks. Finally, three eXplainable Artificial Intelligence (XAI) algorithms - SHAP (SHapley Additive exPlanations), LIME (Local Interpretable Model-agnostic Explanations), and Grad-CAM (Gradient-weighted Class Activation Mapping) - were integrated into FusionNet to enhance transparency and explainability. The FusionNet classifier achieved exceptional classification results with 99.05% accuracy, 98.18% recall, 100.00% precision, 99.09% AUC, and 99.08% F1 score. We believe that our proposed FusionNet will be a valuable tool in the medical field to distinguish DFU from healthy skin.
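
Of the preprocessing steps listed, the median filter is the simplest to show. The following is a minimal sketch of a 3x3 median filter on a grayscale grid, assuming that border pixels are left unchanged (one of several possible border policies); the function name `median_filter_3x3` is hypothetical and the window size used in the study is not stated in the abstract.

```python
from statistics import median


def median_filter_3x3(img):
    """Replace each interior pixel by the median of its 3x3 neighbourhood."""
    rows, cols = len(img), len(img[0])
    out = [row[:] for row in img]  # copy, so border pixels keep their values
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            window = [img[r + dr][c + dc] for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            out[r][c] = median(window)
    return out


# A single bright outlier in a flat region is removed.
noisy = [[1, 1, 1], [1, 99, 1], [1, 1, 1]]
print(median_filter_3x3(noisy))  # [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
```

Unlike a mean or Gaussian filter, the median ignores the outlier entirely instead of spreading it over neighbouring pixels, which is why it is a common choice for impulse noise.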

13.
PeerJ Comput Sci ; 10: e1959, 2024.
Article in English | MEDLINE | ID: mdl-38660160

ABSTRACT

With the development of generative models, the cost of facial manipulation and forgery is becoming lower and lower. Fraudulent data has brought numerous hidden threats in politics, privacy, and cybersecurity. Although many face forgery detection methods focus on learning high-frequency forgery traces and achieve promising performance, these methods usually learn features in the spatial and frequency domains independently. In order to combine the information of the two domains, a combined spatial and frequency dual-stream network is proposed for face forgery detection. Concretely, a cross self-attention (CSA) module is designed to improve frequency feature interaction and fusion at different scales. Moreover, to augment the semantic and contextual information, a frequency-guided spatial feature extraction module is proposed to extract and reconstruct the spatial information. These two modules deeply mine the forgery traces via a dual-stream collaborative network. Through comprehensive experiments on different datasets, we demonstrate the effectiveness of the proposed method in both within-dataset and cross-dataset settings.

14.
Front Plant Sci ; 15: 1382802, 2024.
Article in English | MEDLINE | ID: mdl-38654901

ABSTRACT

When detecting tomato leaf diseases in natural environments, factors such as changes in lighting, occlusion, and the small size of leaf lesions pose challenges to detection accuracy. Therefore, this study proposes a tomato leaf disease detection method based on attention mechanisms and multi-scale feature fusion. Firstly, the Convolutional Block Attention Module (CBAM) is introduced into the backbone feature extraction network to enhance the ability to extract lesion features and suppress the effects of environmental interference. Secondly, shallow feature maps are introduced into the re-parameterized generalized feature pyramid network (RepGFPN), constructing a new multi-scale re-parameterized generalized feature fusion module (BiRepGFPN) to enhance feature fusion expression and improve the localization ability for small lesion features. Finally, the BiRepGFPN replaces the Path Aggregation Feature Pyramid Network (PAFPN) in the YOLOv6 model to achieve effective fusion of deep semantic and shallow spatial information. Experimental results indicate that, when evaluated on the publicly available PlantDoc dataset, the model's mean average precision (mAP) showed improvements of 7.7%, 11.8%, 3.4%, 5.7%, 4.3%, and 2.6% compared to YOLOX, YOLOv5, YOLOv6, YOLOv6-s, YOLOv7, and YOLOv8, respectively. When evaluated on the tomato leaf disease dataset, the model demonstrated a precision of 92.9%, a recall rate of 95.2%, an F1 score of 94.0%, and a mean average precision (mAP) of 93.8%, showing improvements of 2.3%, 4.0%, 3.1%, and 2.7% respectively compared to the baseline model. These results indicate that the proposed detection method possesses significant detection performance and generalization capabilities.
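
The F1 score reported for the tomato leaf disease dataset can be cross-checked from the reported precision and recall, because F1 is their harmonic mean. The short sketch below uses only the two figures quoted in the abstract; the function name `f1_score` is a local helper, not a library call.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision 92.9% and recall 95.2%, as reported in the abstract.
f1 = f1_score(0.929, 0.952)
print(f"{100 * f1:.1f}%")  # 94.0%
```

The result agrees with the 94.0% F1 score stated in the abstract.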

15.
Foods ; 13(6)2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38540915

ABSTRACT

As a traditional delicacy in China, preserved eggs inevitably experience instances of substandard quality during the production process. Chinese preserved egg production facilities can only rely on experienced workers to select the preserved eggs. However, the manual selection of preserved eggs presents challenges such as a low efficiency, subjective judgments, high costs, and hindered industrial production processes. In response to these challenges, this study procured the transmitted imagery of preserved eggs and refined the ConvNeXt network across four pivotal dimensions: the dimensionality reduction of model feature maps, the integration of multi-scale feature fusion (MSFF), the incorporation of a global attention mechanism (GAM) module, and the amalgamation of the cross-entropy loss function with focal loss. The resultant refined model, ConvNeXt_PEgg, attained proficiency in classifying and grading preserved eggs. Notably, the improved model achieved a classification accuracy of 92.6% across the five categories of preserved eggs, with a grading accuracy of 95.9% spanning three levels. Moreover, in contrast to its predecessor, the refined model witnessed a 24.5% reduction in the parameter volume, alongside a 3.2 percentage point augmentation in the classification accuracy and a 2.8 percentage point boost in the grading accuracy. Through meticulous comparative analysis, each enhancement exhibited varying degrees of performance elevation. Evidently, the refined model outshone a plethora of classical models, underscoring its efficacy in discerning the internal quality of preserved eggs. With its potential for real-world implementation, this technology portends to heighten the economic viability of manufacturing facilities.

16.
Sensors (Basel) ; 24(5)2024 Mar 04.
Article in English | MEDLINE | ID: mdl-38475200

ABSTRACT

Casting defects in turbine blades can significantly reduce an aero-engine's service life and cause secondary damage to the blades when exposed to harsh environments. Therefore, casting defect detection plays a crucial role in enhancing aircraft performance. Existing defect detection methods face challenges in effectively detecting multi-scale defects and handling imbalanced datasets, leading to unsatisfactory defect detection results. In this work, a novel blade defect detection method is proposed. This method is based on a detection transformer with a multi-scale fusion attention mechanism, considering comprehensive features. Firstly, a novel joint data augmentation (JDA) method is constructed to alleviate the imbalanced dataset issue by effectively increasing the number of samples. Then, an attention-based channel-adaptive weighting (ACAW) feature enhancement module is established to fully exploit complementary information among different feature channels and further refine feature representations. Subsequently, a multi-scale feature fusion (MFF) module is proposed to integrate high-dimensional semantic information and low-level representation features, enhancing multi-scale defect detection precision. Moreover, R-Focal loss is developed in an MFF attention-based DEtection TRansformer (DETR) to further address the issue of imbalanced datasets and accelerate model convergence using a random hyper-parameter search strategy. An aero-engine turbine blade defect X-ray (ATBDX) image dataset is used to validate the proposed method. The comparative results demonstrate that the proposed method can effectively integrate multi-scale image features and enhance multi-scale defect detection precision.

17.
Comput Med Imaging Graph ; 113: 102354, 2024 04.
Article in English | MEDLINE | ID: mdl-38341946

ABSTRACT

Lung granuloma is a very common lung disease, and its specific diagnosis is important for determining the exact cause of the disease as well as the prognosis of the patient. An effective lung granuloma detection model based on computer-aided diagnostics (CAD) can help pathologists to localize granulomas, thereby improving the efficiency of the specific diagnosis. However, for CAD-based lung granuloma detection models, two critical challenges remain: the significant size differences between granulomas, and how to better utilize their morphological features. In this paper, we propose an automatic method, CRDet, to localize granulomas in histopathological images and deal with these challenges. We first introduce a multi-scale feature extraction network with self-attention to extract features at different scales simultaneously. Then, the features are converted to circle representations of granulomas by circle representation detection heads to align the features with the ground truth. In this way, we can also use the circular morphological features of granulomas more effectively. Finally, we propose a center point calibration method at the inference stage to further optimize the circle representation. For model evaluation, we built a lung granuloma circle representation dataset named LGCR, including 288 images from 50 subjects. Our method yielded 0.316 mAP and 0.571 mAR, outperforming the state-of-the-art object detection methods on our proposed LGCR.
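
When detections are circles rather than boxes, overlap between a predicted and a ground-truth circle can be measured with a circle-based intersection over union. The sketch below uses the standard closed-form area of the lens formed by two intersecting circles. The abstract does not describe how CRDet matches or scores circles, so this is general geometry under the assumption that each circle is an `(x, y, radius)` tuple with a positive radius; the function name `circle_iou` is hypothetical.

```python
import math


def circle_iou(c1, c2):
    """Intersection over union of two circles given as (x, y, radius) tuples."""
    x1, y1, r1 = c1
    x2, y2, r2 = c2
    d = math.hypot(x2 - x1, y2 - y1)  # distance between centres
    if d >= r1 + r2:
        inter = 0.0  # circles do not overlap
    elif d <= abs(r1 - r2):
        inter = math.pi * min(r1, r2) ** 2  # one circle lies inside the other
    else:
        # Lens area: two circular sectors minus the kite spanned by the centres
        # and the two intersection points.
        cos1 = (d * d + r1 * r1 - r2 * r2) / (2 * d * r1)
        cos2 = (d * d + r2 * r2 - r1 * r1) / (2 * d * r2)
        sector1 = r1 * r1 * math.acos(max(-1.0, min(1.0, cos1)))
        sector2 = r2 * r2 * math.acos(max(-1.0, min(1.0, cos2)))
        kite = 0.5 * math.sqrt(
            max(0.0, (-d + r1 + r2) * (d + r1 - r2) * (d - r1 + r2) * (d + r1 + r2))
        )
        inter = sector1 + sector2 - kite
    union = math.pi * (r1 * r1 + r2 * r2) - inter
    return inter / union


print(circle_iou((0, 0, 1), (0, 0, 2)))  # concentric circles: area ratio of 1 to 4
print(circle_iou((0, 0, 1), (3, 0, 1)))  # disjoint circles: 0.0
```

A circle needs only three parameters and its IoU is invariant to rotation, which is one commonly cited reason for preferring circles over boxes for roughly round objects.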


Subjects
Granuloma , Lung , Humans , Calibration , Granuloma/diagnostic imaging , Granuloma/pathology , Lung/diagnostic imaging , Lung/pathology
18.
Math Biosci Eng ; 21(1): 49-74, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38303413

ABSTRACT

Retinal vessel segmentation is very important for diagnosing and treating certain eye diseases. Recently, many deep learning-based retinal vessel segmentation methods have been proposed; however, there are still many shortcomings (e.g., they cannot obtain satisfactory results when dealing with cross-domain data or segmenting small blood vessels). To alleviate these problems and avoid overly complex models, we propose a novel network based on a multi-scale feature and style transfer (MSFST-NET) for retinal vessel segmentation. Specifically, we first construct a lightweight segmentation module named MSF-Net, which introduces the selective kernel (SK) module to increase the multi-scale feature extraction ability of the model and thereby improve small blood vessel segmentation. Then, to alleviate the problem of model performance degradation when segmenting cross-domain datasets, we propose a style transfer module and a pseudo-label learning strategy. The style transfer module is used to reduce the style difference between the source domain image and the target domain image to improve the segmentation performance for the target domain image. The pseudo-label learning strategy is designed to be combined with the style transfer module to further boost the generalization ability of the model. Moreover, we trained and tested our proposed MSFST-NET in experiments on the DRIVE and CHASE_DB1 datasets. The experimental results demonstrate that MSFST-NET can effectively improve the generalization ability of the model on cross-domain datasets and achieve better retinal vessel segmentation results than other state-of-the-art methods.


Subjects
Image Processing, Computer-Assisted , Retinal Vessels , Retinal Vessels/diagnostic imaging , Algorithms
19.
Sensors (Basel) ; 24(3)2024 Feb 04.
Article in English | MEDLINE | ID: mdl-38339726

ABSTRACT

Precise building extraction from high-resolution remote sensing images has significant applications in urban planning, resource management, and environmental conservation. In recent years, deep neural networks (DNNs) have garnered substantial attention for their adeptness in learning and extracting features, becoming integral to building extraction methodologies and yielding noteworthy performance outcomes. Nonetheless, prevailing DNN-based models for building extraction often overlook spatial information during the feature extraction phase. Additionally, many existing models employ a simplistic and direct approach in the feature fusion stage, potentially leading to spurious target detection and the amplification of internal noise. To address these concerns, we present a multi-scale attention network (MSANet) tailored for building extraction from high-resolution remote sensing images. In our approach, we first extract multi-scale building feature information using a multi-scale channel attention mechanism and a multi-scale spatial attention mechanism. Subsequently, we apply adaptive hierarchical weighting to the extracted building features. Concurrently, we introduce a gating mechanism to facilitate the effective fusion of multi-scale features. The efficacy of the proposed MSANet was evaluated using the WHU aerial image dataset and the WHU satellite image dataset. The experimental results demonstrate compelling performance metrics, with the F1 scores registering at 93.76% and 77.64% on the WHU aerial imagery dataset and WHU satellite dataset II, respectively. Furthermore, the intersection over union (IoU) values stood at 88.25% and 63.46%, surpassing benchmarks set by DeepLabV3 and GSMC.
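
A gating mechanism for feature fusion is commonly written as an element-wise blend, fused = g * a + (1 - g) * b, where the gate g = sigmoid(z) lies between 0 and 1. The sketch below shows that generic form only; the abstract does not specify MSANet's gate, so the function names and the idea that the gate logits arrive as a separate input are assumptions. In a real network the logits would be produced by learned layers.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def gated_fusion(feat_a, feat_b, gate_logits):
    """Blend two equal-length feature vectors element by element using a sigmoid gate."""
    if not (len(feat_a) == len(feat_b) == len(gate_logits)):
        raise ValueError("all inputs must have the same length")
    fused = []
    for a, b, z in zip(feat_a, feat_b, gate_logits):
        g = sigmoid(z)  # g near 1 favours feat_a, g near 0 favours feat_b
        fused.append(g * a + (1.0 - g) * b)
    return fused


# Zero logits give an even blend of the two inputs.
print(gated_fusion([2.0, 4.0], [0.0, 0.0], [0.0, 0.0]))  # [1.0, 2.0]
```

Compared with plain addition or concatenation, a gate lets the network suppress a noisy scale at one position while keeping it at another.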

20.
Heliyon ; 10(4): e26182, 2024 Feb 29.
Article in English | MEDLINE | ID: mdl-38420439

ABSTRACT

Traffic sign recognition is an important part of intelligent transportation systems. It uses computer vision techniques to detect and recognize traffic signs on the road automatically. In this paper, we propose a lightweight model for traffic sign recognition based on convolutional neural networks called ConvNeSe. Firstly, the feature extraction module of the model is constructed using the Depthwise Separable Convolution and Inverted Residuals structures. The model extracts multi-scale features with strong representation ability by optimizing the structure of the convolutional neural network and fusing features. Then, the model introduces the Squeeze and Excitation Block (SE Block) to improve attention to important features, which helps capture key information in traffic sign images. Finally, the accuracy of the model on the German Traffic Sign Recognition Benchmark Database (GTSRB) is 99.85%. At the same time, the model shows good robustness according to the results of ablation experiments.
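
Depthwise separable convolution is a common way to make a model lightweight because it factors a standard convolution into a per-channel spatial filter followed by a 1x1 channel mixer. The sketch below compares weight counts for the two layer types, ignoring bias terms. The kernel size and channel counts are hypothetical values chosen for illustration and are not taken from ConvNeSe.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard kernel x kernel convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise kernel x kernel stage plus a 1x1 pointwise stage (bias ignored)."""
    depthwise = kernel * kernel * c_in  # one spatial filter per input channel
    pointwise = c_in * c_out            # 1x1 convolution mixing channels
    return depthwise + pointwise


# Hypothetical layer: 3x3 kernel, 32 input channels, 64 output channels.
standard = standard_conv_params(3, 32, 64)
separable = depthwise_separable_params(3, 32, 64)
print(standard, separable)  # 18432 2336
```

The ratio of the two counts is 1/c_out + 1/kernel^2, so for a 3x3 kernel the separable layer needs roughly one eighth to one ninth of the weights once the output channel count is large.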
