ABSTRACT
Objective To realize automatic recognition of the slicing angles of Fritillariae Thunbergii Bulbus (FTB) based on an improved YOLOv7-tiny algorithm. Method First, a diverse dataset of 16 000 FTB images captured at various angles was constructed. YOLOv7-tiny was then improved by replacing standard convolutions with ghost convolution (GhostConv), incorporating the coordinate attention (CA) mechanism, and substituting some activation functions with the HardSwish function to decrease floating-point operations. In addition, a penalty term for angle recognition error was integrated into the loss function, and the non-maximum suppression (NMS) strategy was modified to handle cases where multiple detection results were associated with the same target. To verify the effectiveness of each improvement, ablation experiments were carried out on all improvement points: prediction results were compared before and after adding each improvement to the original model, or to a model already containing improvements verified to be effective, and the change in the evaluation indexes was assessed. Result The improved slicing angle recognition algorithm required about 55.4% of the parameters and about 59.4% of the computation of the original algorithm. The mAP@0.5 [mean average precision at an intersection over union (IoU) threshold of 0.5] increased by 12.2%, and the mean absolute error (MAE) of the recognized angle was 5.02°, a reduction of 4.58° compared with the original algorithm. In the experimental environment of this paper, the average recognition time per image was as low as 8.7 ms, significantly faster than the average human reaction time.
Conclusion By using the improved YOLOv7-tiny algorithm, this study achieves accurate and more lightweight slicing angle recognition of FTB, providing a novel approach for stable and precise automated slicing of FTB and valuable insights into the automation of processing other traditional Chinese medicines.
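Two of the ingredients above have compact definitions that can be sketched directly: the HardSwish activation, and an angle-error penalty. The penalty form below (a wrapped mean absolute angular error with an illustrative weight) is an assumption for illustration only, not the authors' exact loss term.

```python
import numpy as np

def hard_swish(x: np.ndarray) -> np.ndarray:
    """HardSwish: x * ReLU6(x + 3) / 6, a cheap piecewise approximation of Swish."""
    return x * np.clip(x + 3.0, 0.0, 6.0) / 6.0

def angle_penalty(pred_deg: np.ndarray, true_deg: np.ndarray, weight: float = 1.0) -> float:
    """Illustrative (hypothetical) angle-error penalty: weighted mean absolute
    angular error, wrapped to [0, 90] so that e.g. 179 deg vs 1 deg counts as 2 deg."""
    diff = np.abs(pred_deg - true_deg) % 180.0
    diff = np.minimum(diff, 180.0 - diff)  # shortest angular distance
    return weight * float(diff.mean())
```

HardSwish avoids the exponential in Swish, which is where the floating-point-operation savings mentioned in the abstract come from.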
ABSTRACT
Objective To improve the accuracy of automatic identification of herbal slice images with complex backgrounds. Methods Experiments were conducted on a collected and organized dataset of Tibetan herbal slice images. The RGB, HOG, and LBP features of the slices were analyzed; an improved HOG algorithm was used to fuse multiple features, and a deep learning network was utilized for image recognition. Results The proposed method of multi-feature fusion combined with deep learning achieved an identification accuracy of 91.68% on 3610 Tibetan herbal slice images with complex backgrounds. Furthermore, the average identification accuracy for 20 common traditional Chinese medicine slices, such as Chuan Beimu, hawthorn, and Pinellia, reached 98.00%. The method outperformed existing methods for identifying herbal slices in complex backgrounds, indicating its feasibility and wide applicability to the identification of other traditional Chinese herbal medicines. Conclusion The fusion of multiple features effectively captures the distinguishing characteristics of herbal slices in complex backgrounds. The method exhibits high recognition rates for Tibetan herbal slices with complex and heavily occluded backgrounds and can be applied to the recognition of traditional Chinese herbal medicines and herbal slices in natural scenes, showing promising prospects for practical application.
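As a concrete illustration of one of the fused descriptors, a minimal local binary pattern (LBP) over 3x3 neighborhoods can be computed as below. This is the textbook 8-neighbor variant, shown as an assumption about the general technique rather than the paper's exact configuration.

```python
import numpy as np

def lbp_8neighbors(img: np.ndarray) -> np.ndarray:
    """Basic 3x3 local binary pattern: each interior pixel gets an 8-bit code
    from threshold comparisons against its eight neighbors."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2), dtype=int)
    center = img[1:-1, 1:-1]
    # neighbor offsets, clockwise from top-left; each contributes one bit
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neigh = img[1 + dy : h - 1 + dy, 1 + dx : w - 1 + dx]
        out |= (neigh >= center).astype(int) << bit
    return out
```

A histogram of these codes over a slice image gives a texture descriptor that is robust to monotonic illumination changes, which is why LBP pairs well with HOG and raw RGB statistics.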
ABSTRACT
Purpose/Significance Automatic generation of medical imaging reports is important for reducing the workload of radiologists and promoting the standardization of clinical workflows. Method/Process Focusing on chest report generation models with open-source code published in recent years, this paper develops an automatic medical image report generation method based on the CDGPT2 model. Result/Conclusion The advantages of the model in report generation remain to be explored: the quality of reports generated after modifying the decoder inputs of the model is not high. Future research could improve the performance of the model by using larger datasets and incorporating more clinical information.
ABSTRACT
A U-Net incorporating an improved Transformer and a convolutional channel attention module is designed for biventricular segmentation in MRI images. By replacing the high-level convolutions of U-Net with the improved Transformer, global feature information can be effectively extracted to cope with the poor segmentation performance caused by the complex morphological variation of the right ventricle. The improved Transformer incorporates fixed-window attention for position localization in the self-attention module and aggregates the output feature map to reduce its size; network learning capability is further improved by increasing network depth through adjustment of the multilayer perceptron. To address the unsatisfactory segmentation caused by blurred tissue edges, a feature aggregation module fuses multi-level low-level features, and a convolutional channel attention module rescales these features to achieve adaptive learning of feature weights. In addition, a plug-and-play feature enhancement module is integrated to counteract the feature loss caused by channel reduction in the encoder-decoder structure, preserving spatial information while increasing the proportion of useful channel information. Tests on the ACDC dataset show that the proposed method achieves higher biventricular segmentation accuracy, especially for the right ventricle. Compared with other methods, the proposed method improves the Dice similarity coefficient by at least 2.83%, proving its effectiveness in biventricular segmentation.
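The channel attention idea described above can be sketched as a squeeze-and-excitation-style gate: global average pooling per channel, a small bottleneck, and sigmoid gates that rescale the channels. The weight shapes and reduction ratio here are illustrative assumptions, not the paper's exact module.

```python
import numpy as np

def channel_attention(feat: np.ndarray, w1: np.ndarray, w2: np.ndarray) -> np.ndarray:
    """Squeeze-and-excitation-style channel attention on a (C, H, W) feature map.
    w1: (C, C//r) and w2: (C//r, C) stand in for learned bottleneck weights."""
    c = feat.shape[0]
    squeezed = feat.reshape(c, -1).mean(axis=1)       # (C,) global average pool
    hidden = np.maximum(squeezed @ w1, 0.0)           # ReLU bottleneck
    gates = 1.0 / (1.0 + np.exp(-(hidden @ w2)))      # sigmoid gates in (0, 1)
    return feat * gates[:, None, None]                # rescale each channel
```

Because the gates depend on the pooled content of the feature map itself, the network can learn to amplify channels that respond to ventricle boundaries and suppress the rest.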
ABSTRACT
AIM: To explore the use of an attention mechanism and the Pix2Pix generative adversarial network to predict the postoperative corneal topography of age-related cataract patients who underwent femtosecond laser arcuate keratotomy. METHODS: In this retrospective case series, 210 preoperative and postoperative corneal topographies from 87 age-related cataract patients (105 eyes) who underwent femtosecond laser arcuate keratotomy at Shanxi Eye Hospital between March 2018 and March 2020 were selected and divided into a training set (180) and a test set (30) for model training and testing. The peak signal-to-noise ratio (PSNR), structural similarity (SSIM), and Alpins astigmatism vector analysis were used to compare the accuracy of postoperative corneal topography prediction under different attention mechanisms. RESULTS: Models based on an attention mechanism and the Pix2Pix network could predict postoperative corneal topography, and the model based on the Self-Attention mechanism had the best prediction performance, with PSNR and SSIM reaching 16.048 and 0.7661, respectively. There were no statistically significant differences in the difference vector, difference vector axis, surgically induced astigmatism, or correction index between real and generated corneal topographies on the 3 mm and 5 mm rings (all P>0.05). CONCLUSION: Based on the Self-Attention mechanism and the Pix2Pix network, postoperative corneal topography can be well predicted, providing a reference for surgical planning and assessment of postoperative outcomes by ophthalmic clinicians.
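Of the reported image-quality metrics, PSNR has a particularly simple definition; a minimal sketch, assuming 8-bit images with a peak value of 255:

```python
import numpy as np

def psnr(img_a: np.ndarray, img_b: np.ndarray, max_val: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two same-sized images."""
    mse = np.mean((img_a.astype(np.float64) - img_b.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```

Higher is better; a PSNR of 16 dB, as reported above, corresponds to a fairly large pixel-wise error, which is why the structural SSIM metric and the clinically oriented Alpins analysis are reported alongside it.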
ABSTRACT
OBJECTIVE To investigate the consistency and diagnostic performance of magnetic resonance imaging (MRI) for detecting microvascular invasion (MVI) of hepatocellular carcinoma (HCC), and the validity of deep learning attention mechanisms combined with clinical features for MVI grade prediction. METHODS This retrospective study was conducted among 158 patients with HCC treated in Shunde Hospital Affiliated to Southern Medical University between January 2017 and February 2020. The imaging and clinical data of the patients were collected to establish single-sequence deep learning models and fusion models based on EfficientNetB0 and attention modules. The imaging data included conventional MRI sequences (T1WI, T2WI, and DWI), enhanced MRI sequences (AP, PP, EP, and HBP), and synthesized MRI sequences (T1mapping-pre and T1mapping-20min), and the high-risk areas of MVI were visualized using deep learning visualization techniques. RESULTS The fusion model based on the T1mapping-20min sequence and clinical features outperformed the other fusion models, with an accuracy of 0.8376, a sensitivity of 0.8378, a specificity of 0.8702, and an AUC of 0.8501 for detecting MVI. The deep fusion models were also capable of displaying the high-risk areas of MVI. CONCLUSION Fusion models based on multiple MRI sequences can effectively detect MVI in patients with HCC, demonstrating the validity of deep learning algorithms that combine attention mechanisms and clinical features for MVI grade prediction.
Subject(s)
Humans, Hepatocellular Carcinoma, Retrospective Studies, Liver Neoplasms, Magnetic Resonance Imaging, Algorithms
ABSTRACT
Polysomnography (PSG) monitoring is an important method for the clinical diagnosis of diseases such as insomnia and apnea. To solve the time- and labor-consuming problem of staging the sleep of patients with sleep disorders by manual frame-by-frame visual inspection of PSG, this study proposed a deep learning model combining convolutional neural networks (CNN) and bidirectional gated recurrent neural networks (BiGRU). A dynamic sparse self-attention mechanism was designed to address the difficulty of gated recurrent units (GRU) in obtaining accurate vector representations of long-distance information. This study collected 143 overnight PSG recordings of patients with sleep disorders from Shanghai Mental Health Center, combined them with 153 overnight PSG recordings of patients from an open-source dataset, and selected 9 electrophysiological channels, including 6 electroencephalogram (EEG) channels, 2 electrooculogram (EOG) channels, and a single mandibular electromyogram (EMG) channel. These data were used for model training, testing, and evaluation. After cross-validation, the accuracy was (84.0±2.0)% and Cohen's kappa was 0.77±0.50, outperforming the physician scoring kappa of 0.75±0.11. The experimental results show that the proposed model achieves good staging performance across different populations and is widely applicable, which is of great significance for assisting clinicians in rapid, large-scale automatic PSG sleep staging.
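Cohen's kappa, used above to compare the model against physician scoring, corrects raw agreement for the agreement expected by chance; a minimal sketch:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa between two equal-length label sequences:
    (observed agreement - chance agreement) / (1 - chance agreement)."""
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # chance agreement from each rater's marginal label frequencies
    expected = sum(counts_a[l] * counts_b[l] for l in set(rater_a) | set(rater_b)) / (n * n)
    if expected == 1.0:
        return 1.0  # degenerate: both raters always give the same single label
    return (observed - expected) / (1.0 - expected)
```

A kappa of 0 means chance-level agreement and 1 means perfect agreement, so values around 0.75-0.77, as reported above, indicate substantial agreement with the reference scoring.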
Subject(s)
Humans, Polysomnography, China, Sleep Stages, Sleep, Algorithms
ABSTRACT
Accurate segmentation of whole slide images is of great significance for the diagnosis of pancreatic cancer. However, developing an automatic model is challenging due to the complex content, limited samples, and high sample heterogeneity of pathological images. This paper presents a multi-tissue segmentation model for whole slide images of pancreatic cancer. We introduced an attention mechanism into the building blocks and designed a multi-task learning framework with appropriate auxiliary tasks to enhance model performance. The model was trained and tested on the pancreatic cancer pathological image dataset from Shanghai Changhai Hospital, and TCGA data served as an external independent validation cohort. The F1 scores of the model exceeded 0.97 and 0.92 on the internal and external datasets, respectively, and the generalization performance was significantly better than that of the baseline method. These results demonstrate that the proposed model can accurately segment eight kinds of tissue regions in whole slide images of pancreatic cancer, providing a reliable basis for clinical diagnosis.
Subject(s)
Humans, China, Pancreatic Neoplasms/diagnostic imaging, Learning
ABSTRACT
Alzheimer's disease (AD) is a progressive and irreversible neurodegenerative disease, and neuroimaging based on magnetic resonance imaging (MRI) is one of the most intuitive and reliable methods for AD screening and diagnosis. Clinical head MRI examinations generate multimodal image data; to solve the problem of multimodal MRI processing and information fusion, this paper proposes a structural and functional MRI feature extraction and fusion method based on generalized convolutional neural networks (gCNN). The method comprises a three-dimensional residual U-shaped network with a hybrid attention mechanism (3D HA-ResUNet) for feature representation and classification of structural MRI, and a U-shaped graph convolutional neural network (U-GCN) for node feature representation and classification of brain functional networks derived from functional MRI. Based on the fusion of the two types of image features, the optimal feature subset is selected by discrete binary particle swarm optimization, and the prediction results are output by a machine learning classifier. Validation on a multimodal dataset from the Alzheimer's Disease Neuroimaging Initiative (ADNI) open-source database shows that the proposed models have superior performance in their respective data domains. The gCNN framework combines the advantages of the two models and further improves on single-modal MRI methods, raising classification accuracy and sensitivity by 5.56% and 11.11%, respectively. In conclusion, the gCNN-based multimodal MRI classification method proposed in this paper can provide a technical basis for the auxiliary diagnosis of Alzheimer's disease.
Subject(s)
Humans, Alzheimer Disease/diagnostic imaging, Neurodegenerative Diseases, Magnetic Resonance Imaging/methods, Neuroimaging/methods, Cognitive Dysfunction/diagnosis
ABSTRACT
Magnetic resonance (MR) imaging is an important tool for prostate cancer diagnosis, and accurate segmentation of prostate regions in MR images by computer-aided diagnostic techniques is important for the diagnosis of prostate cancer. In this paper, we propose an improved end-to-end three-dimensional image segmentation network based on the traditional V-Net, using a deep learning approach to provide more accurate segmentation results. First, we fused a soft attention mechanism into the skip connections of the traditional V-Net and combined short skip connections with small convolutional kernels to further improve segmentation accuracy. The prostate region was then segmented on the Prostate MR Image Segmentation 2012 (PROMISE 12) challenge dataset, and the model was evaluated using the Dice similarity coefficient (DSC) and Hausdorff distance (HD). The DSC and HD values of the segmentation model reached 0.903 and 3.912 mm, respectively. The experimental results show that the proposed algorithm provides more accurate three-dimensional segmentation results, accurately and efficiently segmenting prostate MR images and providing a reliable basis for clinical diagnosis and treatment.
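The DSC reported above measures the overlap between predicted and ground-truth masks; a minimal sketch for binary masks (the smoothing epsilon is an illustrative convention for handling empty masks, not necessarily the paper's):

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, truth: np.ndarray, eps: float = 1e-7) -> float:
    """Dice similarity coefficient between two binary masks: 2|A∩B| / (|A| + |B|)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    return (2.0 * intersection + eps) / (pred.sum() + truth.sum() + eps)
```

DSC ranges from 0 (no overlap) to 1 (identical masks), so a value of 0.903 indicates close agreement; the Hausdorff distance complements it by bounding the worst-case boundary error.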
Subject(s)
Male, Humans, Prostate/diagnostic imaging, Computer-Assisted Image Processing/methods, Magnetic Resonance Imaging/methods, Three-Dimensional Imaging/methods, Prostatic Neoplasms/diagnostic imaging
ABSTRACT
The brain-computer interface (BCI) based on motor imagery electroencephalography (MI-EEG) enables direct information interaction between the human brain and external devices. In this paper, a multi-scale EEG feature extraction convolutional neural network model based on time-series data enhancement is proposed for decoding MI-EEG signals. First, an EEG signal augmentation method was proposed that could increase the information content of training samples without changing the length of the time series, while completely retaining the original features. Then, multiple holistic and detailed features of the EEG data were adaptively extracted by a multi-scale convolution module, and the features were fused and filtered by a parallel residual module and channel attention. Finally, classification results were output by a fully connected network. Experiments on the BCI Competition IV 2a and 2b datasets showed that the proposed model achieved average classification accuracies of 91.87% and 87.85%, respectively, on the motor imagery task, with high accuracy and strong robustness compared with existing baseline models. The proposed model requires no complex signal pre-processing and has the advantage of multi-scale feature extraction, giving it high practical application value.
Subject(s)
Humans, Time Factors, Brain, Electroencephalography
ABSTRACT
In the diagnosis of cardiovascular diseases, the analysis of electrocardiogram (ECG) signals has always played a crucial role, yet effectively identifying abnormal heartbeats by algorithm remains a difficult task in ECG signal analysis. This paper proposes a classification model that automatically identifies abnormal heartbeats based on a deep residual network (ResNet) and a self-attention mechanism. First, an 18-layer convolutional neural network (CNN) based on the residual structure was designed to help the model fully extract local features. Then, a bi-directional gated recurrent unit (BiGRU) was used to explore temporal correlations and obtain temporal features. Finally, a self-attention mechanism was built to weight important information and enhance the model's ability to extract important features, helping it achieve higher classification accuracy. In addition, to mitigate the interference of data imbalance on classification performance, the study utilized multiple approaches for data augmentation. The experimental data came from the arrhythmia database constructed by MIT and Beth Israel Hospital (MIT-BIH), and the final results showed that the proposed model achieved an overall accuracy of 98.33% on the original dataset and 99.12% on the optimized dataset, demonstrating good performance in ECG signal classification and potential value for application in portable ECG detection devices.
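The self-attention weighting described above can be sketched in its simplest scaled dot-product form. Identity Q/K/V projections are an illustrative simplification here; a real model would use learned projection matrices.

```python
import numpy as np

def self_attention(x: np.ndarray) -> np.ndarray:
    """Minimal scaled dot-product self-attention on a (T, d) feature sequence:
    out = softmax(x x^T / sqrt(d)) x, so each time step is re-expressed as a
    similarity-weighted mixture of all time steps."""
    d = x.shape[1]
    scores = x @ x.T / np.sqrt(d)                   # (T, T) pairwise similarities
    scores -= scores.max(axis=1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)   # row-wise softmax
    return weights @ x
```

Applied on top of BiGRU outputs, this lets the classifier emphasize the beats or segments most indicative of an arrhythmia, regardless of where they occur in the window.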
Subject(s)
Humans, Electrocardiography, Algorithms, Cardiovascular Diseases, Factual Databases
ABSTRACT
Objective To recognize the different phases of Korotkoff sounds through deep learning technology, so as to improve the accuracy of blood pressure measurement in different populations. Methods A classification model for Korotkoff sound phases was designed, fusing an attention mechanism (Attention), a residual network (ResNet), and a bidirectional long short-term memory network (BiLSTM). First, single Korotkoff sound signals were extracted beat by beat from the whole Korotkoff sound recording, and each was converted into a Mel spectrogram. Then, local features of the Mel spectrogram were extracted using the attention mechanism and the ResNet network, the BiLSTM network was used to model the temporal relations between features, and a fully connected layer reduced the feature dimension. Finally, classification was completed by a SoftMax function. The dataset used in this study was collected from 44 volunteers (24 females and 20 males, with an average age of 36 years), and model performance was verified using 10-fold cross-validation. Results The classification accuracy of the established model for the 5 types of Korotkoff sound phases was 93.4%, higher than that of other models. Conclusion This study proves that the deep learning method can accurately classify Korotkoff sound phases, laying a strong technical foundation for the subsequent design of automatic blood pressure measurement methods based on Korotkoff sound phase classification.
ABSTRACT
Accurate segmentation of pediatric echocardiograms is a challenging task, because significant heart-size changes with age and faster heart rates lead to more blurred boundaries on cardiac ultrasound images than in adults. To address these problems, a dual-decoder network model combining channel attention and scale attention is proposed in this paper. First, an attention-guided decoder with a deep supervision strategy is used to obtain attention maps for the ventricular regions. The generated ventricular attention is then fed back to multiple layers of the network through skip connections to adjust the feature weights generated by the encoder and highlight the left and right ventricular areas. Finally, a scale attention module and a channel attention module are utilized to enhance the edge features of the left and right ventricles. The experimental results demonstrate that the proposed method achieves an average Dice coefficient of 90.63% on the acquired bilateral ventricular segmentation dataset, which is better than some conventional and state-of-the-art medical image segmentation methods, and that it segments the ventricular edges more accurately. The results of this paper can provide a new solution for pediatric echocardiographic bilateral ventricular segmentation and the subsequent auxiliary diagnosis of congenital heart disease.
Subject(s)
Adult, Humans, Child, Heart Ventricles/diagnostic imaging, Echocardiography, Computer-Assisted Image Processing
ABSTRACT
Eyelid tumors are serious eye diseases that can lead to vision loss or even blindness. The similarity between benign and malignant characteristics makes it difficult for ophthalmologists lacking clinical experience to distinguish between them. To address this problem, a method (ResNet101_CBAM) based on two-stage target localization using fully convolutional one-stage object detection (FCOS) and a residual network incorporating a dual attention mechanism is proposed to realize the automatic diagnosis of benign and malignant eyelid tumors. FCOS is used to automatically localize the overall contour of the orbit, removing background and surrounding noise, and then to finely localize the tumor lesion inside the orbit. The obtained lesion region is input into ResNet101_CBAM for the automatic diagnosis of benign and malignant eyelid tumors. The experimental results show that the average precision of the target localization algorithm for tumor lesions is 0.821, and that compared with ResNet101, ResNet101_CBAM improves sensitivity and accuracy in eyelid tumor classification by 4.7% and 3.0%, respectively, indicating that the proposed model has superior performance in the automatic diagnosis of benign and malignant eyelid tumors.
ABSTRACT
In view of the numerous subtle features in fundus disease images, small sample sizes, and diagnostic difficulty, deep learning and medical imaging technologies are combined to develop a fundus disease diagnosis model that integrates multi-scale features and a hybrid-domain attention mechanism. The ResNet50 network is taken as the baseline and modified in this study. The method uses a parallel multi-branch architecture to extract fundus disease features under different receptive fields, effectively improving feature extraction ability and computational efficiency, and adopts a hybrid-domain attention mechanism to select the information most critical to the current task, effectively enhancing classification performance. Tests on the ODIR dataset show that the proposed method achieves a diagnostic accuracy of 93.2% for different fundus diseases, 5.2% higher than the baseline network, demonstrating good diagnostic performance.
ABSTRACT
A fall detection algorithm for community healthcare is proposed to avoid the secondary injury caused by delayed treatment when elderly people living alone fall. The algorithm has two branches, 2D convolution and 3D convolution, allowing it to extract spatial and temporal features simultaneously. Dense connections added to the 3D branch enhance temporal feature extraction; the residual blocks in the 2D branch are redesigned to improve spatial feature extraction; and a non-local attention mechanism is introduced into the branch fusion for better feature fusion. The algorithm also takes scene information into consideration and is supervised by the SIoU loss function and a combined loss function to realize fall detection. Experiments on the expanded public URFD dataset show that the proposed method achieves a detection accuracy of 98.3%, verifying its performance and robustness for fall detection.
ABSTRACT
To address the limited precision of existing object detection algorithms for small lung nodules and their often inaccurate bounding box predictions in lung nodule detection for tuberculosis, a lung nodule detection method based on YOLOv7 is presented to detect small lung nodules more effectively and to ensure continuous convergence of the target detection box. Based on the YOLOv7 network framework, improvements are made in the following 3 aspects. (1) Cross-channel information and target spatial-domain information are obtained by embedding the efficient SimAM attention mechanism in the Head network, highlighting target features and enabling the model to identify regions of interest more accurately. (2) The SIoU boundary loss function is used to add an angle cost to the original loss function and to redefine the distance cost and shape cost, improving the convergence rate and reducing the loss value. (3) SIoU-NMS replaces the standard non-maximum suppression algorithm to reduce erroneous suppression due to target occlusion. Experiments on a custom lung nodule dataset show that, compared with the original YOLOv7, the proposed method improves accuracy and recall by 2.9% and 3.1%, respectively, and increases the mean average precision at a confidence threshold of 0.5 by 3.7%. The model can effectively assist in the diagnosis of lung nodules.
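Standard greedy NMS, which SIoU-NMS modifies, can be sketched as follows. This is the classic IoU-threshold variant, shown as a baseline for reference rather than the paper's SIoU-NMS itself.

```python
import numpy as np

def nms(boxes: np.ndarray, scores: np.ndarray, iou_thresh: float = 0.5) -> list:
    """Greedy non-maximum suppression. boxes: (N, 4) as [x1, y1, x2, y2];
    returns indices of kept boxes, highest score first."""
    order = scores.argsort()[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        if order.size == 1:
            break
        rest = order[1:]
        # intersection of box i with every remaining candidate
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        order = rest[iou <= iou_thresh]  # drop candidates overlapping the kept box
    return keep
```

Because plain IoU ignores box geometry beyond overlap area, occluded nodules can be wrongly suppressed; replacing the IoU test with an SIoU-based one, as the abstract describes, is one way to make the suppression criterion geometry-aware.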
ABSTRACT
Glioma is a primary brain tumor with a high incidence rate. High-grade gliomas (HGG) have the highest degree of malignancy and the lowest survival rate. Surgical resection and postoperative adjuvant chemoradiotherapy are often used in clinical treatment, so accurate segmentation of tumor-related areas is of great significance for patient treatment. To improve the segmentation accuracy for HGG, this paper proposes a multi-modal glioma semantic segmentation network with multi-scale feature extraction and a multi-attention fusion mechanism. The main contributions are: (1) multi-scale residual structures were used to extract features from multi-modal glioma magnetic resonance imaging (MRI); (2) two types of attention modules were used to aggregate features in the channel and spatial dimensions; (3) to improve the segmentation performance of the whole network, a branch classifier was constructed using an ensemble learning strategy to adjust and correct the classification results of the backbone classifier. The experimental results showed that the Dice coefficients of the proposed segmentation method were 0.9097, 0.8773, and 0.8396 for the whole tumor, tumor core, and enhancing tumor, respectively, and the segmentation results had good boundary continuity in the three-dimensional direction. The proposed semantic segmentation network therefore has good segmentation performance for high-grade glioma lesions.
Subject(s)
Humans, Attention, Glioma/diagnostic imaging, Magnetic Resonance Imaging/methods, Semantics
ABSTRACT
Accurate segmentation of ground glass nodules (GGN) is clinically important, but it is a tough task because GGN in computed tomography images show blurred boundaries, irregular shapes, and uneven intensity. This paper segments GGN with a fully convolutional residual network, i.e., a residual network based on an atrous spatial pyramid pooling structure and an attention mechanism (ResAANet). The network uses the atrous spatial pyramid pooling (ASPP) structure to expand the receptive field of the feature map and extract richer features, and utilizes an attention mechanism, residual connections, and long skip connections to fully retain the sensitive features extracted by the convolutional layers. First, we employed 565 GGN provided by Shanghai Chest Hospital to train and validate ResAANet and obtain a stable model. Then, two groups of data selected from clinical examinations (84 GGN) and the lung image database consortium (LIDC) dataset (145 GGN) were used to validate and evaluate the performance of the proposed method. Finally, we applied a best-threshold method to remove false-positive regions and obtain optimized results. The average Dice similarity coefficient (DSC) of the proposed algorithm reached 83.46% on the clinical dataset and 83.26% on the LIDC dataset, the average Jaccard index (IoU) reached 72.39% and 71.56%, respectively, and the segmentation speed reached 0.1 seconds per image. Compared with other reported methods, the new method segments GGN accurately, quickly, and robustly, and can provide doctors with important information such as nodule size and density to assist subsequent diagnosis and treatment.
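The two overlap metrics reported above are monotonically related; a minimal sketch of the Jaccard index (IoU) for binary masks, together with the identity DSC = 2·IoU / (1 + IoU). Note the identity holds per mask pair; the paper's figures are averages over images, so it need not hold exactly between the reported averages.

```python
import numpy as np

def jaccard_index(pred: np.ndarray, truth: np.ndarray) -> float:
    """Jaccard index (IoU) between two binary masks: |A∩B| / |A∪B|."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement by convention
    return float(np.logical_and(pred, truth).sum() / union)

def dice_from_jaccard(iou: float) -> float:
    """Per-pair conversion between the two overlap metrics: DSC = 2*IoU / (1 + IoU)."""
    return 2.0 * iou / (1.0 + iou)
```

Because IoU penalizes disagreement more heavily than DSC, IoU values (72.39% and 71.56% above) always sit below the corresponding DSC values for the same segmentations.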