Results 1 - 20 of 324
1.
J Med Imaging (Bellingham) ; 11(4): 044002, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38988992

ABSTRACT

Purpose: Deep learning is the standard for medical image segmentation. However, it may encounter difficulties when the training set is small, and it may generate anatomically aberrant segmentations. Anatomical knowledge can potentially be useful as a constraint in deep learning segmentation methods. We propose a loss function based on projected pooling to introduce soft topological constraints. Our main application is the segmentation of the red nucleus from quantitative susceptibility mapping (QSM), which is of interest in parkinsonian syndromes. Approach: This new loss function introduces soft constraints on the topology by magnifying small parts of the structure to be segmented so that they are not discarded in the segmentation process. To that end, we project the structure onto the three planes and then apply a series of MaxPooling operations with increasing kernel sizes. These operations are performed for both the ground truth and the prediction, and the difference is computed to obtain the loss function. As a result, it can reduce topological errors as well as defects in the structure boundary. The approach is easy to implement and computationally efficient. Results: When applied to the segmentation of the red nucleus from QSM data, the approach led to a very high accuracy (Dice 89.9%) and no topological errors. Moreover, the proposed loss function improved the Dice accuracy over the baseline when the training set was small. We also studied three tasks from the medical segmentation decathlon challenge (MSD) (heart, spleen, and hippocampus). For the MSD tasks, the Dice accuracies were similar for both approaches but the topological errors were reduced. Conclusions: We propose an effective method to automatically segment the red nucleus, based on a new loss for introducing topology constraints in deep learning segmentation.
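
As a rough illustration of the loss described in the Approach, the sketch below projects a 3D mask onto the three planes, max-pools each projection with increasing kernel sizes, and accumulates the mean absolute difference between prediction and ground truth. The max-projection, the kernel sizes (1, 2, 3), the stride of 1, and the absolute difference are assumptions made for this example; the abstract does not specify them, and this is not the authors' implementation.

```python
def project(vol, axis):
    """Max-project a 3D nested list vol[z][y][x] along one axis to a 2D image."""
    nz, ny, nx = len(vol), len(vol[0]), len(vol[0][0])
    if axis == 0:
        return [[max(vol[z][y][x] for z in range(nz)) for x in range(nx)] for y in range(ny)]
    if axis == 1:
        return [[max(vol[z][y][x] for y in range(ny)) for x in range(nx)] for z in range(nz)]
    return [[max(vol[z][y][x] for x in range(nx)) for y in range(ny)] for z in range(nz)]


def max_pool(img, k):
    """2D max pooling with a k x k window, stride 1, no padding (assumed settings)."""
    h, w = len(img), len(img[0])
    return [[max(img[i + di][j + dj] for di in range(k) for dj in range(k))
             for j in range(w - k + 1)]
            for i in range(h - k + 1)]


def projected_pooling_loss(pred, truth, kernels=(1, 2, 3)):
    """Sum over planes and kernel sizes of the mean absolute difference
    between the pooled projections of prediction and ground truth."""
    total = 0.0
    for axis in range(3):
        p, t = project(pred, axis), project(truth, axis)
        for k in kernels:
            pp, tp = max_pool(p, k), max_pool(t, k)
            diffs = [abs(a - b) for rp, rt in zip(pp, tp) for a, b in zip(rp, rt)]
            total += sum(diffs) / len(diffs)
    return total
```

With larger kernels a single missed voxel covers a growing share of the pooled projection, which is one way to read the "magnifying small parts" idea.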

2.
Cancers (Basel) ; 16(13)2024 Jun 26.
Article in English | MEDLINE | ID: mdl-39001410

ABSTRACT

BACKGROUND: Bladder cancer (BC) segmentation on MRI is the first step to determining the presence of muscular invasion. This study aimed to assess the tumor segmentation performance of three deep learning (DL) models on multi-parametric MRI (mp-MRI) images. METHODS: We studied 53 patients with bladder cancer. Bladder tumors were segmented on each slice of T2-weighted (T2WI), diffusion-weighted imaging/apparent diffusion coefficient (DWI/ADC), and T1-weighted contrast-enhanced (T1WI) images acquired on a 3 Tesla MRI scanner. We trained Unet, MAnet, and PSPnet using three loss functions: cross-entropy (CE), dice similarity coefficient loss (DSC), and focal loss (FL). We evaluated the model performances using DSC, Hausdorff distance (HD), and expected calibration error (ECE). RESULTS: The MAnet algorithm with the CE+DSC loss function gave the highest DSC values on the ADC, T2WI, and T1WI images. PSPnet with CE+DSC obtained the smallest HDs on the ADC, T2WI, and T1WI images. The segmentation accuracy overall was better on the ADC and T1WI than on the T2WI. The ECEs were the smallest for PSPnet with FL on the ADC images, while they were the smallest for MAnet with CE+DSC on the T2WI and T1WI. CONCLUSIONS: Compared to Unet, MAnet and PSPnet with a hybrid CE+DSC loss function displayed better performances in BC segmentation depending on the choice of the evaluation metric.
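
A hybrid CE+DSC loss of the kind named in the Methods can be sketched as a weighted sum of binary cross-entropy and a soft Dice loss over per-pixel foreground probabilities. The equal weights, the smoothing constant, and the flat-list input are assumptions for this example; the abstract does not state how the two terms were combined.

```python
import math


def cross_entropy(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over pixels (probabilities clamped for stability)."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss: 1 - (2*overlap + smooth) / (sum of both masks + smooth)."""
    inter = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1.0 - (2.0 * inter + smooth) / (denom + smooth)


def ce_dice_loss(probs, targets, w_ce=1.0, w_dice=1.0):
    """Weighted sum of the two terms; the weights are hypothetical defaults."""
    return w_ce * cross_entropy(probs, targets) + w_dice * dice_loss(probs, targets)
```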

3.
Sensors (Basel) ; 24(11)2024 May 26.
Article in English | MEDLINE | ID: mdl-38894221

ABSTRACT

Aiming at the problems of incomplete dehazing, color distortion, and loss of detail and edge information encountered by existing algorithms when processing images of underground coal mines, an image dehazing algorithm for underground coal mines, named CAB CA DSConv Fusion gUNet (CCDF-gUNet), is proposed. First, Dynamic Snake Convolution (DSConv) is introduced to replace traditional convolutions, enhancing the feature extraction capability. Second, residual attention convolution blocks are constructed to simultaneously focus on both local and global information in images. Additionally, the Coordinate Attention (CA) module is utilized to learn the coordinate information of features so that the model can better capture the key information in images. Furthermore, to simultaneously focus on the detail and structural consistency of images, a fusion loss function is introduced. Finally, based on the test verification of the public dataset Haze-4K, the Peak Signal-to-Noise Ratio (PSNR), Structural Similarity (SSIM), and Mean Squared Error (MSE) are 30.72 dB, 0.976, and 55.04, respectively, and on a self-made underground coal mine dataset, they are 31.18 dB, 0.971, and 49.66, respectively. The experimental results show that the algorithm performs well in dehazing, effectively avoids color distortion, and retains image details and edge information, providing some theoretical references for image processing in coal mine surveillance videos.
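
For reference, PSNR and MSE are linked by PSNR = 10 log10(MAX^2 / MSE). The sketch below assumes 8-bit images with a peak value of 255, which the abstract does not state. Under that assumption an MSE of 55.04 corresponds to roughly 30.72 dB, in line with the Haze-4K figures quoted above, although metrics averaged per image need not match this relation exactly.

```python
import math


def mean_squared_error(img_a, img_b):
    """MSE between two images given as flat lists of pixel values."""
    return sum((a - b) ** 2 for a, b in zip(img_a, img_b)) / len(img_a)


def psnr_from_mse(mse, peak=255.0):
    """Peak signal-to-noise ratio in dB; peak=255 assumes 8-bit images."""
    return 10.0 * math.log10(peak ** 2 / mse)
```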

4.
Heliyon ; 10(9): e30821, 2024 May 15.
Article in English | MEDLINE | ID: mdl-38894726

ABSTRACT

Most accidents in a chemical process are caused by abnormalities or deviations of the process parameters, and existing research focuses on short-term prediction. When the early warning time is advanced, many false and missed alarms occur in the system, which causes problems for on-site personnel; how to ensure the accuracy of early warning as much as possible while advancing the early warning time is a technical problem requiring an urgent solution. In the present work, a bidirectional long short-term memory network (BiLSTM) model was established according to the temporal variation characteristics of process parameters, and the Whale optimization algorithm (WOA) was used to optimize the model's hyperparameters automatically. The predicted values were then used to construct a Modified Inverted Normal Loss Function (MINLF), and the probability of abnormal fluctuations of process parameters was calculated using the residual time theory. Finally, the WOA-BiLSTM-MINLF process parameter prediction model with inherent risk and trend risk was established, and the fluctuation process of the process parameters was transformed into dynamic risk values. The results show that the prediction model raises an alarm 16 min ahead of distributed control systems (DCS), which gives operators enough time to take safety protection measures in advance and prevent accidents.
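
The abstract does not describe the modification behind MINLF. For orientation only, the sketch below shows the unmodified inverted normal loss as it is commonly written, L(y) = K[1 - exp(-(y - T)^2 / (2 gamma^2))], which is zero at the target T and saturates at K for large deviations. The parameter names `target`, `gamma`, and `k` are chosen for this example.

```python
import math


def inverted_normal_loss(y, target, gamma, k=1.0):
    """Unmodified inverted normal loss: 0 at the target, approaching k far from it.

    gamma is a shape parameter controlling how quickly the loss saturates.
    This is not the paper's MINLF, whose form is not given in the abstract.
    """
    return k * (1.0 - math.exp(-((y - target) ** 2) / (2.0 * gamma ** 2)))
```

Unlike a quadratic loss, this function is bounded, so it can be read as a risk value between 0 and `k`.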

5.
Biomedicines ; 12(6)2024 Jun 13.
Article in English | MEDLINE | ID: mdl-38927516

ABSTRACT

This article addresses the semantic segmentation of laparoscopic surgery images, placing special emphasis on the segmentation of structures with a smaller number of observations. As a result of this study, adjustment parameters are proposed for deep neural network architectures, enabling a robust segmentation of all structures in the surgical scene. The U-Net architecture with five encoder-decoders (U-Net5ed), SegNet-VGG19, and DeepLabv3+ employing different backbones are implemented. Three main experiments are conducted, working with Rectified Linear Unit (ReLU), Gaussian Error Linear Unit (GELU), and Swish activation functions. The applied loss functions include Cross Entropy (CE), Focal Loss (FL), Tversky Loss (TL), Dice Loss (DiL), Cross Entropy Dice Loss (CEDL), and Cross Entropy Tversky Loss (CETL). The performance of Stochastic Gradient Descent with momentum (SGDM) and Adaptive Moment Estimation (Adam) optimizers is compared. It is qualitatively and quantitatively confirmed that DeepLabv3+ and U-Net5ed architectures yield the best results. The DeepLabv3+ architecture with the ResNet-50 backbone, Swish activation function, and CETL loss function reports a Mean Accuracy (MAcc) of 0.976 and Mean Intersection over Union (MIoU) of 0.977. The semantic segmentation of structures with a smaller number of observations, such as the hepatic vein, cystic duct, Liver Ligament, and blood, verifies that the obtained results are very competitive and promising compared to the consulted literature. The proposed selected parameters were validated in the YOLOv9 architecture, which showed an improvement in semantic segmentation compared to the results obtained with the original architecture.
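
Of the losses listed above, the Tversky loss is the one most directly aimed at classes with few observations, because it weights false positives and false negatives separately. A minimal sketch follows; the default weights `alpha=0.3` and `beta=0.7` are illustrative assumptions, not values reported in the abstract.

```python
def tversky_loss(probs, targets, alpha=0.3, beta=0.7, eps=1e-7):
    """Tversky loss over flat lists of foreground probabilities and 0/1 labels.

    alpha weights false positives and beta weights false negatives;
    alpha = beta = 0.5 recovers the soft Dice loss.
    """
    tp = sum(p * t for p, t in zip(probs, targets))
    fp = sum(p * (1 - t) for p, t in zip(probs, targets))
    fn = sum((1 - p) * t for p, t in zip(probs, targets))
    return 1.0 - (tp + eps) / (tp + alpha * fp + beta * fn + eps)
```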

6.
Sensors (Basel) ; 24(12)2024 Jun 10.
Article in English | MEDLINE | ID: mdl-38931551

ABSTRACT

A new algorithm, Yolov8n-FADS, is proposed to improve the accuracy of miners' helmet detection in complex underground environments. By replacing the head part with Attentional Sequence Fusion (ASF) and introducing the P2 detection layer, the ASF-P2 structure comprehensively extracts the global and local feature information of the image, and the improvement in the backbone part captures spatially sparsely distributed features more efficiently, which improves the model's ability to perceive complex patterns. The improved detection head, SEAMHead, built on the SEAM module, handles occlusion more effectively. The Focal Loss module improves the model's ability to detect rare target categories by adjusting the weights of positive and negative samples. Compared with the original model, the improved model achieves 29% memory compression, a 36.7% reduction in the number of parameters, and a 4.9% improvement in detection accuracy, which can effectively improve the detection of underground helmet wearers, reduce the workload of underground video surveillance personnel, and improve monitoring efficiency.
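
The reweighting that focal loss performs can be sketched for a single binary prediction as FL = -alpha_t (1 - p_t)^gamma log(p_t). The defaults `alpha=0.25` and `gamma=2.0` are commonly used values and are assumptions here; the abstract does not give the settings used in Yolov8n-FADS.

```python
import math


def focal_loss(p, t, alpha=0.25, gamma=2.0, eps=1e-7):
    """Binary focal loss for one prediction p (probability of class 1) and label t.

    (1 - p_t)**gamma shrinks the loss of well-classified examples, and
    alpha balances positive against negative samples.
    """
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if t == 1 else 1.0 - p
    alpha_t = alpha if t == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```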

7.
Front Big Data ; 7: 1376023, 2024.
Article in English | MEDLINE | ID: mdl-38903951

ABSTRACT

Time series forecasting is an essential tool across numerous domains, yet traditional models often falter when faced with unilateral boundary conditions, where data is systematically overestimated or underestimated. This paper introduces a novel approach to the task of unilateral boundary time series forecasting. Our research bridges the gap in existing methods by proposing a specialized framework to accurately forecast within these skewed datasets. The cornerstone of our approach is the unilateral mean square error (UMSE), an asymmetric loss function that strategically addresses underestimation biases in training data, improving the precision of forecasts. We further enhance model performance through the implementation of a dual model structure that processes underestimated and accurately estimated data points separately, allowing for a nuanced analysis of the data trends. Additionally, feature reconstruction is employed to recapture obscured dynamics, ensuring a comprehensive understanding of the data. We demonstrate the effectiveness of our methods through extensive experimentation with LightGBM and GRU models across diverse datasets, showcasing superior accuracy and robustness in comparison to traditional models and existing methods. Our findings not only validate the efficacy of our approach but also reveal its model-independence and broad applicability. This work lays the groundwork for future research in this domain, opening new avenues for sophisticated analytical models in various industries where precise time series forecasting is crucial.
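
The abstract does not give the UMSE formula. As a generic illustration of an asymmetric squared error, and not the paper's definition, the sketch below weights a residual differently depending on whether the prediction lies above or below the label. When labels are known to be systematically too low, predicting above the label may be penalized less. The names and weights `w_under` and `w_over` are hypothetical.

```python
def asymmetric_squared_error(label, pred, w_under=1.0, w_over=0.2):
    """Squared error with a sign-dependent weight (illustrative, not UMSE itself).

    w_over applies when the prediction exceeds the label, w_under otherwise.
    """
    residual = pred - label
    weight = w_over if residual > 0 else w_under
    return weight * residual * residual


def mean_asymmetric_squared_error(labels, preds, w_under=1.0, w_over=0.2):
    """Average of the per-sample asymmetric squared errors."""
    losses = [asymmetric_squared_error(y, p, w_under, w_over) for y, p in zip(labels, preds)]
    return sum(losses) / len(losses)
```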

8.
Article in English | MEDLINE | ID: mdl-38822906

ABSTRACT

Long waiting time in outpatient departments is a crucial factor in patient dissatisfaction. We aim to analytically interpret the waiting times predicted by machine learning models and provide patients with an explanation of the expected waiting time. Here, underestimating waiting times can cause patient dissatisfaction, so preventing this in predictive models is necessary. To address this issue, we propose a framework considering dissatisfaction for estimating the waiting time in an outpatient department. In our framework, we leverage asymmetric loss functions to ensure robustness against underestimation. We also propose a dissatisfaction-aware asymmetric error score (DAES) to determine an appropriate model by considering the trade-off between underestimation and accuracy. Finally, Shapley additive explanation (SHAP) is applied to interpret the relationship trained by the model, enabling decision makers to use this information for improving outpatient service operations. We apply our framework in the endocrinology metabolism department and neurosurgery department in one of the largest hospitals in South Korea. The use of asymmetric functions prevents underestimation in the model, and with the proposed DAES, we can strike a balance in selecting the best model. By using SHAP, we can analytically interpret the waiting time in outpatient service (e.g., the length of the queue affects the waiting time the most) and provide explanations about the expected waiting time to patients. The proposed framework aids in improving operations, considering practical application in hospitals for real-time patient notification and minimizing patient dissatisfaction. Given the significance of managing hospital operations from the perspective of patients, this work is expected to contribute to operations improvement in health service practices.
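
One widely used asymmetric loss that discourages underestimation is the quantile (pinball) loss with a quantile level above 0.5. The abstract does not say which asymmetric functions were used, so the sketch below is only an example of the idea; `tau=0.8` is an assumed setting.

```python
def pinball_loss(actual, predicted, tau=0.8):
    """Quantile loss: underestimates cost tau per unit, overestimates cost (1 - tau)."""
    diff = actual - predicted
    return max(tau * diff, (tau - 1.0) * diff)


def mean_pinball_loss(actuals, predictions, tau=0.8):
    """Average pinball loss over a set of waiting-time predictions."""
    losses = [pinball_loss(a, p, tau) for a, p in zip(actuals, predictions)]
    return sum(losses) / len(losses)
```

With `tau=0.8`, predicting 20 minutes for an actual wait of 30 costs four times as much as predicting 40.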

9.
Sensors (Basel) ; 24(9)2024 May 05.
Article in English | MEDLINE | ID: mdl-38733038

ABSTRACT

With the continuous advancement of autonomous driving and monitoring technologies, there is increasing attention on non-intrusive target monitoring and recognition. This paper proposes an ArcFace SE-attention model-agnostic meta-learning approach (AS-MAML) by integrating attention mechanisms into residual networks for pedestrian gait recognition using frequency-modulated continuous-wave (FMCW) millimeter-wave radar through meta-learning. We enhance the feature extraction capability of the base network using channel attention mechanisms and integrate the additive angular margin loss function (ArcFace loss) into the inner loop of MAML to constrain inner loop optimization and improve radar discrimination. Then, this network is used to classify small-sample micro-Doppler images obtained from millimeter-wave radar as the data source for pose recognition. Experimental tests were conducted on pose estimation and image classification tasks. The results demonstrate significant detection and recognition performance, with an accuracy of 94.5%, accompanied by a 95% confidence interval. Additionally, on the open-source dataset DIAT-µRadHAR, which is specially processed to increase classification difficulty, the network achieves a classification accuracy of 85.9%.
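
The additive angular margin (ArcFace) loss mentioned above replaces the target-class logit s cos(theta) with s cos(theta + m) before the softmax cross-entropy. A single-sample sketch follows; the scale `s=30.0` and margin `m=0.5` are commonly used values and are assumptions here, and the sketch says nothing about how the loss is placed in the MAML inner loop.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def arcface_loss(embedding, class_weights, target, s=30.0, m=0.5):
    """Softmax cross-entropy on scaled cosines, with margin m added to the target angle.

    Practical implementations usually also guard the case theta + m > pi,
    which this sketch omits.
    """
    e = normalize(embedding)
    logits = []
    for j, w in enumerate(class_weights):
        cos_t = sum(a * b for a, b in zip(e, normalize(w)))
        cos_t = min(1.0, max(-1.0, cos_t))  # keep acos in its domain
        if j == target:
            cos_t = math.cos(math.acos(cos_t) + m)
        logits.append(s * cos_t)
    mx = max(logits)
    log_sum = mx + math.log(sum(math.exp(z - mx) for z in logits))
    return log_sum - logits[target]
```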


Subject(s)
Pedestrians; Radar; Humans; Algorithms; Gait/physiology; Pattern Recognition, Automated/methods; Machine Learning

10.
Metabolites ; 14(5)2024 Apr 30.
Article in English | MEDLINE | ID: mdl-38786735

ABSTRACT

Accurate risk prediction for myocardial infarction (MI) is crucial for preventive strategies, given its significant impact on global mortality and morbidity. Here, we propose a novel deep-learning approach to enhance the prediction of incident MI cases by incorporating metabolomics alongside clinical risk factors. We utilized data from the KORA cohort, including the baseline S4 and follow-up F4 studies, consisting of 1454 participants without prior history of MI. The dataset comprised 19 clinical variables and 363 metabolites. Due to the imbalanced nature of the dataset (78 observed MI cases and 1376 non-MI individuals), we employed a generative adversarial network (GAN) model to generate new incident cases, augmenting the dataset and improving feature representation. To predict MI, we further utilized multi-layer perceptron (MLP) models in conjunction with the synthetic minority oversampling technique (SMOTE) and edited nearest neighbor (ENN) methods to address overfitting and underfitting issues, particularly when dealing with imbalanced datasets. To enhance prediction accuracy, we propose a novel GAN for feature-enhanced (GFE) loss function. The GFE loss function resulted in an approximate 2% improvement in prediction accuracy, yielding a final accuracy of 70%. Furthermore, we evaluated the contribution of each clinical variable and metabolite to the predictive model and identified the 10 most significant variables, including glucose tolerance, sex, and physical activity. This is the first study to construct a deep-learning approach for producing 7-year MI predictions using the newly proposed loss function. Our findings demonstrate the promising potential of our technique in identifying novel biomarkers for MI prediction.

11.
Comput Biol Med ; 176: 108606, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38763068

ABSTRACT

This paper presents a deep learning method using Natural Language Processing (NLP) techniques, to distinguish between Mild Cognitive Impairment (MCI) and Normal Cognitive (NC) conditions in older adults. We propose a framework that analyzes transcripts generated from video interviews collected within the I-CONECT study project, a randomized controlled trial aimed at improving cognitive functions through video chats. Our proposed NLP framework consists of two Transformer-based modules, namely Sentence Embedding (SE) and Sentence Cross Attention (SCA). First, the SE module captures contextual relationships between words within each sentence. Subsequently, the SCA module extracts temporal features from a sequence of sentences. This feature is then used by a Multi-Layer Perceptron (MLP) for the classification of subjects into MCI or NC. To build a robust model, we propose a novel loss function, called InfoLoss, that considers the reduction in entropy by observing each sequence of sentences to ultimately enhance the classification accuracy. The results of our comprehensive model evaluation using the I-CONECT dataset show that our framework can distinguish between MCI and NC with an average area under the curve of 84.75%.


Subject(s)
Cognitive Dysfunction; Natural Language Processing; Humans; Cognitive Dysfunction/diagnosis; Aged; Female; Deep Learning; Male; Linguistics

12.
Bioengineering (Basel) ; 11(5)2024 Apr 26.
Article in English | MEDLINE | ID: mdl-38790294

ABSTRACT

Brain tissue segmentation plays a critical role in the diagnosis, treatment, and study of brain diseases. Accurately identifying these boundaries is essential for improving segmentation accuracy. However, distinguishing boundaries between different brain tissues can be challenging, as they often overlap. Existing deep learning methods primarily calculate the overall segmentation results without adequately addressing local regions, leading to error propagation and mis-segmentation along boundaries. In this study, we propose a novel mis-segmentation-focused loss function based on a two-stage nnU-Net framework. Our approach aims to enhance the model's ability to handle ambiguous boundaries and overlapping anatomical structures, thereby achieving more accurate brain tissue segmentation results. Specifically, the first stage targets the identification of mis-segmentation regions using a global loss function, while the second stage involves defining a mis-segmentation loss function to adaptively adjust the model, thus improving its capability to handle ambiguous boundaries and overlapping anatomical structures. Experimental evaluations on two datasets demonstrate that our proposed method outperforms existing approaches both quantitatively and qualitatively.

13.
Heliyon ; 10(10): e30993, 2024 May 30.
Article in English | MEDLINE | ID: mdl-38779030

ABSTRACT

Determining the areas where a solar power plant will be installed is of great importance for its performance. Solar and hydroelectric energy are the most widely used renewable energy sources in Kars province. Site selection for these power plants is an important factor in reducing the installation cost of the solar power plant and achieving maximum efficiency during operation. Determining the areas where the power plants will be installed is a complex spatial decision-making problem that is difficult to analyse. In this study, GIS is first used as a mapping method to obtain the locations of both solar power plants in the Susuz, Arpaçay, Akkaya, Kars city centre, Selim, Digor, Kagizman and Sarikamiș districts of Kars province, and then a Taguchi loss function based interval type-2 fuzzy approach is applied to the problem. To obtain more accurate results, the results of the two methods (GIS and the Taguchi loss function based interval type-2 fuzzy approach) were also compared. According to the solar power plant map obtained, the total area of suitable sites is 78600 km².

14.
BMC Genomics ; 25(1): 406, 2024 May 09.
Article in English | MEDLINE | ID: mdl-38724906

ABSTRACT

Most proteins exert their functions by interacting with other proteins, making the identification of protein-protein interactions (PPI) crucial for understanding biological activities, pathological mechanisms, and clinical therapies. Developing effective and reliable computational methods for predicting PPI can significantly reduce the time and labor associated with traditional biological experiments. However, accurately identifying the specific categories of protein-protein interactions and improving the prediction accuracy of the computational methods remain dual challenges. To tackle these challenges, we proposed a novel graph neural network method called GNNGL-PPI for multi-category prediction of PPI based on global graphs and local subgraphs. GNNGL-PPI consisted of two main components: using Graph Isomorphism Network (GIN) to extract global graph features from the PPI network graph, and employing GIN As Kernel (GIN-AK) to extract local subgraph features from the subgraphs of protein vertices. Additionally, considering the imbalanced distribution of samples in each category within the benchmark datasets, we introduced an Asymmetric Loss (ASL) function to further enhance the predictive performance of the method. Through evaluations on six benchmark test sets formed by three different dataset partitioning algorithms (Random, BFS, DFS), GNNGL-PPI outperformed the state-of-the-art multi-category prediction methods of PPI, as measured by the comprehensive performance evaluation metric F1-measure. Furthermore, interpretability analysis confirmed the effectiveness of GNNGL-PPI as a reliable multi-category method for predicting protein-protein interactions.
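
The Asymmetric Loss (ASL) is usually described as a focal-style loss with separate focusing exponents for positives and negatives plus a probability margin that zeroes out very easy negatives. The per-label sketch below follows that common description; the defaults `gamma_pos=0.0`, `gamma_neg=4.0`, and `margin=0.05` are assumptions for this example, and the abstract does not report the values used in GNNGL-PPI.

```python
import math


def asymmetric_loss(p, t, gamma_pos=0.0, gamma_neg=4.0, margin=0.05, eps=1e-8):
    """ASL for one label with predicted probability p and 0/1 target t.

    Positives: -(1 - p)**gamma_pos * log(p).
    Negatives: the probability is first shifted, p_m = max(p - margin, 0),
    then the loss is -(p_m)**gamma_neg * log(1 - p_m).
    """
    if t == 1:
        return -((1.0 - p) ** gamma_pos) * math.log(max(p, eps))
    p_m = max(p - margin, 0.0)
    return -(p_m ** gamma_neg) * math.log(max(1.0 - p_m, eps))
```

Because negatives are down-weighted much more strongly than positives, the abundant negative labels in an imbalanced multi-label problem contribute less to the gradient.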


Subject(s)
Algorithms; Computational Biology; Neural Networks, Computer; Protein Interaction Mapping; Protein Interaction Mapping/methods; Computational Biology/methods; Protein Interaction Maps; Humans; Proteins/metabolism

15.
Heliyon ; 10(7): e28538, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38571625

ABSTRACT

Liver tumors are one of the most aggressive malignancies in the human body. Computer-aided technology and liver interventional surgery are effective in the prediction, identification and management of liver neoplasms. One of the important processes is to accurately grasp the morphological structure of the liver and liver blood vessels. However, accurate identification and segmentation of hepatic blood vessels in CT images poses a formidable challenge. Manually locating and segmenting liver vessels in CT images is time-consuming and impractical. There is an imperative clinical requirement for a precise and effective algorithm to segment liver vessels. In response to this demand, the current paper advocates a liver vessel segmentation approach that employs an enhanced 3D fully convolutional neural network V-Net. The network model improves the basic network structure according to the characteristics of liver vessels. First, a pyramidal convolution block is introduced between the encoder and decoder of the network to improve the network localization ability. Then, multi-resolution deep supervision is introduced in the network, resulting in more robust segmentation. Finally, by fusing feature maps of different resolutions, the overall segmentation result is predicted. Evaluation experiments on public datasets demonstrate that our improved scheme can increase the segmentation ability of existing network models for liver vessels. Compared with the existing work, the experimental outcomes demonstrate that the technique presented in this manuscript has attained superior performance on the Dice Coefficient index, which can promote the treatment of liver tumors.

16.
BMC Bioinformatics ; 25(1): 169, 2024 Apr 29.
Article in English | MEDLINE | ID: mdl-38684942

ABSTRACT

Many important biological facts have been found as single-cell RNA sequencing (scRNA-seq) technology has advanced. With the use of this technology, it is now possible to investigate the connections among individual cells, genes, and illnesses. For the analysis of single-cell data, clustering is frequently used. Nevertheless, biological data usually contain a large amount of noise, and traditional clustering methods are sensitive to noise. In addition, the higher-order spatial information they acquire from the data is insufficient. As a result, obtaining trustworthy clustering results is challenging. We propose the Cauchy hyper-graph Laplacian non-negative matrix factorization (CHLNMF) as a unique approach to address these issues. In CHLNMF, we replace the measurement based on Euclidean distance in the conventional non-negative matrix factorization (NMF) with the Cauchy loss function (CLF), which can lessen the influence of noise. The model also incorporates the hyper-graph constraint, which takes into account the high-order link among the samples. The CHLNMF model's best solution is then found using a half-quadratic optimization approach. Finally, using seven scRNA-seq datasets, we compare the CHLNMF technique with nine other leading methods. The validity of our technique was established by analysis of the experimental outcomes.
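
The robustness argument can be made concrete with the Cauchy loss in its common form, log(1 + r^2 / c^2), which grows only logarithmically in the residual r, whereas the squared Euclidean error grows quadratically. The sketch below evaluates it on the entries of X - WH for small nested-list matrices. The scale parameter `c` and the function names are assumptions, and the hyper-graph regularization term and the half-quadratic solver of CHLNMF are not shown.

```python
import math


def cauchy_loss(residual, c=1.0):
    """Cauchy loss of one residual; c sets the scale beyond which growth slows."""
    return math.log(1.0 + (residual / c) ** 2)


def squared_loss(residual):
    """Squared error, the per-entry term of the Euclidean NMF objective."""
    return residual ** 2


def cauchy_nmf_objective(X, W, H, c=1.0):
    """Sum of Cauchy losses over all entries of X - W @ H (nested lists)."""
    total = 0.0
    for i, row in enumerate(X):
        for j, x_ij in enumerate(row):
            approx = sum(W[i][k] * H[k][j] for k in range(len(H)))
            total += cauchy_loss(x_ij - approx, c)
    return total
```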


Subject(s)
Algorithms; Sequence Analysis, RNA; Single-Cell Analysis; Single-Cell Analysis/methods; Sequence Analysis, RNA/methods; Humans; Cluster Analysis; Computational Biology/methods

17.
Comput Biol Med ; 173: 108373, 2024 May.
Article in English | MEDLINE | ID: mdl-38564851

ABSTRACT

Segmentation of the temporomandibular joint (TMJ) disc and condyle from magnetic resonance imaging (MRI) is a crucial task in TMJ internal derangement research. The automatic segmentation of the disc structure presents challenges due to its intricate and variable shapes, low contrast, and unclear boundaries. Existing TMJ segmentation methods often overlook spatial and channel information in features and neglect overall topological considerations, with few studies exploring the interaction between segmentation and topology preservation. To address these challenges, we propose a Three-Branch Jointed Feature and Topology Decoder (TFTD) for the segmentation of TMJ disc and condyle in MRI. This structure effectively preserves the topological information of the disc structure and enhances features. We introduce a cross-dimensional spatial and channel attention mechanism (SCIA) to enhance features. This mechanism captures spatial, channel, and cross-dimensional information of the decoded features, leading to improved segmentation performance. Moreover, we explore the interaction between topology preservation and segmentation from the perspective of game theory. Based on this interaction, we design the Joint Loss Function (JLF) to fully leverage the features of segmentation, topology preservation, and joint interaction branches. Results on the TMJ MRI dataset demonstrate the superior performance of our TFTD compared to existing methods.


Subject(s)
Temporomandibular Joint Disorders; Temporomandibular Joint; Humans; Temporomandibular Joint/diagnostic imaging; Temporomandibular Joint/pathology; Temporomandibular Joint Disc/pathology; Temporomandibular Joint Disorders/diagnostic imaging; Temporomandibular Joint Disorders/pathology; Magnetic Resonance Imaging/methods; Movement

18.
Comput Biol Med ; 173: 108381, 2024 May.
Article in English | MEDLINE | ID: mdl-38569237

ABSTRACT

Multimodal medical image fusion (MMIF) technology plays a crucial role in medical diagnosis and treatment by integrating different images to obtain fusion images with comprehensive information. Deep learning-based fusion methods have demonstrated superior performance, but some of them still encounter challenges such as imbalanced retention of color and texture information and low fusion efficiency. To alleviate the above issues, this paper presents a real-time MMIF method, called a lightweight residual fusion network. First, a feature extraction framework with three branches is designed. Two independent branches are used to fully extract brightness and texture information. The fusion branch enables different modal information to be interactively fused at a shallow level, thereby better retaining brightness and texture information. Furthermore, a lightweight residual unit is designed to replace the conventional residual convolution in the model, thereby improving the fusion efficiency and reducing the overall model size by approximately 5 times. Finally, considering that the high-frequency image decomposed by the wavelet transform contains abundant edge and texture information, an adaptive strategy is proposed for assigning weights to the loss function based on the information content in the high-frequency image. This strategy effectively guides the model toward preserving intricate details. The experimental results on MRI and functional images demonstrate that the proposed method exhibits superior fusion performance and efficiency compared to alternative approaches. The code of LRFNet is available at https://github.com/HeDan-11/LRFNet.


Subject(s)
Image Processing, Computer-Assisted; Wavelet Analysis

19.
Physiol Meas ; 45(5)2024 May 07.
Article in English | MEDLINE | ID: mdl-38604181

ABSTRACT

Objective. Monitoring changes in human heart rate variability (HRV) is important for protecting life and health. Studies have shown that Imaging Photoplethysmography (IPPG) based on ordinary color cameras can detect the color change of skin pixels caused by the cardiopulmonary system. Most researchers have employed deep learning IPPG algorithms to extract the blood volume pulse (BVP) signal, analyzing it predominantly through the heart rate (HR). However, this approach often overlooks the intricate time-frequency domain characteristics inherent in the BVP signal, which cannot be comprehensively deduced from HR alone. The analysis of HRV metrics through the BVP signal is therefore imperative. Approach. In this paper, the transformation invariant loss function with distance equilibrium (TIDLE) is applied to IPPG for the first time, so that the details of the BVP signal can be recovered better. Specifically, TIDLE is tested in four commonly used IPPG deep learning models (DeepPhys, EfficientPhys, Physnet and TS_CAN) and compared with three other loss functions: mean absolute error (MAE), mean square error (MSE), and negative Pearson correlation coefficient (NPCC). Main results. The experiments demonstrate that MAE and MSE exhibit suboptimal performance in predicting LF/HF across the four models, achieving a Statistic of Mean Absolute Error (MAES) of 25.94% and 34.05%, respectively. In contrast, NPCC and TIDLE yielded more favorable results at 13.51% and 11.35%, respectively. Taking into consideration the morphological characteristics of the BVP signal, on the two optimal models for predicting HRV metrics, namely DeepPhys and TS_CAN, the Pearson coefficients for the BVP signals predicted by TIDLE in comparison to the gold-standard BVP signals were 0.627 and 0.605, respectively. In contrast, the results based on NPCC were notably lower, at only 0.545 and 0.533, respectively. Significance. This paper contributes significantly to the effective restoration of the morphology and frequency domain characteristics of the BVP signal.
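
Of the baseline losses compared above, the negative Pearson loss is commonly written as 1 - r, where r is the Pearson correlation between the predicted and reference BVP signals. The sketch below uses that common form, which is an assumption about the baseline; the TIDLE loss itself is not specified in the abstract and is not reproduced here.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length signals (undefined for constant signals)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def neg_pearson_loss(pred, target):
    """1 - r: zero for perfectly correlated signals, 2 for perfectly inverted ones."""
    return 1.0 - pearson(pred, target)
```

The loss ignores amplitude scaling and offset, which suits waveform recovery but means it does not constrain the absolute signal level.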


Subject(s)
Photoplethysmography; Signal Processing, Computer-Assisted; Photoplethysmography/methods; Humans; Deep Learning; Heart Rate/physiology; Algorithms; Image Processing, Computer-Assisted/methods

20.
Front Plant Sci ; 15: 1338228, 2024.
Article in English | MEDLINE | ID: mdl-38606066

ABSTRACT

The accurate identification of maize crop row navigation lines is crucial for the navigation of intelligent weeding machinery, yet it faces significant challenges due to lighting variations and complex environments. This study proposes an optimized version of the YOLOX-Tiny single-stage detection network model for accurately identifying maize crop row navigation lines. It incorporates adaptive illumination adjustment and multi-scale prediction to enhance dense target detection. Visual attention mechanisms, including Efficient Channel Attention and Cooperative Attention modules, are introduced to better extract maize features. A Fast Spatial Pyramid Pooling module is incorporated to improve target localization accuracy. The Coordinate Intersection over Union loss function is used to further enhance detection accuracy. Experimental results demonstrate that the improved YOLOX-Tiny model achieves an average precision of 92.2 %, with a detection time of 15.6 milliseconds. This represents a 16.4 % improvement over the original model while maintaining high accuracy. The proposed model has a reduced size of 18.6 MB, representing a 7.1 % reduction. It also incorporates the least squares method for accurately fitting crop rows. The model showcases efficiency in processing large amounts of data, achieving a comprehensive fitting time of 42 milliseconds and an average angular error of 0.59°. The improved YOLOX-Tiny model offers substantial support for the navigation of intelligent weeding machinery in practical applications, contributing to increased agricultural productivity and reduced usage of chemical herbicides.
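
The least squares step for fitting crop rows can be sketched as an ordinary line fit through detected row points, followed by an angular error against a reference line. Treating the points as detection-box centres in image coordinates and fitting x as a function of y (so that near-vertical rows stay well conditioned) are assumptions for this example; the abstract does not describe the parameterization.

```python
import math


def fit_row_line(points):
    """Least squares fit of x = a*y + b to (x, y) points; returns (a, b).

    Requires at least two points with distinct y values.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    s_yy = sum((y - mean_y) ** 2 for _, y in points)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = s_xy / s_yy
    return a, mean_x - a * mean_y


def angle_error_deg(a_fit, a_ref):
    """Absolute angle in degrees between two lines given by their slopes dx/dy."""
    return abs(math.degrees(math.atan(a_fit) - math.atan(a_ref)))
```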
