Results 1 - 20 of 26
1.
MethodsX ; 12: 102692, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38638453

ABSTRACT

With the medical condition of pneumothorax, also known as collapsed lung, air builds up in the pleural cavity and causes the lung to collapse. It is a critical disorder that needs to be identified and treated promptly, as it can cause breathing difficulties, low blood oxygen levels, and, in extreme circumstances, death. Chest X-rays are frequently used to diagnose pneumothorax. Using the Mask R-CNN model and medical transfer learning, the proposed work offers:
• A novel method for pneumothorax segmentation from chest X-rays.
• A method that takes advantage of the Mask R-CNN architecture's strengths in object recognition and segmentation.
• A modified model that addresses the task of segmenting pneumothoraxes and is then fine-tuned on a sizable dataset of chest X-rays.
The proposed method is tested against other pneumothorax segmentation techniques using a dataset of chest X-rays with pneumothorax annotations. The test findings demonstrate that the proposed method outperforms other cutting-edge techniques in terms of segmentation accuracy and speed. The proposed method could lead to better patient outcomes by increasing the precision and effectiveness of pneumothorax diagnosis and therapy. It also benefits other medical imaging tasks by using medical transfer learning approaches, which increase the precision of computer-aided diagnosis and treatment planning.
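Segmentation accuracy of the kind reported above is conventionally scored with intersection-over-union (IoU) and the Dice coefficient; a minimal, illustrative sketch (not the authors' code):

```python
def mask_iou_dice(pred, truth):
    """IoU and Dice between two binary masks given as nested lists of 0/1."""
    inter = sum(p & t for row_p, row_t in zip(pred, truth)
                for p, t in zip(row_p, row_t))
    total_p = sum(map(sum, pred))   # pixels predicted as pneumothorax
    total_t = sum(map(sum, truth))  # pixels annotated as pneumothorax
    union = total_p + total_t - inter
    iou = inter / union if union else 1.0
    dice = 2 * inter / (total_p + total_t) if (total_p + total_t) else 1.0
    return iou, dice
```

Dice weights the overlap twice relative to the mask sizes, so it is more forgiving of boundary errors than IoU; papers in this area typically report both.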

2.
Heliyon ; 10(4): e25298, 2024 Feb 29.
Article in English | MEDLINE | ID: mdl-38370222

ABSTRACT

Equipping lithium-ion batteries with a reasonable thermal fault diagnosis can avoid thermal runaway and ensure the safe and reliable operation of the batteries. This research built a lithium-ion battery thermal fault diagnosis model that optimized the original mask region-based convolutional neural network in both parameters and structure, based on the battery dataset. The model processes thermal images of the battery surface, identifies problematic batteries, and locates the problematic regions. A backbone network is used to process the battery thermal images and extract feature information. Through the RPN network, the thermal features are classified and regressed, and the Mask branch is used to ultimately determine the faulty battery's location. The improved LBIP-V2 performs better than LBIP-V1 in most cases. We tested the performance of LBIP on the single-cell battery dataset, the 1P3S battery pack dataset, and the flattened 1P3S battery pack dataset. The results show that the recognition accuracy of LBIP exceeded 95%. At the same time, we simulated the failure of the 1P3S battery pack within 0-15 min and tested the effectiveness of LBIP in real-time battery fault diagnosis. The results indicate that LBIP can effectively respond to online faults with a confidence level of over 98%.
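Locating a fault region in a surface thermograph can be illustrated with a toy threshold-and-bounding-box sketch; the paper's Mask branch learns this per pixel, so this is only the underlying idea, not the model:

```python
def hot_region_bbox(thermal, threshold):
    """Bounding box (row_min, col_min, row_max, col_max) of all pixels whose
    temperature exceeds the threshold, or None when no pixel does."""
    hits = [(r, c) for r, row in enumerate(thermal)
            for c, v in enumerate(row) if v > threshold]
    if not hits:
        return None  # no thermal anomaly in this frame
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return min(rows), min(cols), max(rows), max(cols)
```

A fixed threshold fails under ambient drift, which is one reason a learned detector such as LBIP is preferable for online diagnosis.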

3.
Vis Comput Ind Biomed Art ; 7(1): 3, 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38296864

ABSTRACT

Alzheimer's disease (AD) is a neurological disorder that predominantly affects the brain. In the coming years, its prevalence is expected to grow rapidly, while progress in diagnostic techniques remains limited. Various machine learning (ML) and artificial intelligence (AI) algorithms have been employed to detect AD using single-modality data. However, recent developments in ML have enabled the application of these methods to multiple data sources and input modalities for AD prediction. In this study, we developed a framework that utilizes multimodal data (tabular data, magnetic resonance imaging (MRI) images, and genetic information) to classify AD. As part of the pre-processing phase, we generated a knowledge graph from the tabular data and MRI images. We employed graph neural networks for knowledge graph creation, and a region-based convolutional neural network approach for image-to-knowledge-graph generation. Additionally, we integrated various explainable AI (XAI) techniques to interpret and elucidate the prediction outcomes derived from the multimodal data. Layer-wise relevance propagation was used to explain the layer-wise outcomes in the MRI images. We also incorporated submodular pick local interpretable model-agnostic explanations to interpret the decision-making process based on the tabular data provided. Genetic expression values play a crucial role in AD analysis. We used a graphical gene tree to identify genes associated with the disease. Moreover, a dashboard was designed to display the XAI outcomes, enabling experts and medical professionals to easily comprehend the prediction results.

4.
Phys Med Biol ; 68(19)2023 09 26.
Article in English | MEDLINE | ID: mdl-37678268

ABSTRACT

Objective. In clinical medicine, localization and identification of disease on spinal radiographs are difficult and require a high level of expertise in the radiological discipline and extensive clinical experience. A model based on deep learning acquires certain disease recognition abilities through continuous training, thereby assisting clinical physicians in disease diagnosis. This study aims to develop an object detection network that accurately locates and classifies the abnormal parts in spinal x-ray photographs. Approach. This study proposes a deep learning-based automated multi-disease detection architecture called Abnormality Capture-Faster Region-based Convolutional Neural Network (AC-Faster R-CNN), which develops the feature fusion structure Deformable Convolution Feature Pyramid Network and the abnormality capture structure Abnormality Capture Head. Through the combination of dilated and deformable convolutions, the model better captures the multi-scale information of lesions. To further improve the detection performance, the contrast enhancement algorithm Contrast Limited Adaptive Histogram Equalization (CLAHE) is used for image preprocessing. Main results. The proposed model is extensively evaluated on a testing set containing 1007 spine x-ray images, and the experimental results show that the AC-Faster R-CNN architecture outperforms the baseline model and other advanced detection architectures. The mean Average Precision at an Intersection over Union of 50% is 39.8%, and the Precision and Sensitivity at the optimal cutoff point of the Precision-Recall curve are 48.6% and 46.3%, respectively, reaching the current state-of-the-art detection level. Significance. AC-Faster R-CNN exhibits high precision and sensitivity in abnormality detection tasks on spinal x-ray images, and effectively locates and identifies abnormal areas. Additionally, this study provides a reference and comparison point for the further development of automatic detection in medicine.
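Plain global histogram equalization illustrates the core of the CLAHE preprocessing step named above; note that CLAHE additionally tiles the image and clips the histogram before equalizing, both of which this sketch omits:

```python
def equalize_hist(img, levels=256):
    """Global histogram equalization of a grayscale image (nested lists of
    ints in [0, levels)): remap intensities via the cumulative histogram so
    the output uses the full dynamic range."""
    flat = [v for row in img for v in row]
    n = len(flat)
    hist = [0] * levels
    for v in flat:
        hist[v] += 1
    cdf, run = [], 0
    for h in hist:
        run += h
        cdf.append(run)
    cdf_min = next(c for c in cdf if c)  # first occupied intensity level

    def remap(v):
        return round((cdf[v] - cdf_min) / max(n - cdf_min, 1) * (levels - 1))

    return [[remap(v) for v in row] for row in img]
```

On a low-contrast radiograph, nearby intensity levels get stretched apart, which is the effect the preprocessing exploits before detection.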


Subject(s)
Neural Networks, Computer , Radiology , X-Rays , Radiography , Algorithms
5.
Philos Trans A Math Phys Eng Sci ; 381(2254): 20220169, 2023 Sep 04.
Article in English | MEDLINE | ID: mdl-37454685

ABSTRACT

The current study aims to improve the efficiency of automatic identification of pavement distress and improve the status quo of difficult identification and detection of pavement distress. First, the identification method of pavement distress and the types of pavement distress are analysed. Then, the design concept of deep learning in pavement distress recognition is described. Finally, the mask region-based convolutional neural network (Mask R-CNN) model is designed and applied in the recognition of road crack distress. The results show that in the evaluation of the model's comprehensive recognition performance, the highest accuracy is 99%, and the lowest accuracy is 95% after the test and evaluation of the designed model in different datasets. In the evaluation of different crack identification and detection methods, the highest accuracy of transverse crack detection is 98% and the lowest accuracy is 95%. In longitudinal crack detection, the highest accuracy is 98% and the lowest accuracy is 92%. In mesh crack detection, the highest accuracy is 98% and the lowest accuracy is 92%. This work not only provides an in-depth reference for the application of deep CNNs in pavement distress recognition but also promotes the improvement of road traffic conditions, thus contributing to the progression of smart cities in the future. This article is part of the theme issue 'Artificial intelligence in failure analysis of transportation infrastructure and materials'.

6.
Environ Monit Assess ; 195(4): 462, 2023 Mar 13.
Article in English | MEDLINE | ID: mdl-36907939

ABSTRACT

Economic development, population growth, and rapid urbanization are the reasons for the increasing generation of waste all over the world. Recent statistics showed that 2.1 million tons of municipal solid waste (MSW) were produced in 2016, a figure estimated to rise to 3.4 million tons by 2050. Factors such as rapid urbanization, altered living standards, and a growing population make the municipal solid waste management system complex and undermine pollution-control strategies, so a system that can accurately predict the waste composition is needed. Based on the waste classification, a suitable decomposition technique is preferred. Therefore, this paper proposes a CMSOA-optimized dual Faster R-CNN based waste management system to accurately classify the waste composition. The proposed system is formed by hybridizing a dual Faster R-CNN with the complex-valued encoding multi-chain seeker optimization algorithm (CMSOA). Various evaluation measures, namely accuracy, precision, recall, F-measure, RMSE, MAE, and MAPE, are computed, and a case study analysis is conducted on five major cities of Maharashtra. A comparative analysis is carried out for various approaches, and the results reveal that the proposed method provides better classification results than other methods.


Subject(s)
Refuse Disposal , Waste Management , Solid Waste , Environmental Monitoring , India , Waste Management/methods , Cities
7.
Behav Res Methods ; 55(3): 1372-1391, 2023 04.
Article in English | MEDLINE | ID: mdl-35650384

ABSTRACT

With continued advancements in portable eye-tracker technology liberating experimenters from the restraints of artificial laboratory designs, research can now collect gaze data from real-world, natural navigation. However, the field lacks a robust method for achieving this, as past approaches relied upon the time-consuming manual annotation of eye-tracking data, while previous attempts at automation lack the necessary versatility for in-the-wild navigation trials consisting of complex and dynamic scenes. Here, we propose a system capable of informing researchers of where and what a user's gaze is focused upon at any one time. The system achieves this by first running footage recorded on a head-mounted camera through a deep-learning-based object detection algorithm called Masked Region-based Convolutional Neural Network (Mask R-CNN). The algorithm's output is combined with frame-by-frame gaze coordinates measured by an eye-tracking device synchronized with the head-mounted camera to detect and annotate, without any manual intervention, what a user looked at for each frame of the provided footage. The effectiveness of the presented methodology was legitimized by a comparison between the system output and that of manual coders. High levels of agreement between the two validated the system as a preferable data collection technique as it was capable of processing data at a significantly faster rate than its human counterpart. Support for the system's practicality was then further demonstrated via a case study exploring the mediatory effects of gaze behaviors on an environment-driven attentional bias.
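The frame-by-frame fusion of gaze coordinates with detection output described above reduces to a point-in-mask lookup; a minimal, illustrative sketch (names and data layout are assumptions, not the system's actual code):

```python
def annotate_gaze(gaze_xy, detections):
    """Return the label of the detected object whose mask contains the gaze
    point for this frame, or 'background' when none does.
    detections: list of (label, mask) pairs, each mask a set of (x, y)
    pixel coordinates produced by an instance-segmentation model."""
    for label, mask in detections:
        if gaze_xy in mask:
            return label
    return "background"
```

Running this per frame over synchronized gaze and detection streams yields exactly the kind of automatic annotation that previously required manual coding.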


Subject(s)
Deep Learning , Eye Movements , Humans , Eye-Tracking Technology , Neural Networks, Computer , Algorithms
8.
Article in Chinese | WPRIM (Western Pacific) | ID: wpr-1038393

ABSTRACT

Objective: To develop an endoscopic automatic detection system for early gastric cancer (EGC) based on a mask region-based convolutional neural network (Mask R-CNN). Methods: A total of 3,579 and 892 white light images (WLI) of EGC were obtained from the First Affiliated Hospital of Anhui Medical University for training and testing, respectively. Then, 10 WLI videos were obtained prospectively to test the dynamic performance of the Mask R-CNN system. In addition, 400 WLI images were randomly selected for a comparison between the Mask R-CNN system and endoscopists. Diagnostic ability was assessed by accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). Results: The accuracy, sensitivity and specificity of the Mask R-CNN system in diagnosing EGC in WLI images were 90.25%, 91.06% and 89.01%, respectively, with no statistically significant difference from the results of pathological diagnosis. For WLI real-time videos, the diagnostic accuracy was 90.27%. The processing speed on test videos reached 35 frames/s in real time. In the controlled experiment, the sensitivity of the Mask R-CNN system was higher than that of the experts (93.00% vs 80.20%, χ2 = 7.059, P < 0.001), its specificity was higher than that of the junior endoscopists (82.67% vs 71.87%, χ2 = 9.955, P < 0.001), and its overall accuracy was higher than that of the senior endoscopists (85.25% vs 78.00%, χ2 = 7.009, P < 0.001). Conclusion: The Mask R-CNN system has excellent performance for detection of EGC under WLI and great potential for practical clinical application.
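The diagnostic measures used throughout this entry all derive from the same confusion matrix; a small illustrative helper:

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic measures from confusion-matrix counts:
    tp/fp = true/false positives, tn/fn = true/false negatives."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # recall on diseased cases
        "specificity": tn / (tn + fp),   # recall on healthy cases
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }
```

Sensitivity and specificity depend only on the model, while PPV and NPV also depend on disease prevalence in the tested population.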

9.
Front Physiol ; 14: 1324042, 2023.
Article in English | MEDLINE | ID: mdl-38292449

ABSTRACT

Introduction: Melanoma Skin Cancer (MSC) is a type of cancer in the human body; therefore, early disease diagnosis is essential for reducing the mortality rate. However, dermoscopic image analysis poses challenges due to factors such as color illumination, light reflections, and the varying sizes and shapes of lesions. To overcome these challenges, an automated framework is proposed in this manuscript. Methods: Initially, dermoscopic images are acquired from two online benchmark datasets: International Skin Imaging Collaboration (ISIC) 2020 and Human Against Machine (HAM) 10000. Subsequently, a normalization technique is employed on the dermoscopic images to decrease the impact of noise, outliers, and variations in the pixels. Furthermore, cancerous regions in the pre-processed images are segmented utilizing the mask-faster Region-based Convolutional Neural Network (RCNN) model. The mask-RCNN model offers precise pixel-level segmentation by accurately delineating object boundaries. From the partitioned cancerous regions, discriminative feature vectors are extracted by applying three pre-trained CNN models, namely ResNeXt101, Xception, and InceptionV3. These feature vectors are passed into a modified Gated Recurrent Unit (GRU) model for MSC classification. In the modified GRU model, a swish-Rectified Linear Unit (ReLU) activation function is incorporated that efficiently stabilizes the learning process with a better convergence rate during training. Results and discussion: The empirical investigation demonstrates that the modified GRU model attained accuracies of 99.95% and 99.98% on the ISIC 2020 and HAM 10000 datasets, respectively, surpassing conventional detection models.
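The swish activation mentioned above has a one-line definition, x · sigmoid(βx); a sketch:

```python
import math

def swish(x, beta=1.0):
    """Swish activation: x * sigmoid(beta * x). Unlike ReLU it is smooth
    and non-monotonic, passing small gradients for negative inputs."""
    return x / (1.0 + math.exp(-beta * x))
```

For large positive x it approaches the identity (like ReLU), while for large negative x it decays to zero instead of clipping hard, which is the smoothness the abstract credits for stabler convergence.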

10.
Sensors (Basel) ; 22(22)2022 Nov 11.
Article in English | MEDLINE | ID: mdl-36433308

ABSTRACT

This paper proposes the implementation of and experimentation with ground-penetrating radar (GPR) for real-time automatic detection of buried improvised explosive devices (IEDs). The GPR system, consisting of hardware and software, was implemented. An ultra-wideband (UWB) antenna was designed and implemented, particularly for the operation of the GPR. Experiments were conducted to demonstrate the real-time automatic detection of buried IEDs using GPR with an R-CNN algorithm. In the experiments, the GPR was mounted on a pickup truck and a maintenance train in order to find IEDs buried under a road and a railway, respectively. B-scan images were collected using the implemented GPR. R-CNN-based detection of the hyperbolic pattern, which indicates a buried IED, was performed along with pre-processing such as zero-offset removal, background removal, and filtering. Experimental results in terms of detecting the hyperbolic pattern in B-scan images verified that the proposed GPR system is superior to the conventional one using region-analysis-processing-based detection. Results also showed that pre-processing is required to improve and/or clean the hyperbolic pattern before detection. The GPR can automatically detect IEDs buried under roads and railways in real time by detecting the hyperbolic pattern appearing in the collected B-scan image.
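Background removal in a B-scan is commonly done by subtracting the mean A-scan, which suppresses horizontal clutter (ground reflections) so hyperbolic target signatures stand out; an illustrative sketch (the paper's exact preprocessing chain may differ):

```python
def remove_background(bscan):
    """Mean-trace subtraction on a B-scan.
    bscan: list of A-scans (one per antenna position), each a list of
    samples along the depth/time axis. Reflections present at the same
    depth in every A-scan (flat layers) cancel out; localized hyperbolic
    responses survive."""
    n = len(bscan)
    depth = len(bscan[0])
    mean_trace = [sum(col[d] for col in bscan) / n for d in range(depth)]
    return [[col[d] - mean_trace[d] for d in range(depth)] for col in bscan]
```

After this step, zero-offset removal and filtering further clean the image before the R-CNN looks for hyperbolas.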


Subject(s)
Radar , Weapons , Empirical Research , Research Design , Algorithms
11.
Front Plant Sci ; 13: 962391, 2022.
Article in English | MEDLINE | ID: mdl-36035663

ABSTRACT

Tea is one of the most common beverages in the world. In order to reduce the cost of manual tea picking and improve the competitiveness of tea production, this paper proposes a new model, termed the Mask R-CNN Positioning of Picking Point for Tea Shoots (MR3P-TS) model, for the identification of the contour of each tea shoot and the location of picking points. In this study, a dataset of tender tea shoot images taken in a real, complex scene was constructed. Subsequently, an improved Mask R-CNN model (the MR3P-TS model) was built that extended the mask branch in the network design. By calculating the areas of the multiple connected domains of the mask, the main part of the shoot was identified. Then, the minimum circumscribed rectangle of the main part was calculated to determine the tea shoot axis and finally obtain the position coordinates of the picking point. The MR3P-TS model proposed in this paper achieved an mAP of 0.449 and an F2 value of 0.313 in shoot identification, and achieved a precision of 0.949 and a recall of 0.910 in the localization of the picking points. Compared with the mainstream object detection algorithms YOLOv3 and Faster R-CNN, the MR3P-TS algorithm had a good recognition effect on overlapping shoots in an unstructured environment, and was stronger in both versatility and robustness. The proposed method can accurately detect and segment tea bud regions in real complex scenes at the pixel level, and provide precise location coordinates of suggested picking points, which should support the further development of automated tea picking machines.
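The "keep the main connected domain of the mask" step can be sketched as a breadth-first search over the binary mask; this is an illustrative implementation of the standard technique, not the paper's code:

```python
from collections import deque

def largest_component(mask):
    """Pixel set of the largest 4-connected region of 1s in a binary mask
    (nested lists of 0/1), i.e. the 'main part' among all connected domains."""
    h, w = len(mask), len(mask[0])
    seen, best = set(), set()
    for sr in range(h):
        for sc in range(w):
            if mask[sr][sc] and (sr, sc) not in seen:
                comp, queue = set(), deque([(sr, sc)])
                seen.add((sr, sc))
                while queue:                      # flood-fill one region
                    r, c = queue.popleft()
                    comp.add((r, c))
                    for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                        if (0 <= nr < h and 0 <= nc < w
                                and mask[nr][nc] and (nr, nc) not in seen):
                            seen.add((nr, nc))
                            queue.append((nr, nc))
                if len(comp) > len(best):
                    best = comp
    return best
```

The minimum circumscribed rectangle of the returned pixel set then gives the shoot axis from which the picking point is located.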

12.
Inform Med Unlocked ; 32: 101025, 2022.
Article in English | MEDLINE | ID: mdl-35873921

ABSTRACT

A new artificial intelligence (AI) supported T-Ray imaging system was designed and implemented for non-invasive and non-ionizing screening of coronavirus-affected patients. The new system has the potential to replace the standard conventional X-Ray based imaging modality of virus detection. This research article reports the development of a solid-state room-temperature terahertz source for thermograph study. Exposure time and radiation energy were optimized through several real-time experiments. During its incubation period, coronavirus stays within the cells of the upper respiratory tract, and its presence often causes an increased level of blood supply to the virus-affected cells/inter-cellular region, which results in a localized increase of water content in those cells and tissues in comparison to neighbouring normal cells. Under THz-radiation exposure, the incident energy is absorbed more in the virus-affected cells/inter-cellular region, which heats up; thus, a sharp temperature gradient is observed in the corresponding thermograph. Additionally, structural changes in virus-affected zones contribute significantly to better contrast in thermographs. Considering the effectiveness of artificial intelligence analysis tools in various medical diagnoses, the authors have employed an explainable AI-assisted methodology to correctly identify and mark the affected pulmonary region for the developed imaging technique and thus validate the model. This AI-enabled non-ionizing THz-thermography method is expected to address the voids in early COVID diagnosis at the onset of infection.

13.
Diagnostics (Basel) ; 12(6)2022 Jun 16.
Article in English | MEDLINE | ID: mdl-35741291

ABSTRACT

Although drug-induced liver injury (DILI) is a major target of the pharmaceutical industry, we currently lack an efficient model for evaluating liver toxicity in the early stage of its development. Recent progress in artificial intelligence-based deep learning technology promises to improve the accuracy and robustness of current toxicity prediction models. Mask region-based CNN (Mask R-CNN) is a detection-based segmentation model that has been used for developing algorithms. In the present study, we applied a Mask R-CNN algorithm to detect and predict acute hepatic injury lesions induced by acetaminophen (APAP) in Sprague-Dawley rats. To accomplish this, we trained, validated, and tested the model for various hepatic lesions, including necrosis, inflammation, infiltration, and portal triad. We confirmed the model performance at the whole-slide image (WSI) level. The training, validating, and testing processes, which were performed using tile images, yielded an overall model accuracy of 96.44%. For confirmation, we compared the model's predictions for 25 WSIs at 20× magnification with annotated lesion areas determined by an accredited toxicologic pathologist. In individual WSIs, the expert-annotated lesion areas of necrosis, inflammation, and infiltration tended to be comparable with the values predicted by the algorithm. The overall predictions showed a high correlation with the annotated area. The R square values were 0.9953, 0.9610, and 0.9445 for necrosis, inflammation plus infiltration, and portal triad, respectively. The present study shows that the Mask R-CNN algorithm is a useful tool for detecting and predicting hepatic lesions in non-clinical studies. This new algorithm might be widely useful for predicting liver lesions in non-clinical and clinical settings.
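The R square values quoted above measure how well predicted lesion areas track the pathologist's annotations; the coefficient of determination is computed as:

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot, where SS_res is
    the residual sum of squares against predictions and SS_tot the total
    sum of squares around the mean of the true values."""
    mean = sum(y_true) / len(y_true)
    ss_tot = sum((y - mean) ** 2 for y in y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return 1.0 - ss_res / ss_tot
```

A value of 0.9953, as reported for necrosis, means the model's per-slide area predictions leave under 0.5% of the variance in the annotated areas unexplained.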

14.
Ann Transl Med ; 10(10): 546, 2022 May.
Article in English | MEDLINE | ID: mdl-35722438

ABSTRACT

Background: Laparoscopic surgery has been in great demand over the past decades; it has also brought several obstacles, such as increasing difficulty in maintaining hemostasis, changes in surgical approach, and a reduced field of vision. Locating the bleeding point can help surgeons control bleeding quickly; however, to date, there have been no tools designed for automatic bleeding tracking in laparoscopic operations. Herein, we propose a spatiotemporal hybrid model based on a faster region-based convolutional neural network (RCNN) for bleeding point detection in laparoscopic surgery videos. Methods: Laparoscopic videos performed at our hospital were retrieved and images containing bleeding events were extracted. Spatiotemporal features were extracted using red-green-blue (RGB) frames and optical flow maps, and a spatiotemporal hybrid model was developed based on the faster RCNN. The proposed model contributes to (I) providing real-time bleeding point detection, which directly assists surgeons, (II) showing the blood's optical flow, which improves bleeding point detection, and (III) detecting both arterial and venous bleeding. Results: In this study, 12 different bleeding videos were included for deep learning model training. Compared with models using a single RGB frame or a single optical flow map, our model combining RGB and optical flow achieved good detection results (precision rate of 0.8373, recall rate of 0.8034, and average precision of 0.6818). Conclusions: Our approach performs well in bleeding point location and recognition, indicating its potential value in helping to maintain and re-establish hemostasis during operations.

15.
Comput Struct Biotechnol J ; 20: 2372-2380, 2022.
Article in English | MEDLINE | ID: mdl-35664223

ABSTRACT

Poor efficacy of some anthelmintics and rising concerns about the widespread drug resistance have highlighted the need for new drug discovery. The parasitic nematode Haemonchus contortus is an important model organism widely used for studies of drug resistance and drug screening with the current gold standard being the motility assay. We applied a deep learning approach Mask R-CNN for analysing motility videos containing varying rates of motile worms and compared it to other commonly used algorithms with different levels of complexity, namely the Wiggle Index and the Wide Field-of-View Nematode Tracking Platform. Mask R-CNN consistently outperformed the other algorithms in terms of the detection of worms as well as the precision of motility forecasts, having a mean absolute percentage error of 7.6% and a mean absolute error of 5.6% for the detection and motility forecasts, respectively. Using Mask R-CNN for motility assays confirmed the common problem with algorithms that use non-maximum suppression in detecting overlapping objects, which negatively impacts the overall precision. The use of intersect over union as a measure of the classification of motile / non-motile instances had an overall accuracy of 89%, indicating that it is a viable alternative to previously used methods based on movement characteristics, such as body bends. In comparison to the existing methods evaluated here, Mask R-CNN performed better and we anticipate that this method will broaden the number of possible approaches to video analysis of worm motility.
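The non-maximum suppression (NMS) behaviour discussed above, where heavily overlapping worms collapse into a single detection, can be reproduced with the standard greedy algorithm (illustrative sketch, not the Mask R-CNN implementation used in the study):

```python
def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression over axis-aligned boxes
    (x1, y1, x2, y2); returns indices of kept boxes in score order.
    Any box whose IoU with an already-kept box exceeds the threshold is
    discarded, which is exactly how two overlapping worms become one."""
    def iou(a, b):
        ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
        iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
        inter = ix * iy
        area_a = (a[2] - a[0]) * (a[3] - a[1])
        area_b = (b[2] - b[0]) * (b[3] - b[1])
        return inter / (area_a + area_b - inter)

    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
    return keep
```

In the example below, the second box overlaps the first with IoU ≈ 0.68 and is suppressed, while the distant third box survives.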

16.
Int Ophthalmol ; 42(10): 3061-3070, 2022 Oct.
Article in English | MEDLINE | ID: mdl-35381895

ABSTRACT

PURPOSE: The proposed deep learning model with a mask region-based convolutional neural network (Mask R-CNN) can predict choroidal thickness automatically. Changes in choroidal thickness with age can be detected with manual measurements. In this study, we aimed to investigate choroidal thickness comprehensively in healthy eyes by utilizing the Mask R-CNN model. METHODS: A total of 68 eyes from 57 participants without significant ocular disease were recruited. The participants were allocated to one of three groups according to their age and underwent spectral domain optical coherence tomography (SD-OCT) or enhanced depth imaging OCT (EDI-OCT) centered on the fovea. Each OCT sequence included 25 slices. Physicians labeled the choroidal contours in all the OCT sequences. We applied the Mask R-CNN model for automatic segmentation. Comparisons of choroidal thicknesses were conducted according to age and prediction accuracy. RESULTS: Older age groups had thinner choroids, according to the automatic segmentation results; the mean choroidal thickness was 253.7 ± 41.9 µm in the youngest group, 206.8 ± 35.4 µm in the middle-aged group, and 152.5 ± 45.7 µm in the oldest group (p < 0.01). Measurements obtained using physician sketches demonstrated similar trends. We observed a significant negative correlation between choroidal thickness and age (p < 0.01). The prediction error was lower and less variable in choroids that were thinner than the cutoff point of 280 µm. CONCLUSION: By observing the choroid layer continuously and comprehensively, we found that the mean choroidal thickness decreased with age in healthy subjects. The Mask R-CNN model can accurately predict choroidal thickness, especially in choroids thinner than 280 µm. This model can enable exploring larger and more varied choroid datasets comprehensively, automatically, and conveniently.


Subject(s)
Deep Learning , Aged , Choroid , Fovea Centralis , Healthy Volunteers , Humans , Middle Aged , Tomography, Optical Coherence/methods
17.
Sensors (Basel) ; 23(1)2022 Dec 24.
Article in English | MEDLINE | ID: mdl-36616773

ABSTRACT

Abdominal aortic aneurysm (AAA) is a fatal clinical condition with high mortality. Computed tomography angiography (CTA) imaging is the preferred minimally invasive modality for the long-term postoperative observation of AAA. Accurate segmentation of the thrombus region of interest (ROI) in a postoperative CTA image volume is essential for quantitative assessment and rapid clinical decision making by clinicians. Few investigators have proposed the adoption of convolutional neural networks (CNN). Although these methods demonstrated the potential of CNN architectures by automating the thrombus ROI segmentation, the segmentation performance can be further improved. The existing methods performed the segmentation process independently per 2D image and were incapable of using adjacent images, which could be useful for the robust segmentation of thrombus ROIs. In this work, we propose a thrombus ROI segmentation method to utilize not only the spatial features of a target image, but also the volumetric coherence available from adjacent images. We newly adopted a recurrent neural network, bi-directional convolutional long short-term memory (Bi-CLSTM) architecture, which can learn coherence between a sequence of data. This coherence learning capability can be useful for challenging situations, for example, when the target image exhibits inherent postoperative artifacts and noises, the inclusion of adjacent images would facilitate learning more robust features for thrombus ROI segmentation. We demonstrate the segmentation capability of our Bi-CLSTM-based method with a comparison of the existing 2D-based thrombus ROI segmentation counterpart as well as other established 2D- and 3D-based alternatives. Our comparison is based on a large-scale clinical dataset of 60 patient studies (i.e., 60 CTA image volumes). 
The results suggest the superior segmentation performance of our Bi-CLSTM-based method by achieving the highest scores of the evaluation metrics, e.g., our Bi-CLSTM results were 0.0331 higher on total overlap and 0.0331 lower on false negative when compared to 2D U-net++ as the second-best.


Subject(s)
Computed Tomography Angiography , Thrombosis , Humans , Computed Tomography Angiography/methods , Memory, Short-Term , Tomography, X-Ray Computed , Neural Networks, Computer , Thrombosis/diagnostic imaging , Image Processing, Computer-Assisted/methods
18.
Sensors (Basel) ; 21(13)2021 Jun 26.
Article in English | MEDLINE | ID: mdl-34206768

ABSTRACT

This research investigated real-time fingertip detection in frames captured from the increasingly popular wearable device, smart glasses. The egocentric-view fingertip detection and character recognition can be used to create a novel way of inputting texts. We first employed Unity3D to build a synthetic dataset with pointing gestures from the first-person perspective. The obvious benefits of using synthetic data are that they eliminate the need for time-consuming and error-prone manual labeling and they provide a large and high-quality dataset for a wide range of purposes. Following that, a modified Mask Regional Convolutional Neural Network (Mask R-CNN) is proposed, consisting of a region-based CNN for finger detection and a three-layer CNN for fingertip location. The process can be completed in 25 ms per frame for 640×480 RGB images, with an average error of 8.3 pixels. The speed is high enough to enable real-time "air-writing", where users are able to write characters in the air to input texts or commands while wearing smart glasses. The characters can be recognized by a ResNet-based CNN from the fingertip trajectories. Experimental results demonstrate the feasibility of this novel methodology.


Subject(s)
Gestures , Neural Networks, Computer , Humans , Writing
19.
Ann Transl Med ; 9(24): 1768, 2021 Dec.
Article in English | MEDLINE | ID: mdl-35071462

ABSTRACT

BACKGROUND: Liver segmentation in computed tomography (CT) imaging has been widely investigated as a crucial step for analyzing liver characteristics and diagnosing liver diseases. However, obtaining satisfactory liver segmentation performance is highly challenging because of the poor contrast between the liver and its surrounding organs and tissues, the high levels of CT image noise, and the wide variability in liver shapes among patients. METHODS: To overcome these challenges, we propose a novel method for liver segmentation in CT image sequences. This method uses an enhanced mask region-based convolutional neural network (Mask R-CNN) with graph-cut segmentation. Specifically, the k-nearest neighbor (k-NN) algorithm is employed to cluster the target liver pixels in order to obtain an appropriate aspect ratio. Anchors are then adapted to the liver size using this ratio information, so that high-accuracy liver localization can be achieved using the anchors and rotation-invariant object recognition. Next, a fully convolutional network (FCN) is used to segment the foreground objects, and local fine-grained liver detection is realized by pixel prediction. Finally, a whole-liver mask is obtained by the Mask R-CNN proposed in this paper. RESULTS: Our proposed Mask R-CNN algorithm achieved superior performance compared with conventional Mask R-CNN algorithms in terms of the Dice similarity coefficient (DSC) and the Medical Image Computing and Computer-Assisted Intervention (MICCAI) metrics. CONCLUSIONS: Our experimental results demonstrate that the improved Mask R-CNN architecture has good performance, accuracy, and robustness for liver segmentation in CT image sequences.
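The aspect-ratio-adapted anchors described in the methods section can be illustrated with a short sketch. Here the ratio is estimated from the bounding box of clustered liver pixels and then used to shape anchors of constant area; the function names and the constant-area rule are assumptions, not the paper's exact procedure.

```python
import numpy as np

def liver_anchor_ratio(pixel_coords):
    """Estimate a width/height aspect ratio from clustered liver
    pixel coordinates given as (row, col) pairs."""
    coords = np.asarray(pixel_coords)
    rows, cols = coords[:, 0], coords[:, 1]
    height = rows.max() - rows.min() + 1
    width = cols.max() - cols.min() + 1
    return width / height

def anchor_sizes(base=64, ratio=1.0, scales=(1, 2, 4)):
    """Generate (width, height) anchors of constant area per scale,
    shaped by the estimated ratio so region proposals roughly match
    the liver's extent."""
    out = []
    for s in scales:
        area = (base * s) ** 2
        h = (area / ratio) ** 0.5
        out.append((h * ratio, h))
    return out
```

This mirrors the standard Mask R-CNN anchor parameterization, where each anchor is defined by a scale and an aspect ratio.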

20.
Ophthalmol Sci ; 1(4): 100060, 2021 Dec.
Article in English | MEDLINE | ID: mdl-36246938

ABSTRACT

Purpose: Retinal toxicity resulting from hydroxychloroquine use manifests as photoreceptor loss and disruption of the ellipsoid zone (EZ) reflectivity band detectable on spectral-domain (SD) OCT imaging. This study investigated whether an automatic deep learning-based algorithm can detect and quantitate EZ loss on SD OCT images with an accuracy comparable with that of human annotations. Design: Retrospective analysis of data acquired in a prospective, single-center, case-control study. Participants: Eighty-five patients (168 eyes) who were long-term hydroxychloroquine users (average exposure time, 14 ± 7.2 years). Methods: A mask region-based convolutional neural network (M-RCNN) was implemented and trained on individual OCT B-scans. Scan-by-scan detections were aggregated to produce an en face map of EZ loss per 3-dimensional SD OCT volume image. To improve the accuracy and robustness of the EZ loss map, a dual network architecture was proposed that learns to detect EZ loss in parallel using horizontal (horizontal mask region-based convolutional neural network [M-RCNNH]) and vertical (vertical mask region-based convolutional neural network [M-RCNNV]) B-scans independently. To quantify accuracy, 10-fold cross-validation was performed. Main Outcome Measures: Precision, recall, intersection over union (IOU), F1-score metrics, and measured total EZ loss area were compared against human grader annotations and against the determination of toxicity based on the recommended screening guidelines. Results: The combined projection network demonstrated the best overall performance: precision, 0.90 ± 0.09; recall, 0.88 ± 0.08; and F1 score, 0.89 ± 0.07. The combined model outperformed both the M-RCNNH-only model (precision, 0.79 ± 0.17; recall, 0.96 ± 0.04; IOU, 0.78 ± 0.15; and F1 score, 0.86 ± 0.12) and the M-RCNNV-only model (precision, 0.71 ± 0.21; recall, 0.94 ± 0.06; IOU, 0.69 ± 0.21; and F1 score, 0.79 ± 0.16).
The accuracy was comparable with the variability among human experts: precision, 0.85 ± 0.09; recall, 0.98 ± 0.01; IOU, 0.82 ± 0.12; and F1 score, 0.91 ± 0.06. Automatically generated en face EZ loss maps provide quantitative SD OCT metrics for accurate toxicity determination when combined with other functional testing. Conclusions: The algorithm can provide a fast, objective, automatic method for measuring areas with EZ loss and can serve as a quantitative assistance tool to screen patients for the presence and extent of toxicity.
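The dual-network design produces two en face maps, one from horizontal and one from vertical B-scans, that must be fused into a single EZ-loss mask. A minimal sketch of such a fusion step follows; averaging the two probability maps before thresholding is an assumed rule, since the abstract does not specify the exact projection scheme.

```python
import numpy as np

def combine_enface_maps(map_h, map_v, threshold=0.5):
    """Fuse EZ-loss probability maps from horizontal (M-RCNNH) and
    vertical (M-RCNNV) B-scan detections into one en face mask by
    averaging and thresholding."""
    fused = (np.asarray(map_h, float) + np.asarray(map_v, float)) / 2.0
    return fused >= threshold
```

Requiring agreement between two orthogonal scan directions is one plausible reason the combined model's precision exceeds that of either single-direction model.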
