Results 1 - 20 of 82
1.
Comput Methods Programs Biomed ; 254: 108259, 2024 Jun 06.
Article in English | MEDLINE | ID: mdl-38865795

ABSTRACT

BACKGROUND AND OBJECTIVE: Alzheimer's disease (AD) is a dreaded degenerative disease that results in a profound decline in human cognition and memory. Due to its intricate pathogenesis and the lack of effective therapeutic interventions, early diagnosis plays a paramount role in AD. Recent neuroimaging research has shown that applying deep learning methods to multimodal neuroimages can effectively detect AD. However, these methods only concatenate and fuse the high-level features extracted from different modalities, ignoring the fusion and interaction of low-level features across modalities. This leads to unsatisfactory classification performance. METHOD: In this paper, we propose a novel multi-scale attention and cross-enhanced fusion network, MACFNet, which enables the interaction of multi-stage low-level features between inputs to learn shared feature representations. We first construct a novel Cross-Enhanced Fusion Module (CEFM), which fuses low-level features from different modalities through a multi-stage cross-structure. In addition, an Efficient Spatial Channel Attention (ECSA) module is proposed, which is able to focus on important AD-related features in images more efficiently and achieve feature enhancement across modalities through two-stage residual concatenation. Finally, we also propose a multi-scale attention guiding block (MSAG) based on dilated convolution, which obtains rich receptive fields without increasing model parameters or computation, effectively improving the efficiency of multi-scale feature extraction. RESULTS: Experiments on the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset demonstrate that MACFNet has better classification performance than existing multimodal methods, with classification accuracies of 99.59%, 98.85%, 99.61%, and 98.23% for AD vs. CN, AD vs. MCI, CN vs. MCI, and AD vs. CN vs. MCI, respectively; specificities of 98.92%, 97.07%, 99.58%, and 99.04%; and sensitivities of 99.91%, 99.89%, 99.63%, and 97.75%, respectively. CONCLUSIONS: The proposed MACFNet is a high-accuracy multimodal AD diagnostic framework. Through its cross mechanism and efficient attention, MACFNet makes full use of the low-level features of different modal medical images and effectively attends to both local and global information in the images. This work provides a valuable reference for multimodal AD diagnosis.
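
For readers unfamiliar with the dilated-convolution idea underlying the MSAG block, the sketch below shows, in PyTorch, how parallel 3x3 convolutions with growing dilation rates enlarge the receptive field without adding parameters per branch; the branch count, dilation rates, and residual summation here are illustrative assumptions, not the MACFNet design.

```python
# Minimal sketch of a multi-scale dilated-convolution block (assumed design,
# for illustration only -- not the MSAG module from the paper above).
import torch
import torch.nn as nn

class MultiScaleDilatedBlock(nn.Module):
    def __init__(self, channels: int, dilations=(1, 2, 4)):
        super().__init__()
        # Parallel 3x3 convolutions: the kernel size (and hence the parameter
        # count per branch) stays fixed while the receptive field grows with
        # the dilation rate.
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, kernel_size=3,
                      padding=d, dilation=d, bias=False)
            for d in dilations
        ])
        self.bn = nn.BatchNorm2d(channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Sum the multi-scale responses and keep a residual connection.
        out = sum(branch(x) for branch in self.branches)
        return self.act(self.bn(out) + x)

if __name__ == "__main__":
    feats = torch.randn(1, 32, 64, 64)                # dummy feature map
    print(MultiScaleDilatedBlock(32)(feats).shape)    # torch.Size([1, 32, 64, 64])
```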

2.
Front Surg ; 11: 1389244, 2024.
Article in English | MEDLINE | ID: mdl-38903864

ABSTRACT

Background: Surgical robots are gaining increasing popularity because of their capability to improve the precision of pedicle screw placement. However, current surgical robots rely on unimodal computed tomography (CT) images as baseline images, limiting their visualization to vertebral bone structures and excluding soft tissue structures such as intervertebral discs and nerves. This inherent limitation significantly restricts the applicability of surgical robots. To address this issue and further enhance the safety and accuracy of robot-assisted pedicle screw placement, this study will develop a software system for surgical robots based on multimodal image fusion. Such a system can extend the application range of surgical robots to surgical channel establishment, nerve decompression, and other related operations. Methods: Initially, imaging data of the patients included in the study are collected. Professional workstations are employed to establish, train, validate, and optimize algorithms for vertebral bone segmentation in CT and magnetic resonance (MR) images, intervertebral disc segmentation in MR images, nerve segmentation in MR images, and registration and fusion of CT and MR images. Subsequently, a spine application model containing independent modules for vertebrae, intervertebral discs, and nerves is constructed, and a software system for surgical robots based on multimodal image fusion is designed. Finally, the software system is clinically validated. Discussion: We will develop a software system based on multimodal image fusion for surgical robots, which can be applied not only to robot-assisted screw placement but also to surgical access establishment, nerve decompression, and other operations. The development of this software system is important. First, it can improve the accuracy of pedicle screw placement, percutaneous vertebroplasty, percutaneous kyphoplasty, and other surgeries. Second, it can reduce the number of fluoroscopies, shorten the operation time, and reduce surgical complications. In addition, it will help expand the application range of surgical robots by providing the key imaging data needed to realize surgical channel establishment, nerve decompression, and other operations.

3.
J Craniomaxillofac Surg ; 52(5): 659-665, 2024 May.
Article in English | MEDLINE | ID: mdl-38580555

ABSTRACT

Precise recognition of the intraparotid facial nerve (IFN) is crucial during parotid tumor resection. We aimed to explore the application effect of direct visualization of the IFN in parotid tumor resection. Fifteen patients with parotid tumors were enrolled in this study and underwent specific radiological scanning in which the IFNs were displayed as high-intensity images. After image segmentation, the IFN could be directly visualized preoperatively. Mixed reality combined with surgical navigation was applied to intraoperatively visualize the segmentation results as real-time three-dimensional holograms, guiding the surgeons in IFN dissection and tumor resection. Radiological visibility of the IFN, accuracy of image segmentation, and postoperative facial nerve function were analyzed. The trunks of the IFN were directly visible in the radiological images of all patients. Of 37 landmark points on the IFN, 36 were accurately segmented. Four patients were classified as House-Brackmann Grade I postoperatively. Two patients with malignancies had long-standing postoperative facial paralysis. Direct visualization of the IFN was a feasible novel method with high accuracy that could assist in recognition of the IFN and therefore potentially improve the treatment outcome of parotid tumor resection.


Subject(s)
Facial Nerve , Parotid Neoplasms , Humans , Parotid Neoplasms/surgery , Parotid Neoplasms/diagnostic imaging , Facial Nerve/diagnostic imaging , Female , Male , Middle Aged , Adult , Aged , Imaging, Three-Dimensional/methods , Surgery, Computer-Assisted/methods , Tomography, X-Ray Computed/methods , Parotid Gland/surgery , Parotid Gland/diagnostic imaging , Young Adult
4.
R Soc Open Sci ; 11(4)2024 Apr.
Article in English | MEDLINE | ID: mdl-38601031

ABSTRACT

With the rapid development of medical imaging methods, multimodal medical image fusion techniques have caught the interest of researchers. The aim is to preserve information from diverse sensors using various models to generate a single informative image. The main challenge is to derive a trade-off between the spatial and spectral qualities of the resulting fused image and the computing efficiency. This article proposes a fast and reliable method for medical image fusion based on a multilevel guided edge-preserving filtering (MLGEPF) decomposition rule. First, each multimodal medical image is divided into three sublayer categories using the MLGEPF decomposition scheme: a small-scale component, a large-scale component, and a background component. Secondly, two fusion strategies, a pulse-coupled neural network based on the structure tensor and a maximum-based rule, are applied to combine the three types of layers, based on the layers' various properties. The three different types of fused sublayers are combined to create the fused image at the end. A total of 40 pairs of brain images from four separate categories of medical conditions were tested in experiments. The image pairs include various case studies involving magnetic resonance imaging (MRI), TITc, single-photon emission computed tomography (SPECT), and positron emission tomography (PET). We include qualitative analysis to demonstrate that the visual contrast between the structure and the surrounding tissue is increased by our proposed method. To further enhance the visual comparison, we asked a group of observers to compare our method's outputs with other methods and score them. Overall, our proposed fusion scheme increased the visual contrast and received positive subjective reviews. Moreover, objective assessment indicators for each category of medical conditions are also included. Our method achieves high evaluation outcomes on the feature mutual information (FMI), sum of correlation of differences (SCD), Qabf, and Qy indexes. This implies that our fusion algorithm has better performance in information preservation and efficient structural and visual transfer.
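
As a rough illustration of how an edge-preserving multilevel decomposition can split an image into background, large-scale, and small-scale layers, here is a hedged sketch using OpenCV's guided filter (opencv-contrib-python); the filter radii, epsilon values, and two-level scheme are assumptions for illustration, not the MLGEPF parameters used in the article.

```python
# Sketch of a two-level edge-preserving decomposition into small-scale,
# large-scale, and background layers (illustrative parameters only).
import cv2
import numpy as np

def three_layer_decomposition(img: np.ndarray):
    img = img.astype(np.float32) / 255.0
    # Strong smoothing -> background component.
    background = cv2.ximgproc.guidedFilter(guide=img, src=img, radius=16, eps=1e-2)
    # Milder smoothing -> background plus large-scale structure.
    coarse = cv2.ximgproc.guidedFilter(guide=img, src=img, radius=4, eps=1e-3)
    large_scale = coarse - background      # mid-frequency structure
    small_scale = img - coarse             # fine detail / texture
    return small_scale, large_scale, background

if __name__ == "__main__":
    # Synthetic stand-in for a real MR or CT slice.
    slice_img = (np.random.rand(256, 256) * 255).astype(np.uint8)
    s, l, b = three_layer_decomposition(slice_img)
    # Sanity check: the three layers sum back to the normalized input.
    print(np.allclose(s + l + b, slice_img.astype(np.float32) / 255.0, atol=1e-5))
```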

5.
Arch. Soc. Esp. Oftalmol ; 99(4): 165-168, abr. 2024. ilus
Article in Spanish | IBECS | ID: ibc-232137

ABSTRACT

Intrachoroidal cavitation is a finding identified with OCT, initially described in myopic patients but also seen in non-myopic patients. It can occur both in the peripapillary area and at the posterior pole. Macular coloboma is a defect of embryonic development of the posterior pole; on structural OCT, the absence of the retinal pigment epithelium and of the choroidal vessels is essential for its diagnosis. This case presents an intrachoroidal cavitation circumscribing a macular coloboma, in the absence of an intercalary membrane. The en face image allows assessment of the relationship between the two structures as well as their extent.


Subject(s)
Humans , Coloboma , Tomography , Myopia, Degenerative , Cavitation , Ophthalmology
6.
Sensors (Basel) ; 24(7)2024 Apr 03.
Article in English | MEDLINE | ID: mdl-38610501

ABSTRACT

Multimodal sensors capture and integrate diverse characteristics of a scene to maximize information gain. In optics, this may involve capturing intensity in specific spectra or polarization states to determine factors such as material properties or an individual's health conditions. Combining multimodal camera data with shape data from 3D sensors is a challenging issue. Multimodal cameras, e.g., hyperspectral cameras, or cameras outside the visible light spectrum, e.g., thermal cameras, fall far short of state-of-the-art photo cameras in resolution and image quality. In this article, a new method is demonstrated for superimposing multimodal image data onto a 3D model created by multi-view photogrammetry. While a high-resolution photo camera captures a set of images from varying view angles to reconstruct a detailed 3D model of the scene, low-resolution multimodal camera(s) simultaneously record the scene. All cameras are pre-calibrated and rigidly mounted on a rig, i.e., their imaging properties and relative positions are known. The method was realized in a laboratory setup consisting of a professional photo camera, a thermal camera, and a 12-channel multispectral camera. In our experiments, an accuracy better than one pixel was achieved for the data fusion using multimodal superimposition. Finally, application examples of multimodal 3D digitization are demonstrated, and further steps toward system realization are discussed.
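
To make the superimposition step concrete, the sketch below projects 3D model points into a rigidly mounted, pre-calibrated low-resolution camera (e.g., the thermal camera) with OpenCV and samples one value per point; the intrinsics, extrinsics, and data are placeholders, and visibility/occlusion handling is omitted.

```python
# Hedged sketch: texturing 3D points with values from a pre-calibrated
# modality camera. Calibration values below are placeholders, not rig data.
import cv2
import numpy as np

def sample_modality(points_3d, modality_img, K, dist, rvec, tvec):
    """Project 3D points into the modality camera and sample pixel values."""
    pts_2d, _ = cv2.projectPoints(points_3d, rvec, tvec, K, dist)
    pts_2d = pts_2d.reshape(-1, 2)
    h, w = modality_img.shape[:2]
    u = np.clip(np.round(pts_2d[:, 0]).astype(int), 0, w - 1)
    v = np.clip(np.round(pts_2d[:, 1]).astype(int), 0, h - 1)
    return modality_img[v, u]              # one sampled value per 3D point

if __name__ == "__main__":
    K = np.array([[400.0, 0, 160], [0, 400.0, 120], [0, 0, 1]])  # placeholder intrinsics
    dist = np.zeros(5)                                           # assume no distortion
    rvec, tvec = np.zeros(3), np.array([0.0, 0.0, 0.5])          # placeholder extrinsics
    verts = np.random.rand(1000, 3).astype(np.float32) - 0.5     # dummy model vertices
    thermal = np.random.rand(240, 320).astype(np.float32)        # dummy thermal frame
    print(sample_modality(verts, thermal, K, dist, rvec, tvec).shape)  # (1000,)
```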

7.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38483256

ABSTRACT

Numerous imaging techniques are available for observing and interrogating biological samples, and several of them can be used consecutively to enable correlative analysis of different image modalities with varying resolutions and the inclusion of structural or molecular information. Achieving accurate registration of multimodal images is essential for the correlative analysis process, but it remains a challenging computer vision task with no widely accepted solution. Moreover, supervised registration methods require annotated data produced by experts, which is limited. To address this challenge, we propose a general unsupervised pipeline for multimodal image registration using deep learning. We provide a comprehensive evaluation of the proposed pipeline versus the current state-of-the-art image registration and style transfer methods on four types of biological problems utilizing different microscopy modalities. We found that style transfer of modality domains paired with fully unsupervised training leads to comparable image registration accuracy to supervised methods and, most importantly, does not require human intervention.


Subject(s)
Deep Learning , Humans , Microscopy
8.
Chin J Traumatol ; 2024 Mar 16.
Article in English | MEDLINE | ID: mdl-38548574

ABSTRACT

PURPOSE: Traditional craniotomy (TC) surgery has failed to show benefits for the functional outcome of intracerebral hemorrhage (ICH). A minimally invasive hematoma removal approach that avoids white matter fiber damage may be a safer and more feasible surgical option and may improve the prognosis of ICH. We conducted a historical cohort study on the use of multimodal image fusion-assisted neuroendoscopic surgery (MINS) for the treatment of ICH and compared its safety and effectiveness with the traditional method. METHODS: This historical cohort study involved 241 patients with cerebral hemorrhage, divided into a MINS group and a TC group based on the surgical method. In the MINS group, multimodal images (CT of the skull, CT angiography, and white matter fibers from MRI diffusion-tensor imaging) were fused into 3-dimensional images for preoperative planning and intraoperative guidance of endoscopic hematoma removal. Clinical features, operative efficiency, perioperative complications, and prognoses of the 2 groups were compared. Normally distributed data were analyzed using the 2-independent-samples t-test, non-normally distributed data were compared using the Kruskal-Wallis test, and categorical data were analyzed via the Chi-square test or Fisher's exact test. All statistical tests were two-sided, and p < 0.05 was considered statistically significant. RESULTS: A total of 42 patients with ICH were enrolled, who underwent TC surgery or MINS. Patients who underwent MINS had a shorter operative time (p < 0.001), less blood loss (p < 0.001), better hematoma evacuation (p = 0.003), and a shorter stay in the intensive care unit (p = 0.002) than patients who underwent TC. Based on the clinical characteristics and the analysis of perioperative complications, there was no significant difference between the 2 surgical methods. Modified Rankin scale scores at 180 days were better in the MINS group than in the TC group (p = 0.014). CONCLUSIONS: Compared with TC for the treatment of ICH, MINS is safer and more efficient at evacuating the hematoma, which improved the prognosis of the patients. In the future, a clinical trial with a larger sample size will be needed to evaluate its efficacy.
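
For reference, the statistical comparisons named in the Methods can be reproduced with SciPy roughly as sketched below; the arrays are synthetic placeholders, not study data.

```python
# Hedged sketch of the statistical tests listed above, on synthetic data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
mins_op_time = rng.normal(120, 20, size=21)   # hypothetical operative times (min)
tc_op_time = rng.normal(180, 30, size=21)

# Normally distributed continuous data: two-independent-samples t-test.
t, p_t = stats.ttest_ind(mins_op_time, tc_op_time)

# Non-normally distributed data: Kruskal-Wallis test (as stated in the abstract).
h, p_kw = stats.kruskal(mins_op_time, tc_op_time)

# Categorical data (e.g., complication counts): chi-square or Fisher's exact test.
table = np.array([[3, 18], [7, 14]])          # hypothetical 2x2 contingency table
chi2, p_chi, dof, expected = stats.chi2_contingency(table)
odds, p_fisher = stats.fisher_exact(table)

print(f"t-test p={p_t:.3f}, Kruskal-Wallis p={p_kw:.3f}, "
      f"chi-square p={p_chi:.3f}, Fisher p={p_fisher:.3f}")
```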

9.
BMC Med Inform Decis Mak ; 24(1): 65, 2024 Mar 05.
Article in English | MEDLINE | ID: mdl-38443881

ABSTRACT

BACKGROUND: Multimodal histology image registration is a process that transforms two or more images obtained from different microscopy modalities into a common coordinate system. The combination of information from various modalities can contribute to a comprehensive understanding of tissue specimens, aiding in more accurate diagnoses and improved research insights. Multimodal image registration in histology samples presents a significant challenge due to the inherent differences in characteristics and the need for tailored optimization algorithms for each modality. RESULTS: We developed MMIR, a cloud-based system for multimodal histological image registration, which consists of three main modules: a project manager, an algorithm manager, and an image visualization system. CONCLUSION: Our software solution aims to simplify image registration tasks with a user-friendly approach. It facilitates effective algorithm management, provides responsive web interfaces, supports multi-resolution images, and enables batch image registration. Moreover, its adaptable architecture allows for the integration of custom algorithms, ensuring that it aligns with the specific requirements of each modality combination. Beyond image registration, our software enables the conversion of segmented annotations from one modality to another.


Subject(s)
Algorithms , Software , Humans
10.
Arch Soc Esp Oftalmol (Engl Ed) ; 99(4): 165-168, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38309662

ABSTRACT

Intrachoroidal cavitation is a finding identified with OCT, initially described in myopic patients but also seen in non-myopic patients. It can occur both in the peripapillary area and at the posterior pole. Macular coloboma is a defect of embryonic development of the posterior pole; on structural OCT, the absence of the retinal pigment epithelium and choroidal vessels is essential for its diagnosis. In this case, an intrachoroidal cavitation circumscribes the macular coloboma, in the absence of an intercalary membrane. The en face image allows us to assess the relationship between the two structures as well as their extent.


Subject(s)
Choroid Diseases , Coloboma , Macula Lutea/abnormalities , Myopia , Humans , Choroid/diagnostic imaging , Coloboma/diagnostic imaging , Choroid Diseases/diagnostic imaging
11.
J Imaging Inform Med ; 37(2): 575-588, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38343225

ABSTRACT

Accurate delineation of the clinical target volume (CTV) is a crucial prerequisite for safe and effective radiotherapy. This study addresses the integration of magnetic resonance (MR) images to aid in target delineation on computed tomography (CT) images. However, obtaining MR images directly can be challenging. Therefore, we employ AI-based image generation techniques to "intelligently generate" MR images from CT images to improve CTV delineation based on CT images. To generate high-quality MR images, we propose an attention-guided single-loop image generation model. The model can yield higher-quality images by introducing an attention mechanism in feature extraction and enhancing the loss function. Based on the generated MR images, we propose a CTV segmentation model that fuses multi-scale features through image fusion and a hollow space pyramid module to enhance segmentation accuracy. The image generation model used in this study improves the peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) from 14.87 and 0.58 to 16.72 and 0.67, respectively, and reduces the feature distribution distance and learning-perception image similarity from 180.86 and 0.28 to 110.98 and 0.22, achieving higher-quality image generation. The proposed segmentation method demonstrates high accuracy: compared with the FCN method, the intersection-over-union ratio and the Dice coefficient are improved from 0.8360 and 0.8998 to 0.9043 and 0.9473, respectively, and the Hausdorff distance and mean surface distance decreased from 5.5573 mm and 2.3269 mm to 4.7204 mm and 0.9397 mm, respectively, achieving clinically acceptable segmentation accuracy. Our method might reduce physicians' manual workload and accelerate the diagnosis and treatment process while decreasing inter-observer variability in identifying anatomical structures.
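
The image-quality figures quoted above (PSNR, SSIM) can be computed with scikit-image as in the minimal sketch below; the random arrays stand in for a generated MR slice and its reference.

```python
# Hedged sketch: PSNR and SSIM for a generated image against a reference.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

rng = np.random.default_rng(42)
reference = rng.random((256, 256)).astype(np.float32)   # stand-in "real" MR slice
generated = np.clip(reference + 0.05 * rng.standard_normal((256, 256)), 0, 1).astype(np.float32)

psnr = peak_signal_noise_ratio(reference, generated, data_range=1.0)
ssim = structural_similarity(reference, generated, data_range=1.0)
print(f"PSNR = {psnr:.2f} dB, SSIM = {ssim:.3f}")
```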

12.
Anal Chim Acta ; 1283: 341969, 2023 Dec 01.
Article in English | MEDLINE | ID: mdl-37977791

ABSTRACT

The integration of matrix-assisted laser desorption/ionization mass spectrometry imaging (MALDI MSI) and histology plays a pivotal role in advancing our understanding of complex heterogeneous tissues, providing a comprehensive description of biological tissue with both wide molecular coverage and high lateral resolution. Herein, we propose a novel strategy for the correction and registration of MALDI MSI data with hematoxylin & eosin (H&E) staining images. To overcome the discrepancy in spatial resolution between the two imaging modalities, a deep learning-based interpolation algorithm for MALDI MSI data was constructed, which enables spatial coherence and the subsequent orientation matching between images. Coupled with an affine transformation (AT) and a subsequent moving least squares algorithm, the two types of images from one rat brain tissue section were aligned automatically with high accuracy. Moreover, we demonstrated the practicality of the developed pipeline by applying it to a rat cerebral ischemia-reperfusion injury model, which would help decipher the link between molecular metabolism and pathological interpretation at the microregion level. This new approach offers the chance for other types of bioimaging to boost the field of multimodal image fusion.
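
As a simplified illustration of the affine-transformation step in such a registration pipeline, the sketch below estimates an affine mapping from matched landmarks with scikit-image and warps the MSI image into H&E coordinates; the landmark coordinates are made-up placeholders, and the moving least squares refinement described above is not reproduced.

```python
# Hedged sketch of landmark-based affine registration (illustrative data only).
import numpy as np
from skimage import transform

# Corresponding landmarks picked in the MSI ion image (src) and the H&E image (dst).
src = np.array([[10, 12], [200, 15], [190, 180], [20, 170]], dtype=float)
dst = np.array([[32, 40], [410, 55], [395, 390], [45, 370]], dtype=float)

affine = transform.estimate_transform("affine", src, dst)
print("Estimated affine matrix:\n", affine.params)

# Warp a (here synthetic) MSI intensity image into H&E coordinates.
msi_img = np.random.rand(200, 220)
registered = transform.warp(msi_img, affine.inverse, output_shape=(420, 440))
print(registered.shape)
```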


Subject(s)
Algorithms , Microscopy , Rats , Animals , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/methods , Staining and Labeling
13.
Eur J Radiol ; 169: 111189, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37939605

ABSTRACT

PURPOSE: The objective of this study was to analyze the effect of TMJ disc position on condylar bone remodeling after arthroscopic disc repositioning surgery. METHODS: Nine patients with anterior disc displacement without reduction (ADDWoR, 15 sides) who underwent arthroscopic disc repositioning surgery were included. Three-dimensional (3D) reconstruction of the articular disc and the condyle in the closed-mouth position was performed using cone-beam computed tomography (CBCT) and magnetic resonance imaging (MRI) data. Then, the CBCT and MRI images were fused and displayed together by multimodal image registration techniques. Morphological changes in the articular disc and condyle, as well as changes in their spatial relationship, were studied by comparing preoperative and 3-month postoperative CBCT-MRI fused images. RESULTS: The volume and superficial area of the articular disc, as well as the area of the articular disc surface in the subarticular cavity, were significantly increased compared with those before the surgical treatment (P < 0.01). There was also a significant increase in the volume of the condyle (P < 0.001). All condyles showed bone remodeling after surgery that could be categorized as one of two types depending on the position of the articular disc, suggesting that the location of the articular disc was related to the new bone formation. CONCLUSIONS: The morphology of the articular disc and condyle was significantly changed after arthroscopic disc repositioning surgery. The 3D changes in the position of the articular disc after surgery tended to have an effect on condylar bone remodeling and the location of new bone formation.
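
For readers who want to reproduce this kind of volume and surface measurement from a binary segmentation, a hedged sketch with scikit-image follows; the voxel spacing and the toy mask are placeholders, not patient data.

```python
# Hedged sketch: volume and surface area of a segmented structure.
import numpy as np
from skimage import measure

def volume_and_surface(mask: np.ndarray, spacing=(0.3, 0.3, 0.3)):
    """mask: 3D boolean array; spacing: voxel size in mm (placeholder value)."""
    volume_mm3 = mask.sum() * np.prod(spacing)
    verts, faces, _, _ = measure.marching_cubes(mask.astype(np.uint8),
                                                level=0.5, spacing=spacing)
    surface_mm2 = measure.mesh_surface_area(verts, faces)
    return volume_mm3, surface_mm2

if __name__ == "__main__":
    # Toy ellipsoidal "disc" inside a 64^3 volume.
    z, y, x = np.ogrid[:64, :64, :64]
    mask = ((x - 32) ** 2 / 400 + (y - 32) ** 2 / 225 + (z - 32) ** 2 / 100) <= 1
    print(volume_and_surface(mask))
```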


Subject(s)
Joint Dislocations , Temporomandibular Joint Disc , Humans , Temporomandibular Joint Disc/diagnostic imaging , Temporomandibular Joint Disc/surgery , Temporomandibular Joint Disc/pathology , Bone Remodeling , Bone and Bones , Magnetic Resonance Imaging/methods , Cone-Beam Computed Tomography , Joint Dislocations/pathology , Temporomandibular Joint , Mandibular Condyle
14.
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi ; 40(5): 1027-1032, 2023 Oct 25.
Article in Chinese | MEDLINE | ID: mdl-37879934

ABSTRACT

In recent years, the incidence of thyroid diseases has increased significantly and ultrasound examination is the first choice for the diagnosis of thyroid diseases. At the same time, the level of medical image analysis based on deep learning has been rapidly improved. Ultrasonic image analysis has made a series of milestone breakthroughs, and deep learning algorithms have shown strong performance in the field of medical image segmentation and classification. This article first elaborates on the application of deep learning algorithms in thyroid ultrasound image segmentation, feature extraction, and classification differentiation. Secondly, it summarizes the algorithms for deep learning processing multimodal ultrasound images. Finally, it points out the problems in thyroid ultrasound image diagnosis at the current stage and looks forward to future development directions. This study can promote the application of deep learning in clinical ultrasound image diagnosis of thyroid, and provide reference for doctors to diagnose thyroid disease.


Subject(s)
Deep Learning , Thyroid Diseases , Humans , Algorithms , Image Processing, Computer-Assisted/methods , Thyroid Diseases/diagnostic imaging , Ultrasonography
15.
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi ; 40(4): 736-742, 2023 Aug 25.
Article in Chinese | MEDLINE | ID: mdl-37666764

ABSTRACT

Electrocardiogram (ECG) signals are an important basis for the diagnosis of arrhythmia and myocardial infarction. In order to further improve the classification of arrhythmia and myocardial infarction, an ECG classification algorithm based on the Convolutional vision Transformer (CvT) and multimodal image fusion was proposed. Through the Gramian angular summation field (GASF), the Gramian angular difference field (GADF), and the recurrence plot (RP), the one-dimensional ECG signal was converted into three different modes of two-dimensional images and fused into a multimodal fusion image containing more features. The CvT-13 model can take into account both local and global information when processing the fused image, thus effectively improving the classification performance. On the MIT-BIH arrhythmia dataset and the PTB myocardial infarction dataset, the algorithm achieved a combined accuracy of 99.9% for the classification of five arrhythmias and 99.8% for the classification of myocardial infarction. The experiments show that this high-precision, computer-assisted intelligent classification method is superior and can effectively improve the diagnostic efficiency for arrhythmia, myocardial infarction, and other cardiac diseases.
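
The three 1D-to-2D encodings named above (GASF, GADF, recurrence plot) can be implemented directly with NumPy as in the sketch below; the ECG segment is synthetic and the recurrence threshold is an illustrative assumption.

```python
# Hedged sketch: GASF, GADF, and recurrence-plot encodings of a 1D signal.
import numpy as np

def gramian_angular_fields(x: np.ndarray):
    # Rescale to [-1, 1], then map samples to polar angles.
    x = 2 * (x - x.min()) / (x.max() - x.min()) - 1
    phi = np.arccos(np.clip(x, -1, 1))
    gasf = np.cos(phi[:, None] + phi[None, :])   # summation field
    gadf = np.sin(phi[:, None] - phi[None, :])   # difference field
    return gasf, gadf

def recurrence_plot(x: np.ndarray, eps: float = 0.1):
    dist = np.abs(x[:, None] - x[None, :])
    return (dist < eps).astype(np.float32)

if __name__ == "__main__":
    t = np.linspace(0, 1, 128)
    ecg_like = np.sin(2 * np.pi * 5 * t) * np.exp(-4 * t)   # toy heartbeat-like signal
    gasf, gadf = gramian_angular_fields(ecg_like)
    rp = recurrence_plot(ecg_like)
    # Stack as three channels, ready for an image classifier such as CvT.
    fused = np.stack([gasf, gadf, rp], axis=-1)
    print(fused.shape)   # (128, 128, 3)
```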


Subject(s)
Heart Diseases , Myocardial Infarction , Humans , Electrocardiography , Myocardial Infarction/diagnostic imaging , Algorithms , Electric Power Supplies
16.
Cell Rep Methods ; 3(10): 100595, 2023 Oct 23.
Article in English | MEDLINE | ID: mdl-37741277

ABSTRACT

Imaging mass cytometry (IMC) is a powerful technique capable of detecting over 30 markers on a single slide. It has been increasingly used for single-cell-based spatial phenotyping in a wide range of samples. However, it only acquires a rectangular field of view (FOV) of relatively small size and low image resolution, which hinders downstream analysis. Here, we report a highly practical dual-modality imaging method that combines high-resolution immunofluorescence (IF) and high-dimensional IMC on the same tissue slide. Our computational pipeline uses the whole-slide image (WSI) of IF as a spatial reference and integrates small-FOV IMC into a WSI of IMC. The high-resolution IF images enable accurate single-cell segmentation to extract robust high-dimensional IMC features for downstream analysis. We applied this method to esophageal adenocarcinoma of different stages, identified the single-cell pathology landscape via reconstruction of WSI IMC images, and demonstrated the advantage of the dual-modality imaging strategy.


Subject(s)
Adenocarcinoma , Barrett Esophagus , Esophageal Neoplasms , Humans , Barrett Esophagus/pathology , Esophageal Neoplasms/pathology , Adenocarcinoma/diagnostic imaging , Fluorescent Antibody Technique , Image Cytometry
18.
Front Plant Sci ; 14: 1224884, 2023.
Article in English | MEDLINE | ID: mdl-37534292

ABSTRACT

Introduction: The difficulties in tea shoot recognition are that recognition is affected by lighting conditions, that it is challenging to segment images whose backgrounds are similar in color to the shoots, and that leaves occlude and overlap one another. Methods: To solve the problem of low accuracy in detecting dense, small tea-shoot objects, this paper proposes a real-time dense small object detection algorithm based on multimodal optimization. First, RGB, depth, and infrared images are collected to form a multimodal image set, and complete shoot object labeling is performed. Then, the YOLOv5 model is improved and applied to dense and tiny tea shoot detection. Secondly, based on the improved YOLOv5 model, this paper designs two data layer-based multimodal image fusion methods and a feature layer-based multimodal image fusion method; meanwhile, a cross-modal fusion module (FFA) based on the frequency domain and attention mechanisms is designed for the feature layer fusion method to adaptively align and focus on critical regions in intra- and inter-modal channel and frequency domain dimensions. Finally, an objective-based scale matching method is developed to further improve the detection performance of small dense objects in natural environments with the assistance of transfer learning techniques. Results and discussion: The experimental results indicate that the improved YOLOv5 model increases the mAP50 value by 1.7% compared to the benchmark model, with fewer parameters and less computational effort. Compared with the single modality, the multimodal image fusion method increases the mAP50 value in all cases, with the method introducing the FFA module obtaining the highest mAP50 value of 0.827. After the pre-training strategy is used following scale matching, the mAP values can be improved by 1% and 1.4% on the two datasets. The research idea of multimodal optimization in this paper can provide a basis and technical support for dense small object detection.
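
As a generic illustration of data-layer (early) fusion of RGB, depth, and infrared inputs, the PyTorch sketch below simply stacks the modalities channel-wise before the first convolution of a detector; this is an assumed toy design, not the paper's fusion methods or FFA module.

```python
# Hedged sketch: early fusion of RGB + depth + IR as a 5-channel input stem.
import torch
import torch.nn as nn

class EarlyFusionStem(nn.Module):
    """First detector conv widened to 5 input channels (3 RGB + 1 depth + 1 IR)."""
    def __init__(self, out_channels: int = 32):
        super().__init__()
        self.conv = nn.Conv2d(5, out_channels, kernel_size=6, stride=2, padding=2)
        self.act = nn.SiLU(inplace=True)

    def forward(self, rgb, depth, ir):
        x = torch.cat([rgb, depth, ir], dim=1)   # (N, 5, H, W)
        return self.act(self.conv(x))

if __name__ == "__main__":
    rgb = torch.rand(1, 3, 640, 640)
    depth = torch.rand(1, 1, 640, 640)
    ir = torch.rand(1, 1, 640, 640)
    print(EarlyFusionStem()(rgb, depth, ir).shape)   # torch.Size([1, 32, 320, 320])
```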

19.
J Appl Clin Med Phys ; 24(8): e14084, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37430473

ABSTRACT

Retrograde intrarenal surgery (RIRS) is a widely utilized diagnostic and therapeutic tool for multiple upper urinary tract pathologies. An image-guided navigation system can assist the surgeon in performing precise surgery by providing the relative position between the lesion and the instrument after the intraoperative image is registered with the preoperative model. However, due to the structural complexity and diversity of multi-branched organs such as kidneys, bronchi, etc., the consistency of the intensity distribution of virtual and real images is challenged, which makes classical purely intensity-based registration methods prone to bias and random results over a wide search domain. In this paper, we propose a structural feature similarity-based method combined with a semantic style transfer network, which significantly improves registration accuracy when the initial deviation is large. Furthermore, multi-view constraints are introduced to compensate for the collapse of spatial depth information and improve the robustness of the algorithm. Experimental studies were conducted on two models generated from patient data to evaluate the performance of the method and competing algorithms. The proposed method obtains mean target registration errors (mTRE) of 0.971 ± 0.585 mm and 1.266 ± 0.416 mm, respectively, with better accuracy and robustness overall. Experimental results demonstrate that the proposed method has the potential to be applied to RIRS and extended to other organs with similar structures.
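
The reported mTRE is simply the mean Euclidean distance between corresponding target points after registration; a minimal sketch on synthetic point sets follows.

```python
# Hedged sketch: mean target registration error (mTRE) on synthetic landmarks.
import numpy as np

def mean_tre(targets_ref: np.ndarray, targets_reg: np.ndarray) -> float:
    """targets_ref, targets_reg: (N, 3) corresponding 3D points in mm."""
    return float(np.linalg.norm(targets_ref - targets_reg, axis=1).mean())

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    ref = rng.uniform(0, 50, size=(20, 3))          # reference landmarks (mm)
    reg = ref + rng.normal(0, 0.7, size=(20, 3))    # landmarks after registration
    print(f"mTRE = {mean_tre(ref, reg):.3f} mm")
```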


Subject(s)
Algorithms , Imaging, Three-Dimensional , Humans , Imaging, Three-Dimensional/methods , Phantoms, Imaging
20.
Front Robot AI ; 10: 1120357, 2023.
Article in English | MEDLINE | ID: mdl-37008984

ABSTRACT

The concept of Industry 4.0 brings a change in industrial manufacturing patterns toward greater efficiency and flexibility. In response to this tendency, efficient robot teaching approaches that avoid complex programming have become a popular research direction. Therefore, we propose an interactive finger-touch-based robot teaching scheme using multimodal 3D image processing (color (RGB), thermal (T), and point cloud (3D) data). Here, the heat trace left where the finger touches the object surface is analyzed on the multimodal data in order to precisely identify the true hand/object contact points. These identified contact points are used to calculate the robot path directly. To optimize the identification of the contact points, we propose a calculation scheme using a number of anchor points, which are first predicted by hand/object point cloud segmentation. Subsequently, a probability density function is defined to calculate the prior probability distribution of the true finger trace. The temperature in the neighborhood of each anchor point is then dynamically analyzed to calculate the likelihood. Experiments show that the trajectories estimated by our multimodal method have significantly better accuracy and smoothness than those obtained by analyzing only the point cloud and the static temperature distribution.
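
To illustrate the prior-times-likelihood idea for scoring candidate contact (anchor) points, here is a hedged sketch in which a distance-based prior and a temperature-based likelihood are combined per anchor; all models, parameters, and values are illustrative assumptions, not the paper's formulation.

```python
# Hedged sketch: combining a distance prior with a temperature likelihood
# to score candidate contact points (illustrative models and values only).
import numpy as np

def score_anchors(anchor_dists, anchor_temps, ambient=22.0,
                  sigma_d=5.0, sigma_t=2.0, touch_delta=6.0):
    # Prior: anchors closer to the predicted hand/object contact region (mm).
    prior = np.exp(-0.5 * (anchor_dists / sigma_d) ** 2)
    # Likelihood: temperatures near the expected heat-trace value (deg C).
    likelihood = np.exp(-0.5 * ((anchor_temps - (ambient + touch_delta)) / sigma_t) ** 2)
    posterior = prior * likelihood
    return posterior / posterior.sum()     # normalized scores over anchors

if __name__ == "__main__":
    dists = np.array([1.0, 3.0, 8.0, 15.0])     # mm to predicted contact region
    temps = np.array([27.5, 28.2, 24.0, 22.5])  # deg C near each anchor
    print(score_anchors(dists, temps).round(3))
```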
