Results 1 - 20 of 41
1.
Opt Express ; 32(7): 11429-11446, 2024 Mar 25.
Article in English | MEDLINE | ID: mdl-38570991

ABSTRACT

Fourier ptychographic microscopy (FPM) is an enabling quantitative phase imaging technique that offers both high resolution (HR) and a wide field-of-view (FOV), and it can surpass the diffraction limit of the objective lens by employing an LED array to provide angle-varied illumination. Precise illumination angles are critical for exact reconstruction, but conventional algorithmic self-calibration approaches struggle to separate the actual positional parameters because multiple systematic error sources are mixed. In this paper, we report a pupil-function-based strategy for independently calibrating the position of the LED array. We first deduce the relationship between positional deviation and the pupil function in the Fourier domain through a common iterative route. Then, we propose a judgment criterion, based on the arrangement of the LED array in the spatial domain, to determine the misalignment situation. By combining the mapping of complex domains, we can accurately solve the spatial positional parameters of the LED array through a boundary-finding scheme. Simulations and experiments demonstrate that the proposed method can precisely correct the positional misalignment of the LED array. The pupil-function-based approach is expected to provide valuable insights for precise position correction in microscopy.
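
As a rough illustration of why LED position matters in FPM, the sketch below maps each LED position to the transverse wavevector of its illumination at the sample center, and applies a global shift and rotation to the array. It is not the calibration method reported in the abstract; the function names, the square-grid layout, the sign convention, and all parameter values are assumptions made for this example.

```python
import math

def led_wavevector(x_mm, y_mm, height_mm, wavelength_um):
    """Transverse wavevector (kx, ky), in rad/um, of the plane wave that an
    LED at (x_mm, y_mm) sends to the sample center; the array sits
    height_mm away from the sample. Sign convention is an assumption."""
    k0 = 2.0 * math.pi / wavelength_um
    r = math.sqrt(x_mm ** 2 + y_mm ** 2 + height_mm ** 2)
    return k0 * x_mm / r, k0 * y_mm / r

def led_positions(n, pitch_mm, shift_x_mm=0.0, shift_y_mm=0.0, rotation_deg=0.0):
    """Positions of an n x n LED grid centered on the optical axis, after a
    hypothetical global misalignment (in-plane rotation, then shift)."""
    c = math.cos(math.radians(rotation_deg))
    s = math.sin(math.radians(rotation_deg))
    half = (n - 1) / 2.0
    positions = []
    for row in range(n):
        for col in range(n):
            x0 = (col - half) * pitch_mm
            y0 = (row - half) * pitch_mm
            positions.append((c * x0 - s * y0 + shift_x_mm,
                              s * x0 + c * y0 + shift_y_mm))
    return positions

if __name__ == "__main__":
    # Illustrative values only: 3 x 3 array, 4 mm pitch, 80 mm away, green light.
    ideal = led_positions(3, 4.0)
    shifted = led_positions(3, 4.0, shift_x_mm=1.0, rotation_deg=2.0)
    for p_ideal, p_shift in zip(ideal, shifted):
        k_ideal = led_wavevector(*p_ideal, 80.0, 0.53)
        k_shift = led_wavevector(*p_shift, 80.0, 0.53)
        print(p_ideal, k_ideal, k_shift)
```

Comparing the two wavevector columns shows how a small mechanical offset moves every sub-aperture in the Fourier domain, which is the kind of error a position calibration has to undo.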

2.
IEEE Trans Pattern Anal Mach Intell ; 46(5): 2852-2865, 2024 May.
Article in English | MEDLINE | ID: mdl-37991906

ABSTRACT

The perception of drones, also known as Unmanned Aerial Vehicles (UAVs), particularly in infrared videos, is crucial for effective anti-UAV tasks. However, existing datasets for UAV tracking have limitations in terms of target size and attribute distribution characteristics, which do not fully represent complex realistic scenes. To address this issue, we introduce a generalized infrared UAV tracking benchmark called Anti-UAV410. The benchmark comprises a total of 410 videos with over 438 K manually annotated bounding boxes. To tackle the challenges of UAV tracking in complex environments, we propose a novel method called Siamese drone tracker (SiamDT). SiamDT incorporates a dual-semantic feature extraction mechanism that explicitly models targets in dynamic background clutter, enabling effective tracking of small UAVs. The SiamDT method consists of three key steps: Dual-Semantic RPN Proposals (DS-RPN), Versatile R-CNN (VR-CNN), and Background Distractors Suppression. These steps are responsible for generating candidate proposals, refining prediction scores based on dual-semantic features, and enhancing the discriminative capacity of the trackers against dynamic background clutter, respectively. Extensive experiments conducted on the Anti-UAV410 dataset and three other large-scale benchmarks demonstrate the superior performance of the proposed SiamDT method compared to recent state-of-the-art trackers.

3.
IEEE Trans Pattern Anal Mach Intell ; 46(5): 3736-3752, 2024 May.
Article in English | MEDLINE | ID: mdl-38133980

ABSTRACT

Accurate 3D object detection in large-scale outdoor scenes, characterized by considerable variations in object scales, necessitates features rich in both long-range and fine-grained information. While recent detectors have utilized window-based transformers to model long-range dependencies, they tend to overlook fine-grained details. To bridge this gap, we propose MsSVT++, an innovative Mixed-scale Sparse Voxel Transformer that simultaneously captures both types of information through a divide-and-conquer approach. This approach involves explicitly dividing attention heads into multiple groups, each responsible for attending to information within a specific range. The outputs of these groups are subsequently merged to obtain final mixed-scale features. To mitigate the computational complexity associated with applying a window-based transformer in 3D voxel space, we introduce a novel Chessboard Sampling strategy and implement voxel sampling and gathering operations sparsely using a hash map. Moreover, an important challenge stems from the observation that non-empty voxels are primarily located on the surface of objects, which impedes the accurate estimation of bounding boxes. To overcome this challenge, we introduce a Center Voting module that integrates newly voted voxels enriched with mixed-scale contextual information towards the centers of the objects, thereby improving precise object localization. Extensive experiments demonstrate that our single-stage detector, built upon the foundation of MsSVT++, consistently delivers exceptional performance across diverse datasets.

4.
Article in English | MEDLINE | ID: mdl-38153834

ABSTRACT

Transformers have astounding representational power but typically consume considerable computation, which grows quadratically with image resolution. The prevailing Swin transformer reduces computational costs through a local window strategy. However, this strategy inevitably causes two drawbacks: 1) the local window-based self-attention (WSA) hinders global dependency modeling capability and 2) recent studies point out that local windows impair robustness. To overcome these challenges, we pursue a preferable trade-off between computational cost and performance. Accordingly, we propose a novel factorization self-attention (FaSA) mechanism that enjoys both the advantages of local window cost and long-range dependency modeling capability. By factorizing the conventional attention matrix into sparse subattention matrices, FaSA captures long-range dependencies while aggregating mixed-grained information at a computational cost equivalent to the local WSA. Leveraging FaSA, we present the factorization vision transformer (FaViT) with a hierarchical structure. FaViT achieves high performance and robustness, with computational complexity that is linear in the spatial resolution of the input image. Extensive experiments have shown FaViT's advanced performance in classification and downstream tasks. Furthermore, it also exhibits strong robustness to corrupted and biased data and hence demonstrates benefits for practical applications. In comparison to the baseline model Swin-T, our FaViT-B2 significantly improves classification accuracy by 1% and robustness by 7%, while reducing model parameters by 14%. Our code will soon be publicly available at https://github.com/q2479036243/FaViT.

5.
IEEE J Biomed Health Inform ; 27(11): 5542-5553, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37669209

ABSTRACT

In medical image analysis, blood vessel segmentation is of considerable clinical value for diagnosis and surgery. However, the complexity of vascular structures hinders progress in the field. Although many algorithms have emerged to address these difficulties, they rely excessively on careful annotations for tubular vessel extraction. A practical solution is to mine the feature distribution from unlabeled data. This work proposes a novel semi-supervised vessel segmentation framework, named EXP-Net, to cope with limited annotations. Based on the training mechanism of the Mean Teacher model, we introduce an expert network in EXP-Net to enhance knowledge distillation. The expert network comprises knowledge and connectivity enhancement modules, which are respectively in charge of modeling feature relationships from global and detailed perspectives. In particular, the knowledge enhancement module leverages the vision transformer to highlight the long-range dependencies among multi-level token components; the connectivity enhancement module makes the most of topological and geometric properties by skeletonizing the vessel in a non-parametric manner. These key components target weak vessel connectivity and poor pixel contrast. Extensive evaluations show that our EXP-Net achieves state-of-the-art performance on subcutaneous vessel, retinal vessel, and coronary artery segmentation.


Subjects
Algorithms , Retinal Vessels , Humans , Coronary Vessels , Electric Power Supplies , Knowledge , Computer-Assisted Image Processing

6.
IEEE Trans Pattern Anal Mach Intell ; 45(12): 14420-14434, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37665707

ABSTRACT

Label noise and class imbalance are common challenges encountered in real-world datasets. Existing approaches for robust learning often focus on addressing either label noise or class imbalance individually, resulting in suboptimal performance when both biases are present. To bridge this gap, this work introduces a novel meta-learning-based dynamic loss that adapts the objective functions during the training process to effectively learn a classifier from long-tailed noisy data. Specifically, our dynamic loss consists of two components: a label corrector and a margin generator. The label corrector is responsible for correcting noisy labels, while the margin generator generates per-class classification margins by capturing the underlying data distribution and the learning state of the classifier. In addition, we employ a hierarchical sampling strategy that enriches a small amount of unbiased metadata with diverse and challenging samples. This enables the joint optimization of the two components in the dynamic loss through meta-learning, allowing the classifier to effectively adapt to clean and balanced test data. Extensive experiments conducted on multiple real-world and synthetic datasets with various types of data biases, including CIFAR-10/100, Animal-10N, ImageNet-LT, and Webvision, demonstrate that our method achieves state-of-the-art accuracy.

7.
Sci Data ; 10(1): 328, 2023 May 27.
Article in English | MEDLINE | ID: mdl-37244913

ABSTRACT

Polarization multispectral imaging (PMI) has been widely applied because it can characterize the physicochemical properties of objects. However, traditional PMI relies on scanning each domain, which is time-consuming and occupies vast storage resources. Therefore, it is imperative to develop advanced PMI methods to facilitate real-time and cost-effective applications. In addition, PMI development is inseparable from preliminary simulations based on full-Stokes polarization multispectral images (FSPMI). However, because relevant databases are lacking, researchers must always measure FSPMI themselves, which is extremely complex and severely limits PMI development. In this paper, we therefore release a large set of FSPMI with 512 × 512 spatial pixels, measured by an established system for 67 stereoscopic objects. In the system, a quarter-wave plate and a linear polarizer are rotated to modulate polarization information, while bandpass filters are switched to modulate spectral information. The required FSPMI are finally calculated from 5 designed polarization modulations and 18 spectral modulations. The publicly available FSPMI database has the potential to greatly promote PMI development and application.

8.
Article in English | MEDLINE | ID: mdl-37037238

ABSTRACT

Noisy labels, inevitably existing in pseudo-segmentation labels generated from weak object-level annotations, severely hamper model optimization for semantic segmentation. Previous works often rely on massive handcrafted losses and carefully tuned hyperparameters to resist noise, suffering poor generalization capability and high model complexity. Inspired by recent advances in meta-learning, we argue that rather than struggling to tolerate noise hidden behind clean labels passively, a more feasible solution would be to find out the noisy regions actively, so as to simply ignore them during model optimization. With this in mind, this work presents a novel meta-learning-based semantic segmentation method, MetaSeg, that comprises a primary content-aware meta-net (CAM-Net) to serve as a noise indicator for an arbitrary segmentation model counterpart. Specifically, CAM-Net learns to generate pixel-wise weights to suppress noisy regions with incorrect pseudo-labels while highlighting clean ones by exploiting hybrid strengthened features from image content, providing straightforward and reliable guidance for optimizing the segmentation model. Moreover, to break the barrier of time-consuming training when applying meta-learning to common large segmentation models, we further present a new decoupled training strategy that optimizes different model layers in a divide-and-conquer manner. Extensive experiments on object, medical, remote sensing, and human segmentation show that our method achieves superior performance, approaching that of fully supervised settings, which paves a new promising way for omni-supervised semantic segmentation.

9.
IEEE Trans Neural Netw Learn Syst ; 34(9): 6132-6145, 2023 Sep.
Article in English | MEDLINE | ID: mdl-34941528

ABSTRACT

The tracking performance of discriminative correlation filters (DCFs) is often subject to unwanted boundary effects. In recent years, many attempts have been made to address this issue by enlarging the search region. However, introducing excessive background information makes the discriminative filter prone to learning from the surrounding context rather than the target. In this article, we propose a novel context restrained correlation tracking filter (CRCTF) that can effectively suppress background interference by incorporating high-quality adversarially generated negative instances. Concretely, we first construct an adversarial context generation network to simulate the central target area with surrounding background information at the initial frame. Then, we suggest a coarse background estimation network to accelerate the background generation in subsequent frames. By introducing a suppression convolution term, we utilize generative background patches to reformulate the original ridge regression objective through the circulant property of correlation and a cropping operator. Finally, our tracking filter is efficiently solved by the alternating direction method of multipliers (ADMM). CRCTF achieves accuracy on par with several well-established and highly optimized baselines on multiple challenging tracking datasets, verifying the effectiveness of our proposed approach.
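
For orientation, the ridge regression objective that DCF trackers start from has a well-known closed-form solution in the Fourier domain when training samples are circular shifts of one patch. The sketch below shows that baseline in one dimension; it is not CRCTF (no suppression term, cropping operator, or ADMM), and the function names, the naive DFT, the toy signal, and the regularization value are assumptions for this example.

```python
import cmath

def dft(x):
    """Naive O(n^2) discrete Fourier transform (enough for a toy signal)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse of dft()."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]

def train_filter(sample, target, lam=1e-3):
    """Closed-form ridge regression over all circular shifts of `sample`:
    per frequency, conj(H) = Y * conj(X) / (X * conj(X) + lam)."""
    xs, ys = dft(sample), dft(target)
    return [y * x.conjugate() / (x * x.conjugate() + lam)
            for x, y in zip(xs, ys)]

def respond(filter_conj, patch):
    """Response map of the trained filter on a new patch."""
    zs = dft(patch)
    return [v.real for v in idft([z * h for z, h in zip(zs, filter_conj)])]

def peak_index(response):
    """Index of the strongest response, read as the estimated shift."""
    return max(range(len(response)), key=lambda i: response[i])

if __name__ == "__main__":
    template = [0, 1, 3, 1, 0, 0, 0, 0]      # toy 1-D "target appearance"
    target = [1, 0, 0, 0, 0, 0, 0, 0]        # desired response: peak at shift 0
    learned = train_filter(template, target)
    moved = template[-3:] + template[:-3]    # same pattern, shifted by 3 samples
    print(peak_index(respond(learned, template)), peak_index(respond(learned, moved)))
```

Because every circular shift of the patch is treated as a training sample, content near the patch border wraps around, which is one way to see where the boundary effects mentioned in the abstract come from.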

10.
IEEE Trans Med Imaging ; 42(9): 2476-2489, 2023 Sep.
Article in English | MEDLINE | ID: mdl-35862338

ABSTRACT

Automatic subcutaneous vessel imaging with near-infrared (NIR) optical apparatus can improve the accuracy of locating blood vessels, thus significantly contributing to clinical venipuncture research. Though deep learning models have achieved remarkable success in medical image segmentation, they still struggle in the subfield of subcutaneous vessel segmentation due to the scarcity and low quality of annotated data. To alleviate this problem, this work presents a novel semi-supervised learning framework, SCANet, that achieves accurate vessel segmentation through an alternate training strategy. SCANet is composed of a multi-scale recurrent neural network that embeds coarse-to-fine features and two auxiliary branches, a consistency decoder and an adversarial learning branch, responsible for strengthening fine-grained details and eliminating differences between ground truths and predictions, respectively. Equipped with a novel semi-supervised alternate training strategy, the three components work collaboratively, enabling SCANet to accurately segment vessel regions with only a handful of labeled data and abundant unlabeled data. Moreover, to mitigate the shortage of annotated data in this field, we provide a new subcutaneous vessel dataset, VESSEL-NIR. Extensive experiments on a wide variety of tasks, including the segmentation of subcutaneous vessels, retinal vessels, and skin lesions, clearly demonstrate the superiority and generality of our approach.


Subjects
Computer Neural Networks , Retinal Vessels , Retinal Vessels/diagnostic imaging , Supervised Machine Learning , Computer-Assisted Image Processing/methods

11.
Sensors (Basel) ; 22(24), 2022 Dec 13.
Article in English | MEDLINE | ID: mdl-36560157

ABSTRACT

With its continued development, deep learning has made good progress in image analysis and recognition, which has prompted some researchers to explore combining deep learning with hyperspectral medical images, with some success. This paper introduces the principles and techniques of hyperspectral imaging systems, summarizes the common medical hyperspectral imaging systems, and reviews the progress of some emerging spectral imaging systems through an analysis of the literature. In particular, this article introduces the more frequently used medical hyperspectral images and the pre-processing techniques for the spectra, and in other sections it discusses the main developments in medical hyperspectral imaging combined with deep learning for disease diagnosis. On the basis of this review, the limiting factors in applying deep learning to hyperspectral medical images are outlined, promising research directions are summarized, and future research prospects are provided for subsequent scholars.


Subjects
Deep Learning , Diagnostic Imaging

12.
Opt Express ; 30(11): 18415-18433, 2022 May 23.
Article in English | MEDLINE | ID: mdl-36221643

ABSTRACT

The sustainable use of water resources is inseparable from water pollution detection. The sensing of toxic ammonia nitrogen in water currently requires auxiliary reagents, which may cause secondary pollution. Exploiting the ability of substances to change light characteristics, this work proposes polarimetry-inspired feature fusion spectroscopy (PIFFS) to detect ammonia. The PIFFS system mainly includes a light source, a quarter-wave plate (QWP), a linear polarizer (LP) and a fiber spectrometer. The target light containing substance information is polarization modulated by adjusting the QWP and LP angles. Then, the Stokes parameters of the target light can be calculated from appropriate modulations. The feasibility of the PIFFS method for detecting ammonia nitrogen is verified by experiments on both standard water samples and environmental water samples. Experimental results show that the fused features, inspired by the first Stokes parameter, are superior for classifying ammonia concentration. The results also demonstrate the effectiveness of support vector machine-based concentration classification and random forests-based spectral selection. The interaction between light and substances ensures that the proposed PIFFS method has the potential to detect other pollutants.
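
As background on what "calculating the Stokes parameters" involves, the sketch below uses the textbook six-intensity definition. The abstract does not state which QWP and LP angle combinations the PIFFS system uses, so this is not its modulation scheme; the function names, the handedness convention for the fourth parameter, and the sample intensities are assumptions for this example.

```python
import math

def stokes_from_intensities(i_0, i_90, i_45, i_135, i_rcp, i_lcp):
    """Textbook Stokes vector from six analyzer intensities: linear analyzers
    at 0, 90, 45 and 135 degrees, plus right- and left-circular analyzers.
    The sign of the last component depends on the handedness convention."""
    s0 = i_0 + i_90        # total intensity
    s1 = i_0 - i_90        # horizontal vs vertical linear preference
    s2 = i_45 - i_135      # +45 vs -45 degree linear preference
    s3 = i_rcp - i_lcp     # right vs left circular preference
    return s0, s1, s2, s3

def degree_of_polarization(s0, s1, s2, s3):
    """Fraction of the light that is polarized (0 = unpolarized, 1 = fully)."""
    if s0 <= 0:
        return 0.0
    return math.sqrt(s1 ** 2 + s2 ** 2 + s3 ** 2) / s0

if __name__ == "__main__":
    # Hypothetical readings at one wavelength for horizontally polarized light.
    stokes = stokes_from_intensities(1.0, 0.0, 0.5, 0.5, 0.5, 0.5)
    print(stokes, degree_of_polarization(*stokes))
```

With a spectrometer, the same calculation would be repeated independently at every wavelength, giving one Stokes vector per spectral channel.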

13.
Micromachines (Basel) ; 13(10), 2022 Sep 27.
Article in English | MEDLINE | ID: mdl-36295962

ABSTRACT

The calibrator is one of the most important factors in the calibration of various laser 3D scanning instruments. The requirements for the diffuse reflection surface are emphasized in many national standards. In this study, spherical calibrator and plane calibrator comparative measurement experiments were carried out. The black ceramic standard sphere, white ceramic standard sphere, metal standard sphere, metal standard plane, and white ceramic standard plane were used to test the laser 3D scanner. In the spherical calibrator comparative measurement experiments, the results indicate that the RMS of the white ceramic spherical calibrator with a reflectance of approximately 60% is 10 times that of the metal spherical calibrator with the reflectance of approximately 15%, and the RMS of the black ceramic spherical calibrator with reflectance of approximately 11% is of the same order as the metal spherical calibrator. In the plane calibrators comparative measurement experiments, the RMS of the flatness measurement is 0.077 mm for the metal plane calibrator with a reflectance of 15%, and 2.915 mm for ceramic plane calibrator with a reflectance of 60%. The results show that when the optimal measurement distance and incident angle are selected, the reflectance of the calibrator has a great effect on the measurement results, regardless of the outlines or profiles. Based on the experiments, it is recommended to use the spherical calibrator or the standard plane with a reflectance of around 18% as the standard, which can obtain reasonable results. In addition, it is necessary to clearly provide the material category and surface reflectance information of the standard when calibrating the scanner according to the measurement standard.

14.
Biomed Opt Express ; 13(8): 4278-4297, 2022 Aug 01.
Article in English | MEDLINE | ID: mdl-36032578

ABSTRACT

As the core of reconstruction in conventional ptychography (CP) and Fourier ptychographic microscopy (FPM), the design of the ptychographical iterative engine (PIE) largely determines the performance of reconstruction algorithms. Compared to traditional PIE algorithms, the paradigm of combining PIE with machine learning to escape local optima has recently achieved significant progress. Nevertheless, existing engines still suffer from drawbacks such as excessive hyper-parameters, heavy tuning work and lack of compatibility, which greatly limit their practical applications. In this work, we present a complete set of alternative schemes comprising a new perspective, a uniform design template, and a fusion framework, to naturally integrate Fourier ptychography (FP) with machine learning concepts. The new perspective, Dynamic Physics, is taken as the preferred tool to analyze a path (algorithm) at the physical level; the uniform design template, T-FP, clarifies the physical significance and the optimization part of a path; the fusion framework follows two workable guidelines that are specially designed to preserve convergence and allow later localized modification for a new path, and further establishes a link between FP iterations and the gradient update in machine learning. Our scheme is compatible with both traditional FP paths and machine learning concepts. By combining ideas from both fields, we offer two design examples, MaFP and AdamFP. Results from both simulations and experiments show that algorithms designed following our scheme obtain better, faster (converging at an early stage after a few iterations) and more stable recovery with only minimal hyper-parameter tuning, demonstrating the effectiveness and superiority of our scheme.

15.
Sensors (Basel) ; 22(5), 2022 Feb 22.
Article in English | MEDLINE | ID: mdl-35270867

ABSTRACT

To address the shortage of dynamic human ear data, this study developed the Changchun University dynamic human ear (CCU-DE) database, a small-sample human ear database. The database covers various complex situations and posture changes in human ear images, such as translation angle, rotation angle, illumination change, occlusion and interference, bringing research on dynamic human ear recognition closer to complex real-life situations and increasing its applicability. To test the practicability and effectiveness of the CCU-DE small-sample database, we designed a block diagram of a dynamic human ear recognition system based on a deep learning model, which was pre-trained by transfer learning. Targeting multi-posture changes under different contrasts, translation and rotation motions, and with or without occlusion, simulation studies were conducted using the CCU-DE small-sample database and different deep learning models, such as YOLOv3, YOLOv4, YOLOv5, Faster R-CNN, and SSD. The experimental results showed that the CCU-DE database is well suited to dynamic ear recognition and can be used to test different deep learning models with high test accuracy.


Subjects
Deep Learning , Factual Databases , Ear , Humans

16.
Sensors (Basel) ; 22(3), 2022 Feb 06.
Article in English | MEDLINE | ID: mdl-35161974

ABSTRACT

This study experimentally investigated the effects of hydrogen direct injection on combustion and cycle-by-cycle variations in a spark ignition n-butanol engine under lean burn conditions. For this purpose, a spark ignition engine equipped with a hydrogen and n-butanol dual fuel injection system was specially developed. Experiments were conducted at four excess air ratios, with four hydrogen fractions (φ(H2)) and with pure n-butanol. Engine speed and intake manifold absolute pressure (MAP) were kept at 1500 r/min and 43 kPa, respectively. The results indicate that θ0-10 and θ10-90 decreased gradually with the increase in hydrogen fraction. Additionally, the indicated mean effective pressure (IMEP), the peak cylinder pressure (Pmax) and the maximum rate of pressure rise ((dP/dφ)max) increased gradually, while their cycle-by-cycle variations decreased with the increase in hydrogen fraction. In addition, the correlation between (dP/dφ)max and its corresponding crank angle weakened with the increase in the excess air coefficient (λ), whereas it tended to strengthen with the increase in hydrogen fraction. The coefficients of variation of Pmax and IMEP increased with the increase in λ, while they decreased markedly after hydrogen was blended in under lean burn conditions. Furthermore, when λ was 1.0, a 5% hydrogen fraction improved the cycle-by-cycle variations most significantly. Although a larger hydrogen fraction is needed to achieve excellent combustion characteristics under lean burn conditions, hydrogen direct injection can promote the combustion process and is beneficial for enhancing stable combustion and reducing cycle-by-cycle variations.
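
The coefficient of variation used to quantify cycle-by-cycle variation is the standard deviation of a quantity over consecutive cycles divided by its mean. A minimal sketch follows; the function name and the IMEP values are made up for illustration, and the use of the sample (n - 1) standard deviation is an assumption, since the abstract does not say which estimator was used.

```python
import statistics

def coefficient_of_variation(values):
    """Coefficient of variation in percent: 100 * std / mean.
    Uses the sample standard deviation (n - 1 denominator)."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("mean is zero; coefficient of variation is undefined")
    return 100.0 * statistics.stdev(values) / mean

if __name__ == "__main__":
    # Hypothetical per-cycle IMEP readings in bar, not data from the study.
    imep_without_hydrogen = [3.1, 2.6, 3.4, 2.9, 2.4, 3.3]
    imep_with_hydrogen = [3.2, 3.1, 3.3, 3.2, 3.1, 3.3]
    print(coefficient_of_variation(imep_without_hydrogen))
    print(coefficient_of_variation(imep_with_hydrogen))
```

A lower value means more repeatable combustion from one cycle to the next, which is the sense in which the abstract reports that hydrogen addition reduces cycle-by-cycle variations.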

17.
Sensors (Basel) ; 22(3), 2022 Feb 06.
Article in English | MEDLINE | ID: mdl-35161982

ABSTRACT

Fourier ptychographic microscopy (FPM) is a promising imaging technique that achieves a wide field-of-view (FOV), high resolution and quantitative phase information. An LED array irradiates the sample from different angles to obtain the corresponding low-resolution intensity images. However, reconstruction performance still suffers from noise and image data redundancy. In this paper, we present a novel FPM reconstruction method based on a deep multi-feature transfer network, which achieves good anti-noise performance and realizes high-resolution reconstruction with reduced image data. First, image features are deeply extracted through transfer learning with the ResNet50, Xception and DenseNet121 networks; the complementarity of the deep multiple features is exploited, and a cascaded feature fusion strategy is adopted for channel merging to improve the quality of image reconstruction. Then, pre-upsampling is used to reconstruct the network to improve the texture details of the high-resolution reconstructed image. We validate the performance of the reported method via both simulation and experiment. The model is robust to noise and blurred images. Better reconstruction results are obtained under the conditions of short time and low resolution. We hope that this end-to-end mapping method can provide a neural-network perspective on FPM reconstruction.


Subjects
Microscopy , Computer Neural Networks , Computer Simulation , Light , Research Design

18.
Sensors (Basel) ; 22(3), 2022 Feb 07.
Article in English | MEDLINE | ID: mdl-35162002

ABSTRACT

The accurate segmentation of retinal vessels is of great significance for the diagnosis of diseases such as diabetes, hypertension, microaneurysms and arteriosclerosis. To segment more deep and small blood vessels and provide more information to doctors, this paper presents a multi-scale joint optimization strategy for retinal vascular segmentation. First, the Multi-Scale Retinex (MSR) algorithm is used to correct the uneven illumination of fundus images. Then, a multi-scale Gaussian matched filtering method is used to enhance the contrast of the retinal images. Next, Otsu (OTSU) multi-threshold segmentation, optimized by the Particle Swarm Optimization (PSO) algorithm, is used to segment the retinal image extracted by the multi-scale matched filtering method. Finally, the image is post-processed, including binarization, morphological operations and edge-contour removal. Experiments are carried out on the DRIVE and STARE datasets to evaluate the effectiveness and practicability of the proposed method. Compared with other existing methods, the proposed method can segment more small blood vessels while preserving the integrity of the vascular structure, and it achieves higher performance. Its results show more distinct targets, higher contrast, and richer detail and local features. The qualitative and quantitative analysis results show that the presented method is superior to the other advanced methods.
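
To make the thresholding step concrete, the sketch below implements plain single-threshold Otsu by exhaustive search over a gray-level histogram. It is a simplification of what the abstract describes: the paper uses multi-threshold Otsu with a PSO search, whereas this example has one threshold and no PSO. The function names and the toy histogram are assumptions for illustration.

```python
def otsu_threshold(hist):
    """Single Otsu threshold for a gray-level histogram (hist[i] = number of
    pixels with level i). Levels <= threshold form class 0, the rest class 1.
    Returns the first threshold that maximizes between-class variance."""
    total = sum(hist)
    weighted_total = sum(level * count for level, count in enumerate(hist))
    best_threshold, best_score = 0, -1.0
    count0, weighted0 = 0, 0.0
    for level, count in enumerate(hist):
        count0 += count
        weighted0 += level * count
        count1 = total - count0
        if count0 == 0 or count1 == 0:
            continue  # one class is empty, so the variance is undefined
        mean0 = weighted0 / count0
        mean1 = (weighted_total - weighted0) / count1
        # Between-class variance, up to a constant factor of 1 / total**2.
        score = count0 * count1 * (mean0 - mean1) ** 2
        if score > best_score:
            best_score, best_threshold = score, level
    return best_threshold

def binarize(pixels, threshold):
    """Label each pixel 1 (foreground) if above the threshold, else 0."""
    return [1 if p > threshold else 0 for p in pixels]

if __name__ == "__main__":
    # Toy 8-level histogram with a dark cluster and a bright cluster.
    histogram = [0, 4, 10, 3, 0, 2, 9, 5]
    t = otsu_threshold(histogram)
    print(t, binarize([1, 2, 6, 7, 3], t))
```

With several thresholds the same between-class variance criterion is maximized over all threshold combinations, and the search space grows quickly, which is where a heuristic optimizer such as PSO becomes useful.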


Subjects
Algorithms , Retinal Vessels , Fundus Oculi , Computer-Assisted Image Processing , Normal Distribution , Retinal Vessels/diagnostic imaging

19.
IEEE Trans Med Imaging ; 41(6): 1596-1607, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35041595

ABSTRACT

Automatic segmentation of diabetic retinopathy (DR) lesions is of great value in assisting ophthalmologists with diagnosis. Although much research has been conducted on this task, most prior works paid too much attention to network design instead of considering the pathological associations among lesions. By investigating the pathogenic causes of DR lesions in advance, we found that certain lesions are close to specific vessels and present relative patterns to each other. Motivated by this observation, we propose a relation transformer block (RTB) to incorporate attention mechanisms at two main levels: a self-attention transformer exploits global dependencies among lesion features, while a cross-attention transformer allows interactions between lesion and vessel features by integrating valuable vascular information to alleviate ambiguity in lesion detection caused by complex fundus structures. In addition, to capture the small lesion patterns first, we propose a global transformer block (GTB) which preserves detailed information in the deep network. By integrating the above blocks in dual branches, our network segments the four kinds of lesions simultaneously. Comprehensive experiments on the IDRiD and DDR datasets clearly demonstrate the superiority of our approach, which achieves competitive performance compared to the state of the art.


Subjects
Diabetes Mellitus , Diabetic Retinopathy , Diabetic Retinopathy/diagnostic imaging , Fundus Oculi , Humans

20.
IEEE Trans Cybern ; 52(8): 7527-7540, 2022 Aug.
Article in English | MEDLINE | ID: mdl-33417585

ABSTRACT

Visual object tracking with semantic deep features has recently attracted much attention in computer vision. In particular, Siamese trackers, which aim to learn a decision-making-based similarity evaluation, are widely used in the tracking community. However, online updating of Siamese trackers remains a tricky issue because of the tradeoff between model adaptation and degradation. To address this issue, in this article, we propose a novel attentional transfer learning-based Siamese network (SiamATL), which fully exploits previous knowledge to guide the current tracker learning in the decision-making module. First, we explicitly model the template and surroundings by using an attentional online update strategy to avoid template pollution. Then, we introduce an instance-transfer discriminative correlation filter (ITDCF) to enhance the distinguishing ability of the tracker. Finally, we suggest a mutual compensation mechanism that integrates cross-correlation matching and ITDCF detection into the decision-making subnetwork to achieve online tracking. Comprehensive experiments demonstrate that our approach outperforms state-of-the-art tracking algorithms on multiple large-scale tracking datasets.


Subjects
Algorithms , Computer-Assisted Image Processing , Attention , Learning , Machine Learning
...