1.
Comput Biol Med ; 174: 108443, 2024 May.
Article in English | MEDLINE | ID: mdl-38608328

ABSTRACT

Retinal vessel segmentation based on deep learning is an important auxiliary method for assisting clinicians in diagnosing retinal diseases. However, existing methods often produce mis-segmentation when dealing with low-contrast images and thin blood vessels, which affects the continuity and integrity of the vessel skeleton. In addition, existing deep learning methods tend to lose a lot of detailed information during training, which affects the accuracy of segmentation. To address these issues, we propose a novel dual-decoder based Cross-patch Feature Interactive Net with Edge Refinement (CFI-Net) for end-to-end retinal vessel segmentation. In the encoder part, a joint refinement down-sampling method (JRDM) is proposed to compress feature information in the process of reducing image size, so as to reduce the loss of thin vessels and vessel edge information during the encoding process. In the decoder part, we adopt a dual-path model based on edge detection, and propose a Cross-patch Interactive Attention Mechanism (CIAM) in the main path to enhance multi-scale spatial channel features and transfer cross-spatial information. Consequently, it improves the network's ability to segment complete and continuous vessel skeletons, reducing vessel segmentation fractures. Finally, the Adaptive Spatial Context Guide Method (ASCGM) is proposed to fuse the prediction results of the two decoder paths, which enhances segmentation details while removing part of the background noise. We evaluated our model on two retinal image datasets and one coronary angiography dataset, achieving outstanding performance in comprehensive segmentation assessment metrics such as AUC and CAL. The experimental results showed that the proposed CFI-Net has superior segmentation performance compared with other existing methods, especially for thin vessels and vessel edges. The code is available at https://github.com/kita0420/CFI-Net.
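
The abstract reports AUC as one of its assessment metrics. As a minimal sketch of how that number can be computed for a binary vessel/background prediction, the function below uses the rank interpretation of AUC: the probability that a randomly chosen vessel pixel scores higher than a randomly chosen background pixel. The function name and the toy data are illustrative assumptions, not taken from the CFI-Net code.

```python
def roc_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative), ties counted as 0.5.

    labels: iterable of 0/1 ground-truth values (1 = vessel pixel).
    scores: iterable of predicted vessel probabilities, same length.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: six pixels, three vessel and three background.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.2, 0.6, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
```

The pairwise loop is quadratic and only meant to make the definition visible; for full-resolution images a sort-based implementation would be the usual choice.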


Subjects
Deep Learning , Retinal Vessels , Retinal Vessels/diagnostic imaging , Humans , Image Processing, Computer-Assisted/methods , Algorithms
2.
Sensors (Basel) ; 23(18)2023 Sep 14.
Article in English | MEDLINE | ID: mdl-37765937

ABSTRACT

Video super-resolution aims to generate high-resolution frames from low-resolution counterparts. It can be regarded as a specialized application of image super-resolution, serving various purposes, such as video display and surveillance. This paper proposes a novel method for real-time video super-resolution. It effectively exploits spatial information by utilizing the capabilities of an image super-resolution model and leverages the temporal information inherent in videos. Specifically, the method incorporates a pre-trained image super-resolution network as its foundational framework, allowing it to leverage existing expertise for super-resolution. A fast temporal information aggregation module is presented to further aggregate temporal cues across frames. By using deformable convolution to align features of neighboring frames, this module takes advantage of inter-frame dependency. In addition, it employs a hierarchical fast spatial offset feature extraction and a channel attention-based temporal fusion. A redundancy-aware inference algorithm is developed to reduce computational redundancy by reusing intermediate features, achieving real-time inference speed. Extensive experiments on several benchmarks demonstrate that the proposed method can reconstruct satisfactory results with strong quantitative performance and visual quality. The real-time inference capability makes it suitable for real-world deployment.
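
The idea of reusing intermediate features across overlapping frame windows can be illustrated with a small cache. This is only a sketch of the general principle under stated assumptions: `FeatureCache`, `windowed_features`, and the `extract` callback are hypothetical names, and the paper's actual redundancy-aware algorithm is not described in enough detail here to reproduce.

```python
from collections import OrderedDict


class FeatureCache:
    """Keeps per-frame features so that each frame is processed only once."""

    def __init__(self, extract, capacity):
        self.extract = extract      # hypothetical per-frame feature extractor
        self.capacity = capacity    # maximum number of frames kept in memory
        self.store = OrderedDict()  # frame index -> features, oldest first
        self.calls = 0              # how many times extract() actually ran

    def get(self, index, frame):
        if index not in self.store:
            self.store[index] = self.extract(frame)
            self.calls += 1
            if len(self.store) > self.capacity:
                self.store.popitem(last=False)  # evict the oldest frame
        return self.store[index]


def windowed_features(frames, radius, extract):
    """For each frame t, gather features of frames t-radius .. t+radius."""
    cache = FeatureCache(extract, capacity=2 * radius + 1)
    windows = []
    for t in range(len(frames)):
        lo = max(0, t - radius)
        hi = min(len(frames) - 1, t + radius)
        windows.append([cache.get(i, frames[i]) for i in range(lo, hi + 1)])
    return windows, cache.calls
```

With five frames and a radius of one, a naive implementation would run the extractor 13 times (2 + 3 + 3 + 3 + 2), whereas the cached version runs it once per frame.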

3.
Sensors (Basel) ; 23(11)2023 May 24.
Article in English | MEDLINE | ID: mdl-37299757

ABSTRACT

The quality of videos varies due to the different capabilities of sensors. Video super-resolution (VSR) is a technology that improves the quality of captured video. However, the development of a VSR model is very costly. In this paper, we present a novel approach for adapting single-image super-resolution (SISR) models to the VSR task. To achieve this, we first summarize a common architecture of SISR models and perform a formal analysis of adaptation. Then, we propose an adaptation method that incorporates a plug-and-play temporal feature extraction module into existing SISR models. The proposed temporal feature extraction module consists of three submodules: offset estimation, spatial aggregation, and temporal aggregation. In the spatial aggregation submodule, the features obtained from the SISR model are aligned to the center frame based on the offset estimation results. The aligned features are fused in the temporal aggregation submodule. Finally, the fused temporal feature is fed to the SISR model for reconstruction. To evaluate the effectiveness of our method, we adapt five representative SISR models and evaluate these models on two popular benchmarks. The experiment results show the proposed method is effective on different SISR models. In particular, on the Vid4 benchmark, the VSR-adapted models achieve at least 1.26 dB and 0.067 improvement over the original SISR models in terms of PSNR and SSIM metrics, respectively. Additionally, these VSR-adapted models achieve better performance than the state-of-the-art VSR models.
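
The improvement above is stated in terms of PSNR, measured in dB. As a brief sketch, PSNR follows from the mean squared error between a reference frame and a reconstruction; the helper names and the 8-bit peak value of 255 are assumptions for illustration.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks for a given PSNR gain in dB."""
    return 10.0 ** (gain_db / 10.0)
```

Because PSNR is logarithmic, a gain of 1.26 dB corresponds to the mean squared error shrinking by a factor of about 1.34, that is, roughly a quarter less squared error.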


Subjects
Acclimatization , Benchmarking , Technology
4.
Med Image Anal ; 88: 102867, 2023 08.
Article in English | MEDLINE | ID: mdl-37348167

ABSTRACT

High-throughput nuclear segmentation and classification of whole slide images (WSIs) is crucial to biological analysis, clinical diagnosis and precision medicine. With the advances of CNN algorithms and the continuously growing datasets, considerable progress has been made in nuclear segmentation and classification. However, few works consider how to reasonably deal with nuclear heterogeneity in the following two aspects: imbalanced data distribution and diversified morphology characteristics. The minority classes might be dominated by the majority classes due to the imbalanced data distribution, and the diversified morphology characteristics may lead to fragile segmentation results. In this study, a cost-Sensitive MultI-task LEarning (SMILE) framework is proposed to tackle the data heterogeneity problem. Based on the most popular multi-task learning backbone in nuclei segmentation and classification, we propose a multi-task correlation attention (MTCA) to perform feature interaction among multiple highly relevant tasks and learn a better feature representation. A cost-sensitive learning strategy is proposed to address the imbalanced data distribution by increasing the penalty for misclassifying the minority classes. Furthermore, we propose a novel post-processing step based on the coarse-to-fine marker-controlled watershed scheme to alleviate fragile segmentation when nuclei are large and have unclear contours. Extensive experiments show that the proposed method achieves state-of-the-art performance on the CoNSeP and MoNuSAC 2020 datasets. The code is available at: https://github.com/panxipeng/nuclear_segandcls.
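
One common way to increase the penalty for minority-class errors is to weight the cross-entropy loss by inverse class frequency. The sketch below assumes that weighting scheme for illustration; the abstract does not state which cost function SMILE actually uses, and both function names are hypothetical.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n / (k * count), so rare classes weigh more."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class); probs[i][c] is p(class c | sample i)."""
    total = 0.0
    norm = 0.0
    for p, y in zip(probs, labels):
        w = weights[y]
        total += -w * math.log(max(p[y], 1e-12))  # clamp to avoid log(0)
        norm += w
    return total / norm
```

With labels `[0, 0, 0, 1]`, class 1 receives weight 2.0 and class 0 receives about 0.67, so a mistake on the single minority sample costs three times as much as a mistake on a majority sample.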


Subjects
Algorithms , Learning , Humans , Cell Nucleus , Image Processing, Computer-Assisted , Precision Medicine
5.
Transl Vis Sci Technol ; 12(4): 8, 2023 04 03.
Article in English | MEDLINE | ID: mdl-37026984

ABSTRACT

Purpose: Accurate identification of corneal layers with in vivo confocal microscopy (IVCM) is essential for the correct assessment of corneal lesions. This project aims to obtain a reliable automated identification of corneal layers from IVCM images. Methods: A total of 7957 IVCM images were included for model training and testing. Scanning depth information and pixel information of IVCM images were used to build the classification system. First, two base classifiers based on convolutional neural networks and K-nearest neighbors were constructed. Second, two hybrid strategies, namely a weighted voting method and the light gradient boosting machine (LightGBM) algorithm, were used to fuse the results from the two base classifiers and obtain the final classification. Finally, the confidence of prediction results was stratified to help identify model errors. Results: Both hybrid systems outperformed the two base classifiers. The weighted area under the curve, weighted precision, weighted recall, and weighted F1 score were 0.9841, 0.9096, 0.9145, and 0.9111 for the weighted voting hybrid system, and 0.9794, 0.9039, 0.9055, and 0.9034 for the light gradient boosting machine stacking hybrid system, respectively. More than one-half of the misclassified samples were found using the confidence stratification method. Conclusions: The proposed hybrid approach could effectively integrate the scanning depth and pixel information of IVCM images, allowing for the accurate identification of corneal layers for grossly normal IVCM images. The confidence stratification approach was useful for identifying misclassifications by the system. Translational Relevance: The proposed hybrid approach lays important groundwork for the automatic identification of the corneal layer for IVCM images.
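
A weighted voting fusion of two base classifiers, followed by a simple confidence stratification, can be sketched as follows. The weight, the threshold, and the function names are illustrative assumptions; the abstract does not give the values used in the study.

```python
def weighted_vote(prob_a, prob_b, weight_a=0.5):
    """Fuse two per-class probability vectors with a convex combination."""
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    if len(prob_a) != len(prob_b):
        raise ValueError("both classifiers must score the same classes")
    fused = [weight_a * a + (1.0 - weight_a) * b for a, b in zip(prob_a, prob_b)]
    best = max(range(len(fused)), key=fused.__getitem__)
    return best, fused


def confidence_level(fused, threshold=0.8):
    """Flag low-confidence predictions for manual review (threshold is assumed)."""
    return "high" if max(fused) >= threshold else "low"
```

For example, if an image-based classifier outputs `[0.6, 0.3, 0.1]` and a depth-based classifier outputs `[0.2, 0.7, 0.1]`, equal weights select class 1, and the fused maximum of about 0.5 would be flagged as low confidence under the assumed threshold.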


Subjects
Cornea , Vision Disorders , Humans , Cornea/diagnostic imaging , Vision Disorders/pathology , Algorithms , Microscopy, Confocal/methods , Neural Networks, Computer
6.
Sensors (Basel) ; 23(4)2023 Feb 04.
Article in English | MEDLINE | ID: mdl-36850367

ABSTRACT

The performance of Chinese named entity recognition (NER) has improved via word enhancement or new frameworks that incorporate various types of external data. However, for Chinese NER, syntactic composition (at the sentence level) and inner regularity (at the character level) have rarely been studied. Chinese characters are highly sensitive to sentential syntactic data. The same Chinese character sequence can be decomposed into different combinations of words according to how they are used and placed in the context. In addition, entities of the same type usually follow the same naming rules due to the specificity of the Chinese language structure. This paper presents Kcr-FLAT to improve the performance of Chinese NER with enhanced semantic information. Specifically, we first extract different types of syntactic data, functionalize the syntactic information by a key-value memory network (KVMN), and fuse them by an attention mechanism. Then the syntactic information and lexical information are integrated by a cross-transformer. Finally, we use an inner regularity perception module to capture the internal regularity of each entity for better entity type prediction. The experimental results show that, with the F1 score as the evaluation index, the proposed model obtains F1 scores of 96.51%, 96.81%, and 70.12% on the MSRA, Resume, and Weibo datasets, respectively.
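
For NER, the F1 score is usually computed at the entity level: a predicted entity counts as correct only if its span and its type both match the gold annotation. The sketch below assumes that convention and a `(start, end, type)` tuple representation, neither of which is specified in the abstract.

```python
def span_f1(gold, predicted):
    """Entity-level precision, recall and F1 over (start, end, type) tuples."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy sentence: one entity is fully correct, one has the wrong type,
# and one predicted entity does not exist in the gold annotation.
gold_entities = [(0, 2, "PER"), (5, 7, "LOC")]
predicted_entities = [(0, 2, "PER"), (5, 7, "ORG"), (9, 10, "LOC")]
```

On the toy data, precision is 1/3 and recall is 1/2, which gives an F1 of 0.4; a correct span with the wrong type earns no credit, which is why entity type prediction matters for this metric.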

7.
Gene Expr Patterns ; 47: 119304, 2023 03.
Article in English | MEDLINE | ID: mdl-36754104

ABSTRACT

Most of the existing works on fine-grained image categorization and retrieval focus on finding similar images from the same species and often give little importance to inter-species similarities. However, these similarities may carry species correlations such as the same ancestors or similar habits, which are helpful in taxonomy and understanding biological traits. In this paper, we devise a new fine-grained retrieval task that searches for similar instances from different species based on body parts. To this end, we propose a two-step strategy. In the first step, we search for visually similar parts to a query image using a deep convolutional neural network (CNN). To improve the quality of the retrieved candidates, structural cues are introduced into the CNN using a novel part-pooling layer, in which the receptive field of each part is adjusted automatically. In the second step, we re-rank the retrieved candidates to improve the species diversity. We achieve this by formulating a novel ranking function that balances between the similarity of the candidates to the queried parts, while decreasing the similarity to the query species. We provide experiments on the benchmark CUB200 dataset and Columbia Dogs dataset, and demonstrate clear benefits of our schemes.
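
The second step, re-ranking candidates so that visually similar parts from other species rise to the top, can be sketched with a simple linear score. The scoring formula, the `penalty` parameter, and the tuple layout are assumptions made for illustration; the abstract does not give the paper's actual ranking function.

```python
def rerank(candidates, penalty=0.5):
    """Order candidates by part similarity minus a penalty for species similarity.

    candidates: list of (name, part_similarity, species_similarity) tuples,
    with both similarities assumed to lie in [0, 1].
    """
    scored = [
        (part_sim - penalty * species_sim, name)
        for name, part_sim, species_sim in candidates
    ]
    # Highest score first; sorted() is stable, so ties keep their input order.
    return [name for score, name in sorted(scored, key=lambda item: -item[0])]


toy_candidates = [
    ("same_species_bird", 0.9, 1.0),   # best part match, but same species
    ("related_species_bird", 0.8, 0.1),
    ("distant_species_bird", 0.5, 0.0),
]
```

With the assumed penalty of 0.5, the same-species candidate drops from first to last place even though its part similarity is the highest, which is the kind of diversity trade-off the abstract describes.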


Subjects
Image Processing, Computer-Assisted , Neural Networks, Computer , Animals , Dogs , Image Processing, Computer-Assisted/methods , Phenotype
8.
Int Ophthalmol ; 43(7): 2203-2214, 2023 Jul.
Article in English | MEDLINE | ID: mdl-36595127

ABSTRACT

PURPOSE: Fungal keratitis is a common cause of blindness worldwide. Timely identification of the causative fungal genera is essential for clinical management. In vivo confocal microscopy (IVCM) provides useful information on pathogenic genera. This study attempted to apply deep learning (DL) to establish an automated method to identify pathogenic fungal genera using IVCM images. METHODS: Deep learning networks were trained, validated, and tested using a data set of 3364 IVCM images collected from 100 eyes of 100 patients with culture-proven filamentous fungal keratitis. Two transfer learning approaches were investigated: one was a combined framework that extracted features with a DL network and adopted a decision tree (DT) as the classifier; the other was a complete supervised DL model that used DL-based fully connected layers to implement the classification. RESULTS: The DL classifier model showed better performance than the DT classifier model in an independent testing set. The DL classifier model showed an area under the receiver operating characteristic curve (AUC) of 0.887 with an accuracy of 0.817, sensitivity of 0.791, specificity of 0.831, G-mean of 0.811, and F1 score of 0.749 in identifying Fusarium, and achieved an AUC of 0.827 with an accuracy of 0.757, sensitivity of 0.756, specificity of 0.759, G-mean of 0.757, and F1 score of 0.716 in identifying Aspergillus. CONCLUSION: The DL model can classify Fusarium and Aspergillus by learning effective features in IVCM images automatically. The automated IVCM image analysis suggests a noninvasive identification of Fusarium and Aspergillus with clear potential application in early diagnosis and management of fungal keratitis.
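
The G-mean reported above is the geometric mean of sensitivity and specificity, which penalizes a classifier that does well on one class at the expense of the other. A short sketch, using the values from the abstract as inputs:

```python
import math


def g_mean(sensitivity, specificity):
    """Geometric mean of sensitivity and specificity."""
    if not (0.0 <= sensitivity <= 1.0 and 0.0 <= specificity <= 1.0):
        raise ValueError("both rates must lie in [0, 1]")
    return math.sqrt(sensitivity * specificity)


fusarium = round(g_mean(0.791, 0.831), 3)     # 0.811, as reported
aspergillus = round(g_mean(0.756, 0.759), 3)  # 0.757, as reported
```

Both results reproduce the G-mean values stated in the abstract, which is a quick consistency check on the reported sensitivity and specificity.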


Subjects
Corneal Ulcer , Eye Infections, Fungal , Keratitis , Humans , Artificial Intelligence , Corneal Ulcer/diagnosis , Keratitis/diagnosis , Keratitis/microbiology , Fungi , Eye Infections, Fungal/diagnosis , Eye Infections, Fungal/microbiology , Microscopy, Confocal/methods
9.
Sensors (Basel) ; 22(21)2022 Oct 27.
Article in English | MEDLINE | ID: mdl-36365942

ABSTRACT

Low-illumination images exhibit low brightness, blurry details, and color casts, which create an unnatural visual experience and degrade other visual applications. Data-driven approaches show tremendous potential for raising image brightness while preserving visual naturalness. However, these methods introduce hand-crafted holes and noise amplification, or over/under enhancement and color deviation. To mitigate these issues, this paper presents a frequency division and multiscale learning network named FDMLNet, including two subnets, DetNet and StruNet. This design first applies the guided filter to separate the high and low frequencies of authentic images; DetNet and StruNet are then developed to process them, respectively, to fully explore their information at different frequencies. In StruNet, a feasible feature extraction module (FFEM), composed of a multiscale learning block (MSL) and a dual-branch channel attention mechanism (DCAM), is injected to promote its multiscale representation ability. In addition, three FFEMs are connected through a new dense connectivity scheme to utilize multilevel features. Extensive quantitative and qualitative experiments on public benchmarks demonstrate that our FDMLNet outperforms state-of-the-art approaches, benefiting from its stronger multiscale feature expression and extraction ability.


Subjects
Algorithms , Image Enhancement , Image Enhancement/methods
10.
Med Image Anal ; 80: 102481, 2022 08.
Article in English | MEDLINE | ID: mdl-35653901

ABSTRACT

Cells and nuclei carry a large amount of information about the microenvironment. An automatic nuclei segmentation approach can reduce pathologists' workload and allow precise analysis of the microenvironment for biological and clinical research. Existing deep learning models have achieved outstanding performance under the supervision of a large amount of labeled data. However, when data from an unseen domain arrive, we still have to prepare a certain amount of manual annotation for training on each domain. Unfortunately, obtaining histopathological annotations is extremely difficult. It is highly expertise-dependent and time-consuming. In this paper, we attempt to build a generalized nuclei segmentation model with less data dependency and more generalizability. To this end, we propose a meta multi-task learning (Meta-MTL) model for nuclei segmentation which requires fewer training samples. Model-agnostic meta-learning is applied as the outer optimization algorithm for the segmentation model. We introduce a contour-aware multi-task learning model as the inner model. A feature fusion and interaction block (FFIB) is proposed to allow feature communication across both tasks. Extensive experiments show that our proposed Meta-MTL model can improve model generalization and obtain performance comparable with state-of-the-art models using fewer training samples. Our model can also perform fast adaptation on an unseen domain with only a few manual annotations. Code is available at https://github.com/ChuHan89/Meta-MTL4NucleiSegmentation.


Subjects
Image Processing, Computer-Assisted , Neural Networks, Computer , Algorithms , Humans
11.
Front Med (Lausanne) ; 8: 797616, 2021.
Article in English | MEDLINE | ID: mdl-34970572

ABSTRACT

Background: Artificial intelligence (AI) has great potential to detect fungal keratitis using in vivo confocal microscopy images, but its clinical value remains unclarified. A major limitation of its clinical utility is the lack of explainability and interpretability. Methods: An explainable AI (XAI) system based on Gradient-weighted Class Activation Mapping (Grad-CAM) and Guided Grad-CAM was established. In this randomized controlled trial, nine ophthalmologists (three expert ophthalmologists, three competent ophthalmologists, and three novice ophthalmologists) read images in each of the conditions: unassisted, AI-assisted, or XAI-assisted. In unassisted condition, only the original IVCM images were shown to the readers. AI assistance comprised a histogram of model prediction probability. For XAI assistance, explanatory maps were additionally shown. The accuracy, sensitivity, and specificity were calculated against an adjudicated reference standard. Moreover, the time spent was measured. Results: Both forms of algorithmic assistance increased the accuracy and sensitivity of competent and novice ophthalmologists significantly without reducing specificity. The improvement was more pronounced in XAI-assisted condition than that in AI-assisted condition. Time spent with XAI assistance was not significantly different from that without assistance. Conclusion: AI has shown great promise in improving the accuracy of ophthalmologists. The inexperienced readers are more likely to benefit from the XAI system. With better interpretability and explainability, XAI-assistance can boost ophthalmologist performance beyond what is achievable by the reader alone or with black-box AI assistance.
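
Accuracy, sensitivity, and specificity against a reference standard all derive from the four cells of a confusion matrix. The sketch below shows the calculation for one reader's binary calls; the function name and the toy readings are illustrative assumptions, not data from the trial.

```python
def reader_metrics(predictions, reference):
    """Accuracy, sensitivity and specificity of binary calls against a reference.

    predictions, reference: equal-length sequences of 0/1 (1 = disease present).
    """
    if len(predictions) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(predictions, reference))
    tp = sum(1 for p, r in pairs if p == 1 and r == 1)
    tn = sum(1 for p, r in pairs if p == 0 and r == 0)
    fp = sum(1 for p, r in pairs if p == 1 and r == 0)
    fn = sum(1 for p, r in pairs if p == 0 and r == 1)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return accuracy, sensitivity, specificity
```

Running the same function on a reader's unassisted, AI-assisted, and XAI-assisted calls would give the three sets of figures that such a comparison relies on.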

12.
Comput Intell Neurosci ; 2021: 5845094, 2021.
Article in English | MEDLINE | ID: mdl-34512743

ABSTRACT

In recent years, hashing learning has received increasing attention in supervised video retrieval. However, most existing supervised video hashing approaches design hash functions based on pairwise similarity or triple relationships and focus on local information, which results in low retrieval accuracy. In this work, we propose a novel supervised framework called discriminative codebook hashing (DCH) for large-scale video retrieval. The proposed DCH encourages samples within the same category to converge to the same code word and maximizes the mutual distances among different categories. Specifically, we first propose the discriminative codebook via a predefined distance among intercode words and Bernoulli distributions to handle each hash bit. Then, we use the composite Kullback-Leibler (KL) divergence to align the neighborhood structures between the high-dimensional space and the Hamming space. The proposed DCH is optimized via the gradient descent algorithm. Experimental results on three widely used video datasets verify that our proposed DCH performs better than several state-of-the-art methods.
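
Once videos are mapped to binary hash codes, retrieval reduces to ranking database items by Hamming distance to the query code. The sketch below shows only that retrieval step, with codes stored as Python integers; it does not implement the DCH training objective, and the function names are assumptions.

```python
def hamming(code_a, code_b):
    """Number of differing bits between two hash codes stored as integers."""
    return bin(code_a ^ code_b).count("1")


def rank_by_hamming(query_code, database_codes):
    """Return database indices sorted by increasing Hamming distance to the query."""
    return sorted(
        range(len(database_codes)),
        key=lambda i: (hamming(query_code, database_codes[i]), i),
    )


toy_database = [0b1010, 0b0101, 0b1000]
toy_ranking = rank_by_hamming(0b1010, toy_database)
```

If samples of one category converge to the same code word, as the abstract describes, all items of the query's category end up at distance zero and are returned first.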


Subjects
Algorithms
13.
PLoS One ; 16(6): e0252653, 2021.
Article in English | MEDLINE | ID: mdl-34081736

ABSTRACT

PURPOSE: Infiltration of activated dendritic cells and inflammatory cells in cornea represents an important marker for defining corneal inflammation. Deep transfer learning has presented a promising potential and is gaining more importance in computer assisted diagnosis. This study aimed to develop deep transfer learning models for automatic detection of activated dendritic cells and inflammatory cells using in vivo confocal microscopy images. METHODS: A total of 3453 images was used to train the models. External validation was performed on an independent test set of 558 images. A ground-truth label was assigned to each image by a panel of cornea specialists. We constructed a deep transfer learning network that consisted of a pre-trained network and an adaptation layer. In this work, five pre-trained networks were considered, namely VGG-16, ResNet-101, Inception V3, Xception, and Inception-ResNet V2. The performance of each transfer network was evaluated by calculating the area under the curve (AUC) of receiver operating characteristic, accuracy, sensitivity, specificity, and G mean. RESULTS: The best performance was achieved by Inception-ResNet V2 transfer model. In the validation set, the best transfer system achieved an AUC of 0.9646 (P<0.001) in identifying activated dendritic cells (accuracy, 0.9319; sensitivity, 0.8171; specificity, 0.9517; and G mean, 0.8872), and 0.9901 (P<0.001) in identifying inflammatory cells (accuracy, 0.9767; sensitivity, 0.9174; specificity, 0.9931; and G mean, 0.9545). CONCLUSIONS: The deep transfer learning models provide a completely automated analysis of corneal inflammatory cellular components with high accuracy. The implementation of such models would greatly benefit the management of corneal diseases and reduce workloads for ophthalmologists.


Subjects
Cornea/diagnostic imaging , Deep Learning , Microscopy, Confocal/methods , Area Under Curve , Dendritic Cells/cytology , Dendritic Cells/immunology , Diagnosis, Computer-Assisted , Dry Eye Syndromes/diagnosis , Dry Eye Syndromes/diagnostic imaging , Humans , Keratitis/diagnosis , Keratitis/diagnostic imaging , Models, Theoretical , Ophthalmologists/psychology , Pterygium/diagnosis , Pterygium/diagnostic imaging , ROC Curve , Sensitivity and Specificity
14.
IEEE Trans Cybern ; 51(1): 115-125, 2021 Jan.
Article in English | MEDLINE | ID: mdl-32092023

ABSTRACT

Deep convolutional neural networks (CNNs) have contributed to the significant progress of the single-image super-resolution (SISR) field. However, the majority of existing CNN-based models maintain high performance with massive parameters and exceedingly deep structures. Moreover, several algorithms essentially have underused the low-level features, thus causing relatively low performance. In this article, we address these problems by exploring two strategies based on novel local wider residual blocks (LWRBs) to effectively extract the image features for SISR. We propose a cascading residual network (CRN) that contains several locally sharing groups (LSGs), in which the cascading mechanism not only promotes the propagation of features and the gradient but also eases the model training. Besides, we present another enhanced residual network (ERN) for image resolution enhancement. ERN employs a dual global pathway structure that incorporates nonlocal operations to capture long-distance spatial features from the original low-resolution (LR) input. To obtain the feature representation of the input at different scales, we further introduce a multiscale block (MSB) to directly detect low-level features from the LR image. The experimental results on four benchmark datasets have demonstrated that our models outperform most of the advanced methods while still retaining a reasonable number of parameters.

15.
IEEE Trans Cybern ; 51(12): 6284-6293, 2021 Dec.
Article in English | MEDLINE | ID: mdl-32149665

ABSTRACT

In this article, a simple yet effective method, called a two-phase learning-based swarm optimizer (TPLSO), is proposed for large-scale optimization. Inspired by the cooperative learning behavior in human society, mass learning and elite learning are involved in TPLSO. In the mass learning phase, TPLSO randomly selects three particles to form a study group and then adopts a competitive mechanism to update the members of the study group. Then, we sort all of the particles in the swarm and pick out the elite particles that have better fitness values. In the elite learning phase, the elite particles learn from each other to further search for more promising areas. The theoretical analysis of TPLSO exploration and exploitation abilities is performed and compared with several popular particle swarm optimizers. Comparative experiments on two widely used large-scale benchmark datasets demonstrate that the proposed TPLSO achieves better performance on diverse large-scale problems than several state-of-the-art algorithms.


Subjects
Algorithms , Humans
16.
IEEE Trans Cybern ; 51(3): 1443-1453, 2021 Mar.
Article in English | MEDLINE | ID: mdl-32149667

ABSTRACT

Recently, deep convolutional neural networks (CNNs) have been successfully applied to the single-image super-resolution (SISR) task with great improvement in terms of both peak signal-to-noise ratio (PSNR) and structural similarity (SSIM). However, most of the existing CNN-based SR models require high computing power, which considerably limits their real-world applications. In addition, most CNN-based methods rarely explore the intermediate features that are helpful for final image recovery. To address these issues, in this article, we propose a dense lightweight network, called MADNet, for stronger multiscale feature expression and feature correlation learning. Specifically, a residual multiscale module with an attention mechanism (RMAM) is developed to enhance the informative multiscale feature representation ability. Furthermore, we present a dual residual-path block (DRPB) that utilizes the hierarchical features from original low-resolution images. To take advantage of the multilevel features, dense connections are employed among blocks. The comparative results demonstrate the superior performance of our MADNet model while employing considerably fewer multiadds and parameters.

17.
IEEE Trans Cybern ; 50(4): 1498-1508, 2020 Apr.
Article in English | MEDLINE | ID: mdl-30507522

ABSTRACT

Collaborative representation is an effective way to design classifiers for many practical applications. In this paper, we propose a novel classifier, called the prior knowledge-based probabilistic collaborative representation-based classifier (PKPCRC), for visual recognition. Compared with existing classifiers which use the collaborative representation strategy, the proposed PKPCRC further includes characteristics of training samples of each class as prior knowledge. Four types of prior knowledge are developed from the perspectives of image distance and representation capacity. They adaptively accommodate the contribution of each class and result in an accurate representation to classify a query sample. Experiments and comparisons on four challenging databases demonstrate that PKPCRC outperforms several state-of-the-art classifiers.

18.
IEEE J Biomed Health Inform ; 21(5): 1338-1346, 2017 09.
Article in English | MEDLINE | ID: mdl-27831894

ABSTRACT

The features used in many current medical image retrieval systems are usually low-level hand-crafted features. This limitation may adversely affect the retrieval performance. To address this problem, this paper proposes a simple yet discriminative feature, called histogram of compressed scattering coefficients (HCSC), for medical image retrieval. In the proposed work, the scattering transform, a particular variation of deep convolutional networks, is first performed to yield more abstract representations of a medical image. A projection operation is then conducted to compress the obtained scattering coefficients for efficient processing. Finally, a bag-of-words (BoW) histogram is derived from the compressed scattering coefficients as the features of the medical image. The proposed HCSC takes the advantages of both scattering transform and BoW model. Experiments on three benchmark medical computer tomography image databases demonstrate that HCSC outperforms several state-of-the-art features.
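
The final bag-of-words step can be sketched as assigning each compressed coefficient vector to its nearest codeword and counting the assignments. The sketch assumes the scattering transform and the projection have already produced the descriptor vectors, and that a codebook has been learned beforehand (for instance by k-means); both function names are hypothetical.

```python
def nearest_codeword(vector, codebook):
    """Index of the codeword with the smallest squared Euclidean distance."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((v - c) ** 2 for v, c in zip(vector, codebook[k])),
    )


def bow_histogram(descriptors, codebook):
    """Normalized histogram of codeword assignments for one image."""
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_codeword(descriptor, codebook)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)


toy_codebook = [[0.0, 0.0], [1.0, 1.0]]
toy_descriptors = [[0.1, 0.0], [0.9, 1.0], [1.0, 0.8], [0.0, 0.2]]
toy_histogram = bow_histogram(toy_descriptors, toy_codebook)
```

The resulting fixed-length histogram is what gets compared between a query image and the database images, regardless of how many descriptors each image produced.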


Subjects
Image Interpretation, Computer-Assisted/methods , Image Processing, Computer-Assisted/methods , Information Storage and Retrieval/methods , Tomography, X-Ray Computed/methods , Algorithms , Databases, Factual , Humans
19.
IEEE Trans Image Process ; 25(11): 5281-92, 2016 11.
Article in English | MEDLINE | ID: mdl-27608464

ABSTRACT

In this paper, we develop a simple yet powerful framework called quaternion-Michelson descriptor (QMD) to extract local features for color image classification. Unlike traditional local descriptors extracted directly from the original (raw) image space, QMD is derived from the Michelson contrast law and the quaternionic representation (QR) of color images. The Michelson contrast is a stable measurement of image contents from the viewpoint of human perception, while QR is able to handle all the color information of the image holistically and to preserve the interactions among different color channels. In this way, QMD integrates the merits of both Michelson contrast and QR. Based on the QMD framework, we further propose two novel quaternionic Michelson contrast binary pattern descriptors from different perspectives. Experiments and comparisons on different color image classification databases demonstrate that the proposed framework and descriptors outperform several state-of-the-art methods.
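
The Michelson contrast of a set of intensities is defined as (I_max - I_min) / (I_max + I_min). The sketch below computes it for scalar intensities in a local neighborhood only; how QMD lifts this measure to the quaternionic representation is not described in the abstract, and the small `eps` guard against division by zero is an added assumption.

```python
def michelson_contrast(intensities, eps=1e-12):
    """Michelson contrast (max - min) / (max + min) of non-negative intensities."""
    if not intensities:
        raise ValueError("need at least one intensity value")
    highest, lowest = max(intensities), min(intensities)
    return (highest - lowest) / (highest + lowest + eps)


flat_patch = michelson_contrast([10, 10, 10])   # uniform patch: no contrast
edge_patch = michelson_contrast([50, 150, 90])  # about 0.5
```

Because the measure is a ratio, scaling all intensities by the same factor leaves it essentially unchanged, which is one reason it is regarded as stable under illumination changes.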

20.
IEEE Trans Image Process ; 25(2): 566-79, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26672041

ABSTRACT

This paper proposes a local descriptor called quaternionic local ranking binary pattern (QLRBP) for color images. Different from traditional descriptors that are extracted from each color channel separately or from vector representations, QLRBP works on the quaternionic representation (QR) of the color image that encodes a color pixel using a quaternion. QLRBP is able to handle all color channels directly in the quaternionic domain and include their relations simultaneously. Applying a Clifford translation to QR of the color image, QLRBP uses a reference quaternion to rank QRs of two color pixels, and performs a local binary coding on the phase of the transformed result to generate local descriptors of the color image. Experiments demonstrate that the QLRBP outperforms several state-of-the-art methods.
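
The quaternionic representation encodes an RGB pixel as the pure quaternion r·i + g·j + b·k. The sketch below shows that encoding, the Hamilton product, and the phase of a quaternion. Treating the Clifford translation as multiplication by a unit reference quaternion is an assumption made for illustration, as are all function names; the abstract does not spell out the exact construction.

```python
import math


def pixel_to_quaternion(r, g, b):
    """Encode an RGB pixel as a pure quaternion (real part 0)."""
    return (0.0, float(r), float(g), float(b))


def qmul(p, q):
    """Hamilton product of quaternions given as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def phase(q):
    """Angle between the real axis and the quaternion, in [0, pi]."""
    real, i, j, k = q
    return math.atan2(math.sqrt(i * i + j * j + k * k), real)


def translated_phase(pixel, reference):
    """Phase after multiplying the pixel quaternion by a unit reference quaternion."""
    return phase(qmul(pixel_to_quaternion(*pixel), reference))


# Assumed reference: the unit pure quaternion along the gray axis.
gray_reference = (0.0, 1 / math.sqrt(3), 1 / math.sqrt(3), 1 / math.sqrt(3))
```

Under this assumed construction, a pixel whose color is parallel to the reference yields a phase of pi, and the phase moves toward pi/2 as the color becomes orthogonal to it, so a single scalar reflects all three channels jointly and can then feed a local binary coding.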


Subjects
Algorithms , Image Processing, Computer-Assisted/methods , Color , Databases, Factual , Face/anatomy & histology , Humans