Results 1 - 14 of 14
1.
IEEE Trans Image Process ; 33: 3590-3605, 2024.
Article in English | MEDLINE | ID: mdl-38819968

ABSTRACT

In this paper, we propose a novel framework for multi-person pose estimation and tracking in challenging scenarios. In view of the occlusions and motion blur that hinder pose tracking, we propose to model humans as graphs and to perform pose estimation and tracking by concentrating on the visible parts of human bodies, which are informative about complete skeletons under incomplete observations. Specifically, the proposed framework involves three parts: (i) a Sparse Key-point Flow Estimating Module (SKFEM) and a Hierarchical Graph Distance Minimizing Module (HGMM) estimate pixel-level and human-level motion, respectively; (ii) pixel-level appearance consistency and human-level structural consistency are combined to measure the visibility scores of body joints. The scores guide the pose estimator to predict complete skeletons by observing high-visibility parts, under the assumption that visible and invisible parts are inherently correlated in human part graphs. The pose estimator is iteratively fine-tuned to achieve this capability; (iii) multiple historical frames are combined to benefit tracking, which is implemented using HGMM. The proposed approach not only achieves state-of-the-art performance on PoseTrack datasets but also contributes to significant improvements in other tasks such as human-related anomaly detection.

2.
Comput Biol Med ; 139: 104989, 2021 12.
Article in English | MEDLINE | ID: mdl-34739969

ABSTRACT

Insomnia is one of the most common sleep disorders; it can dramatically impair quality of life and negatively affect an individual's physical and mental health. Recently, various deep learning based methods have been proposed for automatic and objective insomnia detection, owing to the great success of deep learning techniques. However, due to the scarcity of public insomnia data, a deep learning model trained on a dataset with a small number of insomnia subjects may generalize poorly, which eventually limits the performance of insomnia detection. Meanwhile, a number of public EEG datasets have been collected from large numbers of healthy subjects for various sleep research tasks such as sleep staging. Therefore, to utilize such abundant EEG datasets to address the data scarcity issue in insomnia detection, in this paper we propose a domain adaptation based model that better extracts insomnia related features of the target domain by leveraging stage annotations from the source domain. For each domain, two pairs of common encoder and private encoder are first trained to extract sleep related features and sleep irrelevant features, respectively. To further discriminate between the source domain and the target domain, a domain classifier is introduced. Then, the common encoder of the target domain is used together with a Long Short Term Memory (LSTM) network for insomnia detection. To the best of our knowledge, this is the first deep learning based domain adaptation model that uses single channel raw EEG signals to detect insomnia at the subject level. We use the Montreal Archive of Sleep Studies (MASS) dataset, which contains only healthy subjects, as the source domain and two datasets, which contain both healthy and insomnia subjects, as target domains to validate our model's generalizability. Experimental results on the two target domain datasets (a public one and an in-house one) demonstrate that our model generalizes well across target domain datasets with different sampling rates. In particular, our proposed method improves insomnia detection accuracy from 50.0% to 90.9% and from 66.7% to 79.2% on the two target domain datasets, respectively.


Subjects
Sleep Initiation and Maintenance Disorders; Electroencephalography; Humans; Polysomnography; Sleep; Sleep Stages
3.
IEEE J Biomed Health Inform ; 24(10): 2833-2843, 2020 10.
Article in English | MEDLINE | ID: mdl-32149700

ABSTRACT

Sleep staging scores the sleep state of a subject into different sleep stages such as Wake and Rapid Eye Movement (REM). It plays an indispensable role in the diagnosis and treatment of sleep disorders. As manual sleep staging by well-trained sleep experts is time consuming, tedious, and subjective, many automatic methods have been developed for accurate, efficient, and objective sleep staging. Recently, deep learning based methods have been successfully proposed for electroencephalogram (EEG) based sleep staging with promising results. However, most of these methods directly take raw EEG signals as input to convolutional neural networks (CNNs) without considering the domain knowledge of EEG staging. Apart from that, to capture temporal information, most existing methods utilize recurrent neural networks such as LSTM (Long Short Term Memory), which are not effective for modelling global temporal context and are difficult to train. Therefore, inspired by clinical guidelines for sleep staging such as the AASM (American Academy of Sleep Medicine) rules, where different stages are generally characterized by EEG waveforms of various frequencies, we propose a multi-scale deep architecture that decomposes an EEG signal into different frequency bands as input to CNNs. To model global temporal context, we utilize the multi-head self-attention module of the transformer model, which not only improves performance but also shortens the training time. In addition, we choose a residual based architecture, which makes training end-to-end. Experimental results on two widely used sleep staging datasets, the Montreal Archive of Sleep Studies (MASS) and sleep-EDF datasets, demonstrate the effectiveness and significant efficiency (up to 12 times less training time) of our proposed method over the state-of-the-art.


Subjects
Electroencephalography/methods; Neural Networks, Computer; Signal Processing, Computer-Assisted; Sleep Stages/physiology; Adult; Aged; Databases, Factual; Deep Learning; Female; Humans; Male; Middle Aged; Young Adult
4.
IEEE Trans Neural Syst Rehabil Eng ; 27(5): 963-973, 2019 05.
Article in English | MEDLINE | ID: mdl-30998471

ABSTRACT

Developmental coordination disorder (DCD) is a type of motor learning difficulty that affects five to six percent of school-aged children and may have a negative impact on their lives. Timely and objective diagnosis of DCD is important for the success of intervention. Present evaluation methods for DCD rely heavily on observational analysis by occupational therapists and physiotherapists, who score children's performance on designed tasks. However, these methods are expensive, subjective, and not easy to extend to a larger population. A fine motor evaluation system (FMES) is proposed that uses cameras capturing two views to record children's performance as they carry out three fine motor tasks. Automated algorithms are developed to score fine motor skill. They include task localization and individual task evaluation. The purpose of task localization is to detect each task and extract the segments belonging to each task from the original video, which includes multiple segments of different tasks. A convolutional neural network with temporal filtering is used for frame-wise classification, and a boundary localization algorithm is proposed to localize each task segment. For individual task evaluation, the extracted video segments of task 1 and task 2 are evaluated based on the proposed feature extraction and time positioning algorithm, and the paper drawings of task 3 are evaluated based on image processing. The proposed methods are validated on a diverse population of children with or without DCD by comparing automated scoring with manual scoring from a professional evaluator. The experimental results suggest that the proposed methods can effectively achieve fine motor evaluation for DCD assessment. Moreover, our system is a low-cost solution, and the evaluation methods developed are automated, objective, and suited to large population evaluation and analysis.


Subjects
Motor Skills Disorders/diagnosis; Motor Skills; Psychomotor Performance/physiology; Algorithms; Child; Female; Humans; Male; Motor Skills Disorders/physiopathology; Neural Networks, Computer; Reproducibility of Results; Video Recording
5.
Article in English | MEDLINE | ID: mdl-30571629

ABSTRACT

In this paper, a deep learning model with an optimal capacity is proposed to improve the performance of person part segmentation. Previous efforts in optimizing the capacity of a CNN model suffer from a lack of large datasets as well as the over-dependence on a single-modality CNN which is not effective in learning. We make several efforts in addressing these problems. Firstly, other datasets are utilized to train a CNN module for pre-processing image data and a segmentation performance improvement is achieved without a time-consuming annotation process. Secondly, we propose a novel way of integrating two complementary modules to enrich the feature representations for more reliable inferences. Thirdly, the factors to determine the capacity of a CNN model are studied and two novel methods are proposed to adjust (optimize) the capacity of a CNN to match it to the complexity of a task. The over-fitting and under-fitting problems are eased by using our methods. Experimental results show that our model outperforms the state-of-the-art deep learning models with a better generalization ability and a lower computational complexity.

6.
J Healthc Eng ; 2018: 7692198, 2018.
Article in English | MEDLINE | ID: mdl-29854365

ABSTRACT

Strabismus is one of the most common vision diseases and can cause amblyopia and even permanent vision loss. Timely diagnosis is crucial for treating strabismus effectively. In contrast to manual diagnosis, automatic recognition can significantly reduce labor cost and increase diagnosis efficiency. In this paper, we propose to recognize strabismus using eye-tracking data and convolutional neural networks. In particular, an eye tracker is first used to record a subject's eye movements. A gaze deviation (GaDe) image is then proposed to characterize the subject's eye-tracking data according to the accuracies of gaze points. The GaDe image is fed to a convolutional neural network (CNN) that has been trained on a large image database called ImageNet. The outputs of the fully connected layers of the CNN are used as the GaDe image's features for strabismus recognition. A dataset containing eye-tracking data of both strabismic subjects and normal subjects is established for experiments. Experimental results demonstrate that the natural image features transfer well to representing eye-tracking data, and that strabismus can be effectively recognized by our proposed method.


Subjects
Amblyopia/diagnostic imaging; Diagnosis, Computer-Assisted/methods; Eye Movements; Neural Networks, Computer; Strabismus/diagnostic imaging; Adult; Algorithms; Computer Simulation; Databases, Factual; Female; Humans; Male; Middle Aged; Pattern Recognition, Automated; User-Computer Interface
7.
Healthc Technol Lett ; 5(1): 1-6, 2018 Feb.
Article in English | MEDLINE | ID: mdl-29515809

ABSTRACT

Strabismus is one of the most common vision disorders in preschool children. It can cause amblyopia and even permanent vision loss. Beyond being a vision problem, strabismus has serious negative impacts on the daily life, education, and employment of both children and adults. Timely diagnosis of strabismus is thus crucial. However, traditional diagnosis methods conducted by ophthalmologists rely significantly on their experience, making the diagnosis results subjective. Those methods are also inconvenient to use for strabismus examination in large communities such as schools. In light of that, in this Letter, the authors develop an objective, digital and automatic system based on eye tracking for diagnosing strabismus. The system uses eye tracking to acquire a person's eye gaze data while he or she is looking at some targets. A group of features is proposed to characterise the gaze data. The person's strabismus condition can be diagnosed according to the features. A strabismus gaze dataset is built using the system. Experimental results on the dataset demonstrate the effectiveness of the proposed system for strabismus diagnosis.

8.
IEEE Trans Cybern ; 44(8): 1249-58, 2014 Aug.
Article in English | MEDLINE | ID: mdl-24919206

ABSTRACT

Saliency models have been developed and widely demonstrated to benefit applications in computer vision and image understanding. In most existing models, saliency is evaluated within an individual image. That is, the saliency value of an item (object/region/pixel) represents its conspicuity compared with the remaining items in the same image. We call this absolute saliency, which is not comparable among images. However, saliency should be determined in the context of multiple images for some visual inspection tasks. For example, in yarn surface evaluation, the saliency of a yarn image should be measured with regard to a set of graded standard images. We call this relative saliency, which is comparable among images. In this paper, a visual attention model for the comparison of multiple images is explored, and a relative saliency model of multiple images is proposed based on a combination of bottom-up and top-down mechanisms, to enable relative saliency evaluation in cases where other image contents are involved. To fully characterize the differences among multiple images, a structural feature extraction strategy is proposed, in which two levels of feature (high-level, low-level) and three types of feature (global, local-local, local-global) are extracted. Mapping functions between features and saliency values are constructed, and their outputs reflect relative saliency for multi-image contents instead of single image content. The performance of the proposed relative saliency model is demonstrated in a yarn surface evaluation. Furthermore, eye tracking is employed to verify the proposed concept of relative saliency for multiple images.

9.
Int J Neural Syst ; 18(3): 195-205, 2008 Jun.
Article in English | MEDLINE | ID: mdl-18595149

ABSTRACT

Image classification is a challenging problem in organizing a large image database, and an effective method for this purpose is still under investigation. This paper presents a method based on wavelet analysis to extract features for image classification. After an image is decomposed by a wavelet transform, the statistics of its features can be obtained from the distribution of histograms of wavelet coefficients, which are respectively projected onto two orthogonal axes, i.e., the x and y directions. The nodes of the tree representation of images can therefore be represented by the distribution. The high level features are described in a low dimensional space of 16 attributes, so the computational complexity is significantly decreased. A total of 2,800 images derived from seven categories are used in experiments. Half of the images were used for training a neural network and the other half for testing. The features extracted by wavelet analysis and the conventional features are used in the experiments to demonstrate the efficacy of the proposed method. The classification rate on the training data set with wavelet analysis is up to 91%, and the classification rate on the testing data set reaches 89%. Experimental results show that our proposed approach for image classification is more effective.


Subjects
Biometry/methods; Image Interpretation, Computer-Assisted; Information Storage and Retrieval/methods; Neural Networks, Computer; Algorithms; Pattern Recognition, Automated
10.
IEEE Trans Neural Netw ; 16(3): 721-32, 2005 May.
Article in English | MEDLINE | ID: mdl-15940999

ABSTRACT

This paper proposes new modified constrained learning neural root finders (NRFs) for polynomials, constructed with a backpropagation network (BPN). The technique is based on the relationships between the roots and the coefficients of a polynomial, as well as between the root moments and the coefficients of the polynomial. We investigated the different constrained learning algorithms (CLAs) that result from variants of the error cost functions (ECFs) in the constrained BPN and derived a new modified CLA (MCLA). We found that the computational complexities of the CLA and the MCLA based on the root-moment method (RMM) are of the order of the polynomial, and that the MCLA is simpler than the CLA. Further, we discussed the effects of the different parameters of the CLA and the MCLA on the NRFs. In particular, since the coefficients of polynomials encountered in practice may be perturbed by noise sources, we also evaluated and discussed the effects of noise on the two NRFs. Finally, to demonstrate the advantage of our neural approaches over the nonneural ones, a series of simulation experiments is conducted.
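The two relationships named in this abstract, roots to coefficients and root moments to coefficients, are classical algebra (Vieta's formulas and Newton's identities). The following is a minimal sketch of those identities only, not of the neural root finders themselves; the function names are hypothetical and the polynomial is assumed to be monic.

```python
def coeffs_from_roots(roots):
    """Coefficients [1, a1, ..., an] of the monic polynomial with the given roots.

    Built by multiplying out (x - r) one root at a time (Vieta's formulas).
    """
    coeffs = [1.0]
    for r in roots:
        extended = coeffs + [0.0]
        for i in range(1, len(extended)):
            extended[i] -= r * coeffs[i - 1]
        coeffs = extended
    return coeffs


def root_moments_from_coeffs(coeffs):
    """Root moments m_k = sum(root**k), k = 1..n, from monic coefficients.

    Uses Newton's identities:
        m_k + a1*m_(k-1) + ... + a_(k-1)*m_1 + k*a_k = 0
    so the moments are obtained without finding the roots.
    """
    n = len(coeffs) - 1
    moments = []
    for k in range(1, n + 1):
        acc = k * coeffs[k]
        for i in range(1, k):
            acc += coeffs[i] * moments[k - i - 1]
        moments.append(-acc)
    return moments
```

For roots 1, 2 and 3, the first function gives the coefficients of x^3 - 6x^2 + 11x - 6, and the second recovers the moments 6, 14 and 36 from those coefficients alone.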


Subjects
Algorithms; Neural Networks, Computer; Numerical Analysis, Computer-Assisted; Signal Processing, Computer-Assisted; Computer Simulation; Feedback; Stochastic Processes
11.
J Zhejiang Univ Sci ; 5(7): 764-72, 2004 Jul.
Article in English | MEDLINE | ID: mdl-15495304

ABSTRACT

Flower image retrieval is a very important step for computer-aided plant species recognition. In this paper, we propose an efficient segmentation method based on color clustering and domain knowledge to extract flower regions from flower images. For flower retrieval, we use the color histogram of a flower region to characterize the color features of a flower, and two shape-based feature sets, Centroid-Contour Distance (CCD) and Angle Code Histogram (ACH), to characterize the shape features of a flower contour. Experimental results showed that our flower region extraction method based on color clustering and domain knowledge can produce accurate flower regions. Flower retrieval results on a database of 885 flower images collected from 14 plant species showed that our Region-of-Interest (ROI) based retrieval approach using both color and shape features can perform better than a method based on the global color histogram proposed by Swain and Ballard (1991) and a method based on domain knowledge-driven segmentation and color names proposed by Das et al. (1999).
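As a rough illustration of the Centroid-Contour Distance feature mentioned in this abstract, the sketch below computes the distance from a contour's centroid to each contour point. The abstract does not give the details, so several choices here are assumptions: the centroid is taken as the mean of the contour points, the distances are divided by the largest one for scale invariance, and the function name is hypothetical.

```python
import math


def centroid_contour_distance(contour):
    """Normalized centroid-to-contour distances for a list of (x, y) points.

    Assumptions (not taken from the paper): the centroid is the mean of the
    contour points, and distances are scaled so that the largest equals 1.
    """
    if not contour:
        raise ValueError("contour must contain at least one point")
    cx = sum(x for x, _ in contour) / len(contour)
    cy = sum(y for _, y in contour) / len(contour)
    distances = [math.hypot(x - cx, y - cy) for x, y in contour]
    longest = max(distances)
    if longest == 0:
        # Degenerate contour: every point coincides with the centroid.
        return [0.0] * len(distances)
    return [d / longest for d in distances]
```

For the four corners of a square, every corner is equally far from the centroid, so the resulting curve is flat; an elongated petal-like contour would instead produce pronounced peaks.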


Subjects
Algorithms; Colorimetry/methods; Flowers/anatomy & histology; Flowers/classification; Image Interpretation, Computer-Assisted/methods; Pattern Recognition, Automated/methods; Photography/methods; Artificial Intelligence; Cluster Analysis; Color; Image Enhancement/methods; Information Storage and Retrieval/methods; Numerical Analysis, Computer-Assisted; Reproducibility of Results; Sensitivity and Specificity; Signal Processing, Computer-Assisted
12.
IEEE Trans Med Imaging ; 23(4): 426-32, 2004 Apr.
Article in English | MEDLINE | ID: mdl-15084068

ABSTRACT

It is well known that 40%-50% of hepatocellular carcinomas (HCC) do not show increased 18F-fluorodeoxyglucose (FDG) uptake. Recent research studies have demonstrated that 11C-acetate may be a complementary tracer to FDG in positron emission tomography (PET) imaging of HCC in the liver. Quantitative dynamic modeling is, therefore, conducted to evaluate the kinetic characteristics of this tracer in HCC and nontumor liver tissue. A three-compartment model consisting of four parameters with dual inputs is proposed and compared with a five-parameter model. Twelve regions of dynamic datasets of the liver extracted from six patients are used to test the models. The adequacy of these models is assessed statistically using the Akaike Information Criteria (AIC) and Schwarz Criteria (SC). The forward clearance K = K1 * k3/(k2 + k3) is estimated and defined as a new parameter called the local hepatic metabolic rate-constant of acetate (LHMRAct), using both the weighted nonlinear least squares (NLS) and the linear Patlak methods. Preliminary results show that the LHMRAct of the HCC is significantly higher than that of the nontumor liver tissue. These model parameters provide quantitative evidence and understanding of the kinetic basis of 11C-acetate for its potential role in the imaging of HCC using PET.
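The forward clearance formula in this abstract can be evaluated directly once the rate constants have been fitted. The sketch below does that and adds the common least-squares forms of the AIC and the Schwarz criterion for comparing two fitted models. The function names and the rate-constant values in the usage note are hypothetical, and the unweighted forms of the criteria are an assumption; the paper may use weighted variants.

```python
import math


def forward_clearance(K1, k2, k3):
    """Forward clearance K = K1 * k3 / (k2 + k3), as stated in the abstract."""
    denominator = k2 + k3
    if denominator <= 0:
        raise ValueError("k2 + k3 must be positive")
    return K1 * k3 / denominator


def akaike_criterion(rss, n_samples, n_params):
    """AIC in its common least-squares form: N*ln(RSS/N) + 2*P (assumed form)."""
    return n_samples * math.log(rss / n_samples) + 2 * n_params


def schwarz_criterion(rss, n_samples, n_params):
    """Schwarz criterion in least-squares form: N*ln(RSS/N) + P*ln(N) (assumed form)."""
    return n_samples * math.log(rss / n_samples) + n_params * math.log(n_samples)
```

With illustrative values K1 = 0.8, k2 = 0.3 and k3 = 0.1, the forward clearance is 0.8 * 0.1 / 0.4 = 0.2. Under both criteria a lower score favors a model, so a five-parameter model is preferred over a four-parameter one only if it reduces the residual sum of squares enough to offset the penalty for its extra parameter.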


Subjects
Acetates/pharmacokinetics; Carbon/pharmacokinetics; Carcinoma, Hepatocellular/diagnostic imaging; Carcinoma, Hepatocellular/metabolism; Image Interpretation, Computer-Assisted/methods; Liver/diagnostic imaging; Liver/metabolism; Models, Biological; Algorithms; Computer Simulation; Humans; Liver Function Tests/methods; Liver Neoplasms/diagnostic imaging; Liver Neoplasms/metabolism; Radioisotope Dilution Technique; Radiopharmaceuticals/pharmacokinetics; Reproducibility of Results; Sensitivity and Specificity; Tomography, Emission-Computed/methods
13.
IEEE Trans Neural Netw ; 14(4): 781-93, 2003.
Article in English | MEDLINE | ID: mdl-18238059

ABSTRACT

Many researchers have explored the use of neural-network representations for the adaptive processing of data structures. One of the most popular learning formulations of data structure processing is backpropagation through structure (BPTS). The BPTS algorithm has been successfully applied to a number of learning tasks that involve structural patterns, such as logo and natural scene classification. The main limitations of the BPTS algorithm are its slow convergence speed and the long-term dependency problem in the adaptive processing of data structures. In this paper, an improved algorithm is proposed to solve these problems. The idea of this algorithm is to optimize the free learning parameters of the neural network in the node representation by using least-squares-based optimization methods in a layer-by-layer fashion. Not only can fast convergence speed be achieved, but the long-term dependency problem can also be overcome, since the vanishing of gradient information is avoided when our approach is applied to very deep tree structures.

14.
IEEE Trans Image Process ; 11(6): 636-43, 2002.
Article in English | MEDLINE | ID: mdl-18244662

ABSTRACT

Image quality assessment is an important issue addressed in various image processing applications such as image/video compression and image reconstruction. The peak signal-to-noise ratio (PSNR) with the L2-metric is commonly used in objective image quality assessment. However, the measure does not agree very well with human visual perception in many cases. A fuzzy image metric (FIM) is defined based on Sugeno's (1977) fuzzy integral. This new objective image metric, which is to some extent a proper evaluation from the viewpoint of the judgment procedure, closely approximates the subjective mean opinion score (MOS) with a correlation coefficient of about 0.94, as compared to 0.82 obtained using the PSNR. We also demonstrate that using the proposed FIM in place of the L2-metric achieves better performance in fractal coding.
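For reference, the PSNR baseline that this abstract compares against can be sketched as follows. This shows only the standard PSNR built on the mean squared (L2) error, not the proposed fuzzy image metric, whose definition is not given in the abstract. Images are assumed to be flattened lists of pixel intensities with a peak value of 255, and the function names are hypothetical.

```python
import math


def mean_squared_error(reference, distorted):
    """Mean squared error between two equally sized, flattened images."""
    if not reference or len(reference) != len(distorted):
        raise ValueError("images must be non-empty and of equal size")
    total = sum((r - d) ** 2 for r, d in zip(reference, distorted))
    return total / len(reference)


def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio in decibels: 10 * log10(peak**2 / MSE)."""
    error = mean_squared_error(reference, distorted)
    if error == 0:
        # Identical images: the ratio is unbounded.
        return math.inf
    return 10.0 * math.log10(peak ** 2 / error)
```

Because PSNR depends only on the average squared pixel difference, two distortions with the same MSE receive the same score even when one is far more visible than the other, which is the mismatch with human perception that the abstract describes.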
