1.
Annu Int Conf IEEE Eng Med Biol Soc ; 2022: 1036-1040, 2022 07.
Article in English | MEDLINE | ID: mdl-36086289

ABSTRACT

Automatic interpretation of cluster structure in rapidly arriving data streams is essential for timely detection of interesting events. Human activities often contain bursts of repeating patterns. In this paper, we propose a new relative of the Visual Assessment of Cluster Tendency (VAT) model to interpret cluster evolution in streaming activity data where the shapes of recurring patterns are important. Existing VAT algorithms are either suitable only for small batch data and do not scale to rapidly evolving streams, or cannot capture shape patterns. Our proposed incremental algorithm processes streaming data in chunks and identifies repeating patterns, or shapelets, in each chunk, creating a Dictionary-of-Shapes (DoS) that is updated on the fly. Each chunk is transformed into a lower-dimensional representation based on its distance from the shapelets in the current DoS. Then a small set of transformed chunks is sampled using an intelligent Maximin Random Sampling (MMRS) scheme to create a scalable VAT image that is incrementally updated as the data stream progresses. Experiments on two upper limb activity datasets demonstrate that the proposed method can successfully and efficiently visualize clusters in long streams of data and can also identify anomalous movements.
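
The chunk-to-shapelet transform mentioned in the abstract above can be illustrated with a minimal sketch. It assumes one-dimensional chunks and takes the distance to a shapelet to be the smallest Euclidean distance over all equal-length windows of the chunk, which is a common convention in the shapelet literature; the abstract does not state the paper's actual distance, and the function names here are hypothetical.

```python
import numpy as np

def min_subsequence_distance(chunk, shapelet):
    """Smallest Euclidean distance between `shapelet` and any
    equal-length window of `chunk` (assumed distance, not the paper's)."""
    chunk = np.asarray(chunk, dtype=float)
    shapelet = np.asarray(shapelet, dtype=float)
    m = len(shapelet)
    if m > len(chunk):
        raise ValueError("shapelet is longer than the chunk")
    # Every contiguous window of length m, as rows of a 2-D view.
    windows = np.lib.stride_tricks.sliding_window_view(chunk, m)
    return float(np.sqrt(((windows - shapelet) ** 2).sum(axis=1)).min())

def shapelet_transform(chunk, dictionary):
    """Map a chunk to one feature per shapelet in the dictionary."""
    return np.array([min_subsequence_distance(chunk, s) for s in dictionary])

# Toy data: a chunk containing one "bump" and a two-shapelet dictionary.
chunk = [0, 0, 1, 2, 1, 0, 0]
dictionary = [[1, 2, 1], [5, 5]]
features = shapelet_transform(chunk, dictionary)
```

Under this convention a chunk of any length becomes a vector with one entry per dictionary shapelet, so the dimensionality of the representation depends on the dictionary size rather than on the chunk length.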


Subjects
Algorithms, Memory, Cluster Analysis, Humans
2.
IEEE Trans Cybern ; 51(12): 5979-5992, 2021 Dec.
Article in English | MEDLINE | ID: mdl-32203042

ABSTRACT

The widespread use of Internet-of-Things (IoT) technologies, smartphones, and social media services generates huge amounts of data streaming at high velocity. Automatic interpretation of these rapidly arriving data streams is required for the timely detection of interesting events that usually emerge in the form of clusters. This article proposes a new relative of the visual assessment of cluster tendency (VAT) model, which produces a record of structural evolution in the data stream by building a cluster heat map of the entire processing history in the stream. The existing VAT-based algorithms for streaming data, called inc-VAT/inc-iVAT and dec-VAT/dec-iVAT, are not suitable for high-velocity and high-volume streaming data because of high memory requirements and slower processing speed as the accumulated data increases. The scalable iVAT (siVAT) algorithm can handle big batch data, but for streaming data, it needs to be (re)applied every time a new data point arrives, which is not feasible because of the associated computational complexity. To address this problem, we propose an incremental siVAT algorithm, called inc-siVAT, which deals with the streaming data in chunks. It first extracts a small smart sample using an intelligent sampling scheme, called maximin random sampling (MMRS), then incrementally updates the smart sample points on the fly, using our novel incremental MMRS (inc-MMRS) algorithm, to reflect changes in the data stream after each chunk is processed, and finally produces an incrementally built iVAT image of the updated smart sample, using the inc-VAT/inc-iVAT and dec-VAT/dec-iVAT algorithms. These images can be used to visualize the evolving cluster structure and for anomaly detection in streaming data. Our method is illustrated with one synthetic and four real datasets, two of which evolve significantly over time. Our numerical experiments demonstrate the algorithm's ability to successfully identify anomalies and visualize changing cluster structure in streaming data.
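
The abstract above names maximin random sampling (MMRS) without spelling out its steps. The sketch below follows the batch form usually described in the VAT literature: pick a few mutually distant prototypes by maximin, group every point with its nearest prototype, then draw random points from each group in proportion to its size. It is an assumption-laden illustration, not the paper's inc-MMRS; the function names, the Euclidean metric, and the rounding rule for group quotas are choices made here.

```python
import numpy as np

def maximin_indices(X, k, first=0):
    """Pick k mutually distant row indices: each new pick maximises
    its distance to the nearest point already picked."""
    X = np.asarray(X, dtype=float)
    picked = [first]
    nearest = np.linalg.norm(X - X[first], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(nearest))
        picked.append(nxt)
        nearest = np.minimum(nearest, np.linalg.norm(X - X[nxt], axis=1))
    return picked

def mmrs_sample(X, k, n_sample, seed=0):
    """Batch MMRS sketch: maximin prototypes, nearest-prototype groups,
    then size-proportional random draws from each group."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    protos = maximin_indices(X, k)
    # Distance from every point to every prototype, shape (n, k).
    d = np.linalg.norm(X[:, None, :] - X[protos][None, :, :], axis=2)
    labels = d.argmin(axis=1)
    sample = []
    for j in range(k):
        members = np.flatnonzero(labels == j)
        if len(members) == 0:
            continue
        # Assumed quota rule: proportional share, at least one point.
        quota = min(len(members), max(1, round(n_sample * len(members) / len(X))))
        sample.extend(rng.choice(members, size=quota, replace=False).tolist())
    return sorted(sample)

# Toy data: two well-separated groups of three points each.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]])
protos = maximin_indices(X, 2)
sample = mmrs_sample(X, k=2, n_sample=4)
```

Because of the rounding and the one-point minimum per group, the returned sample can differ slightly in size from `n_sample`; on the toy data it contains two points from each group.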


Subjects
Algorithms, Humans
3.
IEEE Trans Cybern ; 49(5): 1629-1641, 2019 May.
Article in English | MEDLINE | ID: mdl-29994745

ABSTRACT

Dunn's internal cluster validity index is used to assess partition quality and subsequently identify a "best" crisp partition of n objects. Computing Dunn's index (DI) for partitions of n p-dimensional feature vectors has quadratic time complexity O(pn²), so its computation is impractical for very large values of n. This note presents six methods for approximating DI. Four methods are based on Maximin sampling, which identifies a skeleton of the full partition that contains some boundary points in each cluster. Two additional methods are presented that estimate boundary points associated with unsupervised training of one-class support vector machines. Numerical examples compare approximations to DI based on all six methods. Four experiments on seven real and synthetic data sets support our assertion that computing approximations to DI with an incremental, neighborhood-based Maximin skeleton is both tractable and reliably accurate.
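
To make the quadratic cost concrete, here is a minimal sketch of the exact index in its classical form: the smallest distance between points in different clusters divided by the largest distance between points in the same cluster. It assumes Euclidean distance and a crisp labelling with at least two clusters, at least one of which has two distinct points; the function name is hypothetical and none of the six approximation methods from the abstract is implemented here.

```python
import numpy as np

def dunn_index(X, labels):
    """Exact Dunn's index for a crisp partition (classical definition).
    Builds the full n-by-n distance matrix, hence O(p * n^2) time."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    # All pairwise Euclidean distances: n*n entries, each costing O(p).
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    same = labels[:, None] == labels[None, :]
    max_diameter = D[same].max()      # largest within-cluster distance
    min_separation = D[~same].min()   # smallest between-cluster distance
    return float(min_separation / max_diameter)

# Toy data: two clusters of diameter 1 whose closest points are 5 apart.
X = [[0, 0], [0, 1], [5, 0], [5, 1]]
labels = [0, 0, 1, 1]
di = dunn_index(X, labels)
```

The n-by-n matrix `D` is what becomes impractical for very large n, which is the cost the sampling-based approximations in the abstract aim to avoid.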
