1.
Algorithms Mol Biol ; 19(1): 11, 2024 Mar 12.
Article in English | MEDLINE | ID: mdl-38475889

ABSTRACT

We introduce a new algorithm for constructing the generalized suffix array of a collection of highly similar strings. As a first step, we construct a compressed representation of the matching statistics of the collection with respect to a reference string. We then use this data structure to distribute suffixes into a partial order, and subsequently to speed up suffix comparisons to complete the generalized suffix array. Our experimental evidence with a prototype implementation (a tool we call sacamats) shows that on string collections with highly similar strings we can construct the suffix array in time competitive with or faster than the fastest available methods. Along the way, we describe a heuristic for fast computation of the matching statistics of two strings, which may be of independent interest.
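To make the two central objects concrete, here is a minimal Python sketch of matching statistics and of a generalized suffix array, both computed naively. It is not the sacamats algorithm and not the heuristic mentioned in the abstract: the function names, the `(position, length)` output convention, and the rule for breaking ties between equal suffixes are assumptions chosen for illustration, and the running time is far from competitive.

```python
def matching_statistics(s, ref):
    """Naive matching statistics of s with respect to ref (illustrative only).

    For each position i in s, return (pos, length) where length is the length
    of the longest prefix of s[i:] that occurs in ref, and pos is the leftmost
    position in ref where such an occurrence starts (-1 if length is 0).
    """
    ms = []
    for i in range(len(s)):
        best_len, best_pos = 0, -1
        for j in range(len(ref)):
            k = 0
            while i + k < len(s) and j + k < len(ref) and s[i + k] == ref[j + k]:
                k += 1
            if k > best_len:
                best_len, best_pos = k, j
        ms.append((best_pos, best_len))
    return ms


def generalized_suffix_array(strings):
    """Naive generalized suffix array: all (string index, offset) pairs,
    sorted by the suffix they denote.

    Assumption: equal suffixes from different strings are ordered by string
    index; real implementations usually settle this with terminator symbols.
    """
    suffixes = [(d, i) for d, s in enumerate(strings) for i in range(len(s))]
    return sorted(suffixes, key=lambda p: (strings[p[0]][p[1]:], p[0]))
```

For example, `matching_statistics("abca", "bcab")` reports that the suffix starting at position 1 of the first string (`"bca"`) matches three characters of the reference starting at position 0. Long matches of this kind are what make highly similar collections cheap to sort, since many suffix comparisons can be settled from the reference alone.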

2.
IEEE Trans Pattern Anal Mach Intell ; 45(12): 14131-14143, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37549079

ABSTRACT

In this work, we concentrate on the detection of anomalous behaviors in systems operating in the physical world and for which it is usually not possible to have a complete set of all possible anomalies in advance. We present a data augmentation and retraining approach based on adversarial learning for improving anomaly detection. In particular, we first define a method for generating adversarial examples for anomaly detectors based on Hidden Markov Models (HMMs). Then, we present a data augmentation and retraining technique that uses these adversarial examples to improve anomaly detection performance. Finally, we evaluate our adversarial data augmentation and retraining approach on four datasets showing that it achieves a statistically significant performance improvement and enhances the robustness to adversarial attacks. Key differences from the state-of-the-art on adversarial data augmentation are the focus on multivariate time series (as opposed to images), the context of one-class classification (in contrast to standard multi-class classification), and the use of HMMs (in contrast to neural networks).
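As background for the HMM-based detectors mentioned above, the following Python sketch scores a sequence by its negative log-likelihood per time step under a fixed HMM, computed with the scaled forward algorithm. It is a toy under stated assumptions: the emissions are discrete symbols (the abstract concerns multivariate time series), the model parameters are invented for the example, and neither the adversarial example generation nor the retraining step of the paper is shown.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm for a discrete-emission HMM.

    obs   : list of symbol indices
    start : start[s]    = P(first state is s)
    trans : trans[r][s] = P(next state is s | current state is r)
    emit  : emit[s][o]  = P(symbol o | state s)
    Returns log P(obs), or -inf if the sequence is impossible under the model.
    """
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            o = obs[t]
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][o]
                for s in range(n)
            ]
        total = sum(alpha)
        if total == 0.0:
            return float("-inf")
        loglik += math.log(total)
        alpha = [a / total for a in alpha]  # rescale to avoid underflow
    return loglik


def anomaly_score(obs, start, trans, emit):
    """Negative log-likelihood per time step; higher means less nominal."""
    return -log_likelihood(obs, start, trans, emit) / len(obs)


# Hypothetical two-state model of "nominal" behavior with sticky states.
START = [0.5, 0.5]
TRANS = [[0.9, 0.1], [0.1, 0.9]]
EMIT = [[0.9, 0.1], [0.1, 0.9]]

steady = anomaly_score([0, 0, 0, 0, 0, 0], START, TRANS, EMIT)
erratic = anomaly_score([0, 1, 0, 1, 0, 1], START, TRANS, EMIT)
```

Under this hypothetical model the rapidly alternating sequence receives a higher score than the steady one, so a threshold on the score separates them. In a one-class setting such a threshold would be chosen from nominal data only.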

3.
Data Brief ; 30: 105436, 2020 Jun.
Article in English | MEDLINE | ID: mdl-32258287

ABSTRACT

Sensor data generated by intelligent systems, such as autonomous robots, smart buildings and other systems based on artificial intelligence, represent valuable sources of knowledge in today's data-driven society, since they contain information about the situations these systems face during their operation. These data are usually multivariate time series since modern technologies enable the simultaneous acquisition of multiple signals during long periods of time. In this paper we present a dataset containing sensor traces of six data acquisition campaigns performed by autonomous aquatic drones involved in water monitoring. A total of 5.6 h of navigation is available, with data coming from both lakes and rivers, and from different locations in Italy and Spain. The monitored variables concern both the internal state of the drone (e.g., battery voltage, GPS position and signals to propellers) and the state of the water (e.g., temperature, dissolved oxygen and electrical conductivity). Data were collected in the context of the EU-funded Horizon 2020 project INTCATCH (http://www.intcatch.eu) which aims to develop a new paradigm for monitoring water quality of catchments. The aquatic drones used for data acquisition are Platypus Lutra boats. Both autonomous and manual driving are used in different parts of the navigation. The dataset is analyzed in the paper "Time series segmentation for state-model generation of autonomous aquatic drones: A systematic framework" [1] by means of recent time series clustering/segmentation techniques to extract data-driven models of the situations faced by the drones in the data acquisition campaigns. These data have strong potential for reuse in other kinds of data analysis and evaluation of machine learning methods on real-world datasets [2]. Moreover, we consider this dataset valuable also for the variety of situations faced by the drone, from which machine learning techniques can learn behavioral patterns or detect anomalous activities. We also provide manual labeling for some known states of the drones, such as drone inside/outside the water, upstream/downstream navigation, manual/autonomous drive, and drone turning, that represent a ground truth for validation purposes. Finally, the real-world nature of the dataset makes it more challenging for machine learning methods because it contains noisy samples collected while the drone was exposed to atmospheric agents and uncertain water flow conditions.
