Search | BVS Regional Portal (test)

A bag-of-paths framework for network data analysis.

Françoisse, Kevin; Kivimäki, Ilkka; Mantrach, Amin; Rossi, Fabrice; Saerens, Marco.

Neural Netw ; 90: 90-111, 2017 Jun.

Article in English | MEDLINE | ID: mdl-28458082

ABSTRACT

This work develops a generic framework, called the bag-of-paths (BoP), for link and network data analysis. The central idea is to assign a probability distribution on the set of all paths in a network. More precisely, a Gibbs-Boltzmann distribution is defined over a bag of paths in a network, that is, on a representation that considers all paths independently. We show that, under this distribution, the probability of drawing a path connecting two nodes can easily be computed in closed form by simple matrix inversion. This probability captures a notion of relatedness, or more precisely accessibility, between nodes of the graph: two nodes are considered as highly related when they are connected by many, preferably low-cost, paths. As an application, two families of distances between nodes are derived from the BoP probabilities. Interestingly, the second distance family interpolates between the shortest-path distance and the commute-cost distance. In addition, it extends the Bellman-Ford formula for computing the shortest-path distance in order to integrate sub-optimal paths (exploration) by simply replacing the minimum operator by the soft minimum operator. Experimental results on semi-supervised classification tasks show that both of the new distance families are competitive with other state-of-the-art approaches. In addition to the distance measures studied in this paper, the bag-of-paths framework enables straightforward computation of many other relevant network measures.
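The closed-form computation and the soft minimum operator mentioned in the abstract above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `bag_of_paths_probabilities` and `softmin`, the parameter `theta` (an inverse temperature), the use of a natural random walk as reference distribution, and the toy graph are all hypothetical choices made here. The construction W = P_ref ∘ exp(-theta C), Z = (I - W)^{-1}, normalized by the sum of its entries, is one common way of writing a Gibbs-Boltzmann weighting over paths and may differ in detail from the paper.

```python
import numpy as np


def bag_of_paths_probabilities(A, C, theta):
    """Sketch: probability of drawing a path from node i to node j.

    A     : (n, n) non-negative adjacency matrix (assumed connected graph)
    C     : (n, n) non-negative edge costs (ignored where A is zero)
    theta : assumed inverse-temperature parameter, theta > 0
    """
    A = np.asarray(A, dtype=float)
    C = np.asarray(C, dtype=float)
    n = A.shape[0]
    # Reference transition probabilities of a natural random walk (assumption).
    P_ref = A / A.sum(axis=1, keepdims=True)
    # Penalize each transition by its cost: low-cost edges keep more weight.
    W = P_ref * np.exp(-theta * C)
    # Sum over paths of all lengths: I + W + W^2 + ... = (I - W)^{-1}.
    Z = np.linalg.inv(np.eye(n) - W)
    # Normalize so that the entries form a joint distribution over (i, j).
    return Z / Z.sum()


def softmin(x, theta, q=None):
    """Soft minimum: -(1/theta) * log(sum_k q_k * exp(-theta * x_k)).

    q defaults to uniform weights (an assumption). For large theta the
    result approaches min(x); for small theta it approaches the q-weighted mean.
    """
    x = np.asarray(x, dtype=float)
    if q is None:
        q = np.full(x.shape, 1.0 / x.size)
    m = x.min()  # shift by the minimum for numerical stability
    return m - np.log(np.sum(q * np.exp(-theta * (x - m)))) / theta


# Toy undirected graph with unit cost on every edge (illustrative data only).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
C = (A > 0).astype(float)

Pi = bag_of_paths_probabilities(A, C, theta=1.0)
```

Replacing `min` with `softmin` in a Bellman-Ford style recurrence is what lets sub-optimal paths contribute to the distance; with a large `theta` the two operators nearly coincide.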

Subjects

Neural Networks, Computer; Probability; Statistics as Topic/methods; Algorithms

An experimental investigation of kernels on graphs for collaborative recommendation and semisupervised classification.

Fouss, François; Francoisse, Kevin; Yen, Luh; Pirotte, Alain; Saerens, Marco.

Neural Netw ; 31: 53-72, 2012 Jul.

Article in English | MEDLINE | ID: mdl-22497802

ABSTRACT

This paper presents a survey as well as an empirical comparison and evaluation of seven kernels on graphs and two related similarity matrices, that we globally refer to as "kernels on graphs" for simplicity. They are the exponential diffusion kernel, the Laplacian exponential diffusion kernel, the von Neumann diffusion kernel, the regularized Laplacian kernel, the commute-time (or resistance-distance) kernel, the random-walk-with-restart similarity matrix, and finally, a kernel first introduced in this paper (the regularized commute-time kernel) and two kernels defined in some of our previous work and further investigated in this paper (the Markov diffusion kernel and the relative-entropy diffusion matrix). The kernel-on-graphs approach is simple and intuitive. It is illustrated by applying the nine kernels to a collaborative-recommendation task, viewed as a link prediction problem, and to a semisupervised classification task, both on several databases. The methods compute proximity measures between nodes that help study the structure of the graph. Our comparisons suggest that the regularized commute-time and the Markov diffusion kernels perform best on the investigated tasks, closely followed by the regularized Laplacian kernel.
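To make the kernel-on-graphs idea above concrete, the following Python sketch computes three of the listed kernels for a small undirected graph. It assumes the commonly used definitions (regularized Laplacian kernel (I + alpha L)^{-1}, commute-time kernel as the pseudoinverse of the Laplacian L, regularized commute-time kernel (D - alpha A)^{-1}); the function name `graph_kernels`, the parameter `alpha`, and the toy adjacency matrix are hypothetical and chosen here for illustration only.

```python
import numpy as np


def graph_kernels(A, alpha=0.5):
    """Sketch of three kernels on an undirected graph.

    A     : (n, n) symmetric, non-negative adjacency matrix
    alpha : assumed regularization parameter, 0 < alpha < 1
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    D = np.diag(A.sum(axis=1))  # degree matrix
    L = D - A                   # combinatorial graph Laplacian
    return {
        # (I + alpha * L)^{-1}
        "regularized_laplacian": np.linalg.inv(np.eye(n) + alpha * L),
        # Moore-Penrose pseudoinverse of L
        "commute_time": np.linalg.pinv(L),
        # (D - alpha * A)^{-1}
        "regularized_commute_time": np.linalg.inv(D - alpha * A),
    }


# Toy graph: a triangle (nodes 0, 1, 2) with a pendant node 3 attached to node 2.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

kernels = graph_kernels(A, alpha=0.5)
# Row i of a kernel matrix scores how close every node is to node i.
scores_from_node_0 = kernels["regularized_commute_time"][0]
```

In a recommendation or semisupervised setting, rows of such a matrix would be used to rank candidate nodes or to propagate labels; which kernel works best is an empirical question, as the abstract notes.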

Subjects

Databases, Factual/classification; Markov Chains; Statistics as Topic/classification; Random Allocation

The sum-over-paths covariance kernel: a novel covariance measure between nodes of a directed graph.

Mantrach, Amin; Yen, Luh; Callut, Jerome; Francoisse, Kevin; Shimbo, Masashi; Saerens, Marco.

IEEE Trans Pattern Anal Mach Intell ; 32(6): 1112-26, 2010 Jun.

Article in English | MEDLINE | ID: mdl-20431135

ABSTRACT

This work introduces a link-based covariance measure between the nodes of a weighted directed graph, where a cost is associated with each arc. To this end, a probability distribution on the (usually infinite) countable set of paths through the graph is defined by minimizing the total expected cost between all pairs of nodes while fixing the total relative entropy spread in the graph. This results in a Boltzmann distribution on the set of paths such that long (high-cost) paths occur with a low probability while short (low-cost) paths occur with a high probability. The sum-over-paths (SoP) covariance measure between nodes is then defined according to this probability distribution: two nodes are considered highly correlated if they often co-occur on the same, preferably short, paths. The resulting covariance matrix between nodes (say n nodes in total) is a Gram matrix and therefore defines a valid kernel on the graph. It is obtained by inverting an n × n matrix depending on the costs assigned to the arcs. In the same spirit, a betweenness score is also defined, measuring the expected number of times a node occurs on a path. The proposed measures could be used for various graph mining tasks such as computing betweenness centrality, semi-supervised classification of nodes, visualization, etc., as shown in Section 7.
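The effect of a Boltzmann distribution on paths, as described in the abstract above, can be illustrated with a toy enumeration in Python. This sketch is a deliberate simplification and not the method of the paper: it considers only simple (cycle-free) paths between one pair of nodes, uses uniform reference weights, and enumerates paths explicitly instead of inverting a matrix. The names `simple_paths`, `boltzmann_over_paths`, `theta`, and the example graph are hypothetical.

```python
import math


def simple_paths(costs, source, target, path=None):
    """Yield every simple path from source to target as a list of nodes."""
    path = [source] if path is None else path
    if source == target:
        yield list(path)
        return
    for nxt in costs.get(source, {}):
        if nxt not in path:  # never revisit a node: simple paths only
            path.append(nxt)
            yield from simple_paths(costs, nxt, target, path)
            path.pop()


def boltzmann_over_paths(costs, source, target, theta):
    """Return (path, total_cost, probability), probability ∝ exp(-theta * cost)."""
    paths = list(simple_paths(costs, source, target))
    totals = [sum(costs[u][v] for u, v in zip(p, p[1:])) for p in paths]
    lowest = min(totals)  # shift costs for numerical stability
    weights = [math.exp(-theta * (c - lowest)) for c in totals]
    norm = sum(weights)
    return [(p, c, w / norm) for p, c, w in zip(paths, totals, weights)]


# Toy directed graph: costs[u][v] is the cost of arc u -> v (illustrative data).
costs = {
    "a": {"b": 1.0, "c": 4.0},
    "b": {"c": 1.0, "d": 5.0},
    "c": {"d": 1.0},
    "d": {},
}

result = boltzmann_over_paths(costs, "a", "d", theta=1.0)

# Betweenness-like score: expected number of times each node occurs on a path.
visits = {}
for path, _, prob in result:
    for node in path:
        visits[node] = visits.get(node, 0.0) + prob
```

Increasing `theta` concentrates the probability mass on the cheapest path, while a small `theta` spreads it more evenly across all enumerated paths.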
