Learning Influence Representations: Methods and Applications; Apprentissage des représentations d'influence : méthodes et applications
Online influence is the foundation of social networks' effect on our lives, and its impact has been increasing steeply. From viral marketing to political campaigns, and from news adoption to disease transmission, the way we are influenced by others is more prevalent than ever. In this thesis, we address the problem of efficiently learning and analyzing influence representations for a range of relevant graph mining problems.

The first half of the thesis is devoted to influence maximization, an NP-hard combinatorial optimization problem. The aim is to find the nodes in a network that maximize the spread of information, where the spread is typically defined by random influence probabilities and simple diffusion models. To address this, in the first part we devise a node representation learning model based on diffusion cascades, along with an adaptation of a traditional influence maximization algorithm that uses the output of the model. This framework outperforms competitive methods in terms of computational time and the influence of the predicted seeds in cascades of the immediate future.

The second part is devoted to learning how to perform influence maximization. We develop a graph neural network that inherently parameterizes an upper bound of influence estimation, and train it on small simulated graphs. We show experimentally that it provides accurate estimations faster than the alternatives on graphs 10 times larger than those in the training set. Furthermore, we use the model's predictions and representations to propose three new influence maximization methods: an adaptation of Cost Effective Lazy Forward that surpasses the state of the art but incurs significant computational overhead, a Q-learning model that learns to retrieve seeds sequentially, and a submodular function that acts as a proxy for the marginal gain and can be optimized adaptively and greedily with a theoretical guarantee. The last of these strikes the best balance between efficiency and accuracy in our experiments.

In the second half of ...