Deep Graphical Models and Inference Strategies for the Analysis of Networks Comprising Textual Edges
In this manuscript, we develop new methodologies for clustering the nodes of networks whose edges may carry text. We aim to provide end-to-end modelling that uses both the texts exchanged between nodes and the network topology to extract the salient information at the core of a dataset. This work is motivated by questions arising in fields such as the social sciences: gathering and understanding large datasets from social media may help researchers answer questions about, for instance, how a policy is perceived. We adopt a probabilistic modelling framework to classify nodes and analyse texts. Among other things, such models quantify the uncertainty of our estimates and rest on a framework that has historically proven robust. Furthermore, to benefit from the efficiency of deep neural networks at encoding complex types of data, our methodologies strive to incorporate them within this probabilistic framework. Several analyses of real data are provided. In particular, during the several months preceding the 2017 French presidential election, every publication on one social media platform involving one of the candidates, together with its republications, was gathered to form a database. Our methodology helps to understand the groups present on this social media platform as well as the way interactions took place during this period. Python implementations of the methodologies developed in this manuscript have been made public.