Learning Information Cascades in Social Networks (Technicolor Paris, Nidhi Hegde)

Social networks are increasingly used to diffuse information. Indeed, many people rely on their social contacts to receive new content (videos or music), news updates, event notifications, etc. Naturally, a network of influence emerges in which some users are more influential than others. We define a user's influence as the impact of the items that user publishes. Impact is measured by how intensely a published item propagates in the network: the number of people reached (or infected) and the time needed to reach "frenzied" diffusion (when the item is very popular, possibly only for a short duration).
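To make the two impact measures concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the proposal does not define "frenzied" diffusion formally, so the sketch assumes it means the first moment at which at least `burst_size` infections fall within a sliding window of length `window`. The function name `cascade_impact` and both parameters are hypothetical.

```python
def cascade_impact(infection_times, window=1.0, burst_size=3):
    """Return (reach, time_to_burst) for one cascade.

    reach: number of users infected.
    time_to_burst: delay from the first infection until `burst_size`
    infections fall within `window` time units (an ASSUMED stand-in for
    "frenzied" diffusion), or None if that never happens.
    """
    times = sorted(infection_times)
    reach = len(times)
    time_to_burst = None
    start = 0  # index of the oldest infection still inside the window
    for end, t in enumerate(times):
        # Drop infections that are older than the window.
        while t - times[start] > window:
            start += 1
        if end - start + 1 >= burst_size:
            time_to_burst = t - times[0]
            break
    return reach, time_to_burst


# Hypothetical infection times (arbitrary time units) for one item.
reach, time_to_burst = cascade_impact([0.0, 2.0, 5.0, 5.2, 5.5, 5.9, 9.0])
```

Other definitions of the burst (for example, a peak in the infection rate) would fit the same interface; the threshold-based version is chosen here only for simplicity.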

The identification of such influential nodes has been considered from many perspectives; there is much prior work on rumor centrality and sources of infection, for instance. Such analysis is, however, at a global level. We are interested in characterizing the influence around a given user, with the aim of creating a social profile of that user. Including such social information about a user can enable richer content recommendations. A first step is to characterize how users participate in information cascades. An important aspect of this problem is the identification of information pathways in social networks, which represent the paths taken by cascades.

There is some recent work on learning diffusion cascades from a dataset of infection times. This work, however, presumes that all content follows the same path. Since the form of a cascade depends on user interests, it is more reasonable to suppose that cascades take different forms depending on the type of content. A video of Obama's speech might take a different path than one of LOL cats dancing.
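As a rough illustration of why content type matters, the following Python sketch counts, separately for each content type, how often one user is infected shortly after another. This is a naive co-occurrence heuristic chosen for brevity, not the method of the prior work mentioned above nor the algorithm the internship aims to produce. The function `infer_pathways`, the `window` parameter, and the toy data are all hypothetical.

```python
from collections import Counter, defaultdict


def infer_pathways(cascades_by_type, window=1.0):
    """Count candidate edges (u, v) per content type.

    cascades_by_type maps a content type to a list of cascades; each
    cascade maps a user to that user's infection time. An edge (u, v) is
    counted when v is infected after u and within `window` time units
    (an ASSUMED proxy for "u may have passed the item to v").
    """
    edge_counts = defaultdict(Counter)
    for content_type, cascades in cascades_by_type.items():
        for infection_times in cascades:
            # Order users by infection time.
            ordered = sorted(infection_times.items(), key=lambda kv: kv[1])
            for i, (u, t_u) in enumerate(ordered):
                for v, t_v in ordered[i + 1:]:
                    if t_v - t_u > window:
                        break  # later users are even further away in time
                    if t_v > t_u:
                        edge_counts[content_type][(u, v)] += 1
    return edge_counts


# Toy data: two content types spreading over different users.
cascades_by_type = {
    "politics": [
        {"alice": 0.0, "bob": 0.5, "carol": 0.9},
        {"alice": 0.0, "bob": 0.4},
    ],
    "humor": [
        {"dave": 0.0, "carol": 0.3, "bob": 2.0},
        {"dave": 0.0, "carol": 0.6},
    ],
}
pathways = infer_pathways(cascades_by_type, window=1.0)
```

In this toy data the pair (alice, bob) recurs only for "politics" and (dave, carol) only for "humor", so pooling all cascades into a single graph would blur two distinct pathways.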

The objective of this internship will be to create models of multi-type diffusion, in particular by learning these diffusion pathways through the analysis of data. The problem is a specific case of learning multiple graphical models. We will have access to a large dataset from a video service where users had the ability to send each other recommendations. We will also have access to ongoing experiments in deploying such a service. Tools from statistical learning will be used to analyze the dataset. Once a methodology for discriminating between the various types of pathways is conceived and tested on the dataset, the next step will be to define and exploit models of influence. Specifically, we will design an adaptive content filter that learns a user's influence and filters (recommends and presents) content accordingly. The objective is an adaptive algorithm that is robust to a user's changing influence.

The expected outcome of the internship is an algorithm for learning multi-type cascades in social networks and, if time permits, the design of an adaptive content filtering algorithm.