
Title
Second-generation exponential-family models of networks: Scaling up
Speaker
Michael Schweinberger, Penn State University
Abstract
In this talk, I consider two important problems in the statistical analysis of networks (e.g., social networks), both of which arise from a lack of scalability.
The first problem is model degeneracy in exponential-family models with transitivity and other forms of dependence, which has obstructed the statistical analysis of networks more than anything else. Model degeneracy is rooted in the lack of scalability of exponential-family models. I introduce a novel class of second-generation exponential-family models that addresses the lack of scalability of first-generation models and reduces the problem of model degeneracy, while retaining the scientific appeal of first-generation models: the simplicity and flexibility to model interesting forms of dependence, including transitivity. I discuss a Bayesian approach based on auxiliary-variable Markov chain Monte Carlo methods and demonstrate that second-generation exponential-family models make it possible to model transitivity without inducing model degeneracy.
The second problem I consider is the lack of scalability of statistical algorithms. I introduce a novel generalized variational EM algorithm that increases scalability by multiple orders of magnitude. The algorithm takes advantage of the sparsity of networks, convenient convexity properties of exponential-family models, and minorization-maximization (MM) algorithms. I apply it to a World Wide Web network with more than 131,000 nodes and 17 billion edge variables.
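
As a rough check on the size quoted above, the following minimal sketch counts the edge variables of a network with 131,000 nodes. It assumes one binary edge variable per ordered pair of distinct nodes (a directed network without self-loops), which is an inference from the figures in the abstract and not something the abstract states. The function name `edge_variable_count` is hypothetical.

```python
def edge_variable_count(num_nodes: int, directed: bool = True) -> int:
    """Count potential edges among num_nodes nodes, excluding self-loops.

    Assumption (not stated in the abstract): each pair of distinct nodes
    carries one edge variable, counted per ordered pair if directed and
    per unordered pair otherwise.
    """
    ordered_pairs = num_nodes * (num_nodes - 1)
    return ordered_pairs if directed else ordered_pairs // 2


n = 131_000  # lower bound on the node count quoted in the abstract

directed_count = edge_variable_count(n, directed=True)
undirected_count = edge_variable_count(n, directed=False)

print(f"directed:   {directed_count:,}")    # about 17.2 billion
print(f"undirected: {undirected_count:,}")  # about 8.6 billion
```

Under this assumption, only the directed count is of the same order as the 17 billion edge variables quoted, which suggests that the Web network is treated as directed. It also illustrates why sparsity matters: the number of edge variables grows quadratically in the number of nodes.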