Representation, optimization and generalization in deep learning
Peter Bartlett, UC Berkeley
Deep neural networks have improved state-of-the-art performance for prediction problems across an impressive range of application areas, and they have become a central ingredient in AI systems. This talk considers factors that affect their performance, describing some recent results in two directions.

First, we investigate the impact of depth on representation and optimization properties of these networks. We focus on deep residual networks, which have been widely adopted for computer vision applications because they exhibit fast training, even for very deep networks. We show that as the depth of these networks increases, they are able to represent a smooth invertible map with a simpler representation at each layer, and that this implies a desirable property of the functional optimization landscape that arises from regression with deep function compositions: stationary points are global optima.

Second, we consider the generalization behavior of deep networks, that is, how their performance on training data compares to predictive accuracy. In particular, we aim to understand how to measure the complexity of functions computed by these networks. For multiclass classification problems, we present a margin-based generalization bound that scales with a certain margin-normalized "spectral complexity," involving the product of the spectral norms of the weight matrices in the network. We show how the bound gives insight into the observed performance of these networks in practical problems.

Joint work with Steve Evans and Phil Long, and with Matus Telgarsky and Dylan Foster.
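As a rough illustration of the spectral-complexity idea described above, the sketch below computes the product of the spectral norms (largest singular values) of a network's weight matrices with NumPy. This is only the leading factor in the abstract's complexity measure; the full margin bound in the talk involves additional per-layer terms not shown here, and the layer shapes are hypothetical.

```python
import numpy as np

def spectral_norm_product(weight_matrices):
    """Product of the spectral norms of the given weight matrices.

    For a matrix W, np.linalg.norm(W, 2) returns its largest singular
    value. This product is the leading factor in the margin-normalized
    "spectral complexity" mentioned in the abstract (the full bound has
    further correction terms, omitted in this sketch).
    """
    return float(np.prod([np.linalg.norm(W, 2) for W in weight_matrices]))

# Example with three randomly initialized layers (hypothetical shapes).
rng = np.random.default_rng(0)
layers = [
    rng.standard_normal((64, 32)),
    rng.standard_normal((32, 32)),
    rng.standard_normal((10, 32)),
]
print(spectral_norm_product(layers))
```

In a margin bound, this quantity would be divided by the classification margin achieved on the training data, so that rescaling the network leaves the complexity measure unchanged.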
Bio: Peter Bartlett is a professor in the Computer Science Division and the Department of Statistics, and Associate Director of the Simons Institute for the Theory of Computing, at the University of California, Berkeley. His research interests include machine learning and statistical learning theory. He is the co-author, with Martin Anthony, of the book Neural Network Learning: Theoretical Foundations. He has served as an associate editor of the journals Bernoulli, the Journal of Artificial Intelligence Research, the Journal of Machine Learning Research, the IEEE Transactions on Information Theory, Machine Learning, and Mathematics of Control, Signals, and Systems, and as program committee co-chair for COLT and NIPS. He was awarded the Malcolm McIntosh Prize for Physical Scientist of the Year in Australia in 2001, was chosen as an Institute of Mathematical Statistics Medallion Lecturer in 2008, and became an IMS Fellow and Australian Laureate Fellow in 2011. He was elected to the Australian Academy of Science in 2015.
Note: Seminars are free and open to the public.