Seminar: Chris Hans

Statistics Seminar
October 20, 2011
All Day
209 W. Eighteenth Ave. (EA), Room 170

Title

Penalized Regression via Bayesian Orthant Priors

Speaker

Chris Hans, The Ohio State University

Abstract

Penalized optimization procedures are sometimes described as Bayesian approaches to parameter estimation, prediction, and variable selection. In this talk I examine the strength of this connection in the context of the lasso (Tibshirani, 1996) and the elastic net (Zou and Hastie, 2005). Motivated by the proliferation of these (and related) approaches to penalized optimization, I introduce a new class of prior distributions—the "orthant normal" distribution—for the regression coefficients in a Bayesian regression model and show that this prior gives rise to the lasso and elastic net point estimates as the posterior mode. By providing a complete characterization of the prior distribution, I allow for model-based inference that moves beyond exclusive use of the posterior mode, including coherent Bayesian prediction and formal Bayesian model comparison. In contrast to penalized optimization procedures, where values for the penalty parameters are often selected via cross-validation, the Bayesian approach allows for uncertainty about these parameters to be explicitly included in the model. I show that the orthant normal distribution has a scale-mixture of normals representation, providing additional insight into the particular form of shrinkage employed by the elastic net. Posterior inference is shown to be easily achieved via MCMC methods. Finally, I generalize the basic orthant normal distribution to allow for a priori dependence in the regression coefficients, yielding a framework for constructing new penalties. In particular, I discuss an approach for incorporating prior information about dependence structure in the covariates that resembles Zellner's g-prior but that also allows for lasso-like shrinkage.
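The scale-mixture-of-normals idea mentioned in the abstract can be illustrated in the simpler lasso setting: the Laplace (double-exponential) prior that yields the lasso estimate as a posterior mode is an exponential scale mixture of normals (Andrews and Mallows, 1974; used for the Bayesian lasso by Park and Casella, 2008). The sketch below is only this lasso analogue, not the orthant normal mixture of the talk, and the variable names and the choice of λ = 1 are illustrative.

```python
import numpy as np

# Sketch of the scale-mixture-of-normals representation in the lasso case:
# if tau ~ Exponential(mean = 2 / lambda^2) and beta | tau ~ N(0, tau),
# then marginally beta ~ Laplace(scale = 1 / lambda).
rng = np.random.default_rng(0)
lam = 1.0          # penalty / prior rate parameter (illustrative value)
n = 500_000        # Monte Carlo sample size

# Draw the mixing variances, then the conditionally normal coefficients.
tau = rng.exponential(scale=2.0 / lam**2, size=n)
beta = rng.normal(loc=0.0, scale=np.sqrt(tau))

# The marginal Laplace(scale = 1/lambda) distribution has mean 0 and
# variance 2 / lambda^2, so the simulated draws should match those moments.
print(beta.mean(), beta.var())  # approximately 0 and 2/lam**2
```

Simulating the hierarchy and checking the marginal moments is one quick way to see why such representations make Gibbs-style MCMC convenient: conditional on the mixing variances, the coefficients are Gaussian.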