Partial Least Squares Regression
Dennis Cook, University of Minnesota
Partial least squares (PLS) regression is a method for fitting a linear regression model to data in which the number of observations need not exceed the number of predictors. Developed about four decades ago, the method is now used across the applied sciences, particularly in Chemometrics, where it originated as an algorithm. And yet in Statistics, PLS seems mostly to be viewed as a fringe method, perhaps worth a mention in passing but rarely more.
We will begin with a brief review of the history of PLS regression, highlighting points that connect with later discussion. We will then turn to a formulation that, in contrast to the algorithmic traditions underlying PLS, is more in the modeling tradition of Statistics. This will set the stage for a subsequent discussion of the high-dimensional behavior of PLS regression. In particular, PLS estimators can in some settings converge at the usual root-n rate, regardless of the relationship between the number of observations and the number of predictors.
It will be emphasized that the philosophy underlying PLS regression differs from the notion of sparsity, which is a foundation for contemporary high-dimensional methods in Statistics. Small examples will be presented to illustrate general points.
Note: Seminars are free and open to the public. Reception to follow.