
Title
On Semi-Supervised Joint Trained Elastic Net
Speaker
Mark Culp, West Virginia University
Abstract
Most supervised linear regression techniques fit their parameters to the available training features and responses (referred to as labeled data) and then use the fitted parameters to predict new observations (referred to as unlabeled data). The supervised approach is arguably most successful when the labeled and unlabeled data come from the same distribution. Many practical circumstances do not afford such an assumption, yet in many cases the unlabeled data are available at training time. Moreover, it is well understood that the variability of predictions on observations outside the range of the training data is quite high for linear techniques, especially in high-dimensional settings. Supervised approaches are at an inherent disadvantage because they do not account for this information.
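As a point of reference beyond the abstract itself (the notation $X_L$, $y_L$ for the labeled design matrix and responses, and $x_0$ for a new point, is introduced here for exposition), the extrapolation issue can be made concrete for ordinary least squares: the variance of the prediction at $x_0$ is

\[
\operatorname{Var}\big(\hat{y}(x_0)\big) \;=\; \sigma^2\, x_0^\top (X_L^\top X_L)^{-1} x_0,
\]

which grows as $x_0$ moves away from the bulk of the labeled rows of $X_L$. The supervised elastic net of Zou and Hastie (2005) tempers variance through the penalized criterion

\[
\hat{\beta} \;=\; \arg\min_{\beta}\; \|y_L - X_L\beta\|_2^2 \;+\; \lambda_1\|\beta\|_1 \;+\; \lambda_2\|\beta\|_2^2,
\]

but its penalties are tuned on the labeled data alone, without reference to where the unlabeled observations lie.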
In this talk, we derive the joint trained elastic net, which addresses this issue through semi-supervised learning. In semi-supervised learning, one incorporates the full labeled/unlabeled feature data together with the labeled responses in order to improve prediction. We demonstrate geometrically that this approach shrinks the fitted predictions for unlabeled observations along the directions of greatest extrapolation in the unlabeled data; as a result, the variability of these predictions is decreased relative to that of supervised alternatives. Moreover, in high-dimensional settings (large p), the available unlabeled feature data are also shown to be valuable for decorrelation and variable selection. Hence, these two hallmarks of the supervised elastic net extend directly to the semi-supervised learning setting. The practical application of the joint trained elastic net is demonstrated on two challenging data sets.
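To fix ideas, the following is a minimal sketch of one way a joint criterion of this flavor could be minimized by alternating steps; the specific objective, the weight gamma, and the shrinkage update are assumptions made for exposition and are not the speaker's exact formulation. The sketch fits an elastic net on the labeled data stacked with pseudo-responses u for the unlabeled rows, then shrinks u toward zero, which damps extrapolated predictions:

# Illustrative sketch only: a generic semi-supervised elastic net via
# alternating minimization. The joint criterion, the weight gamma, and
# the shrinkage update are assumptions, not the talk's exact method.
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(0)
n_l, n_u, p = 40, 200, 10
X_l = rng.normal(size=(n_l, p))                  # labeled features
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]                 # sparse true coefficients
y_l = X_l @ beta_true + rng.normal(scale=0.5, size=n_l)
X_u = 2.0 * rng.normal(size=(n_u, p))            # unlabeled rows, wider spread

gamma = 0.5                                      # unlabeled weight (assumed)
model = ElasticNet(alpha=0.1, l1_ratio=0.5)
u = np.zeros(n_u)                                # pseudo-responses, start fully shrunk

for _ in range(20):
    # Elastic net fit on the labeled rows stacked with (X_u, u),
    # down-weighting the unlabeled rows by gamma.
    X = np.vstack([X_l, X_u])
    y = np.concatenate([y_l, u])
    w = np.concatenate([np.ones(n_l), gamma * np.ones(n_u)])
    model.fit(X, y, sample_weight=w)
    # Shrink the fitted unlabeled predictions toward 0; this damps the
    # fit in the directions where X_u extrapolates the labeled data.
    u = gamma / (1.0 + gamma) * model.predict(X_u)

print("estimated coefficients:", np.round(model.coef_, 2))

The shrinkage factor gamma/(1+gamma) is the closed-form minimizer over u of gamma*||u - X_u beta||^2 + ||u||^2, so each pass decreases the assumed joint objective (up to sklearn's internal loss scaling); the choice of that objective is, again, an illustrative assumption.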