 
Title
Regularization and Variable Selection for Multiclass Support Vector Machine and Varying-Coefficient Models with Applications in Genomics
Speaker
Lifeng Wang, University of Pennsylvania
Abstract
Microarray technology has been widely used in biomedical research to study complex biological systems and disease processes. Because microarray data are high-dimensional, new computational and statistical methods, together with rigorous theoretical development, are required to draw valid inferences from them. In this talk, I will present two problems that arise in this setting and the methods and theory that we have developed to address them. The first problem is multiclass classification and variable selection in the presence of a very large number of genes. We have proposed a regularized multiclass support vector machine, which performs classification and variable selection simultaneously through an L1-norm penalized sparse representation. A statistical learning theory is developed to quantify the generalization error, where the number of variables is allowed to grow much faster than the sample size. The second problem is the identification of transcription factors involved in gene regulation during a given biological process, based on time-course gene expression data. To capture the dynamic behavior of gene expression, we propose a nonparametric varying-coefficient model for such data and present a regularized estimation procedure for variable selection that combines basis function approximations with the smoothly clipped absolute deviation (SCAD) penalty. The proposed procedure simultaneously selects significant variables with time-varying effects and estimates the nonzero smooth coefficient functions. Under suitable conditions, we have established the theoretical properties of our procedure, including consistency in variable selection and the oracle property in estimation. I will illustrate these methods with simulations and real data examples.
Meet the speaker in Room 212 Cockins Hall at 4:30 p.m. Refreshments will be served.