Seminar: Arijit Chakrabarti

Statistics Seminar
February 19, 2004
All Day
209 W. Eighteenth Ave. (EA), Room 170

Title

Some Contributions to Infinite and High Dimensional Model Selection Problems

Speaker

Arijit Chakrabarti, Purdue University

Abstract

The problem of selecting a model in an infinite- or high-dimensional setup has been of great interest in recent years. A familiar example where the parameter space is infinite dimensional is the problem of estimating the unknown square-integrable drift function of a Brownian motion in the Gaussian white-noise model. High-dimensional problems typically arise when the number of possible parameters increases with the sample size.
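For concreteness, the Gaussian white-noise model referred to above is usually written as follows (standard notation, not taken from the announcement):

\[ dY(t) = f(t)\,dt + \frac{1}{\sqrt{n}}\,dW(t), \qquad 0 \le t \le 1, \]

where f is the unknown square-integrable drift function, W is a standard Brownian motion, and n plays the role of the sample size.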

Using a complete orthonormal basis of L_2, the unknown drift function in the white-noise model can be represented as an infinite linear combination of the basis functions, the coefficients being the (unknown) Fourier coefficients; the problem thus reduces to one of estimating the vector of Fourier coefficients. Using a simple isometry, this problem can be equivalently recast as that of estimating the (square-summable) mean vector in an infinite-dimensional normal means problem. In this talk, I will show that model selection by the Akaike Information Criterion (AIC) (where, under each model, all but the first finitely many coordinates of the mean vector are assumed to be zero), followed by least squares estimation, achieves the asymptotic minimax rate of convergence (over an appropriate subset of the parameter space) for squared error loss. I will also present a Bayesian model selection rule, followed by Bayes estimates, which achieves the same rate of convergence asymptotically.
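A minimal sketch of the sequence-space reduction described above, in notation of my own choosing rather than the speaker's: expanding the observed path and the drift in an orthonormal basis of L_2 yields the equivalent infinite-dimensional normal means model

\[ y_i = \theta_i + \frac{1}{\sqrt{n}}\, z_i, \qquad z_i \overset{iid}{\sim} N(0,1), \quad i = 1, 2, \ldots, \]

with \theta = (\theta_1, \theta_2, \ldots) square summable. Writing M_k for the model that sets \theta_i = 0 for all i > k, under which the least squares estimates are \hat\theta_i = y_i for i \le k, one standard form of the AIC criterion is

\[ \mathrm{AIC}(k) = n \sum_{i > k} y_i^2 + 2k, \]

and the selected model is the M_k minimizing this quantity.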

It is known that the Bayes Information Criterion (BIC) may be an inappropriate model selection criterion and a poor approximation to integrated likelihoods in some high-dimensional problems. I will present GBIC, a generalization of BIC that approximates the integrated likelihood up to O(1), and a Laplace approximation to the integrated likelihood that is correct up to o(1), in a high-dimensional setup where the observations come from a general exponential family of distributions. Simulation results will be presented showing that GBIC performs much better than BIC, and that the Laplace approximation performs remarkably well in many examples, including some non-exponential-family examples. Finally, I will indicate some areas of application for this approximation method.
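For context, the Laplace argument behind the classical BIC runs as follows (standard background, not specific to the talk): with log-likelihood l_n(\theta) for \theta \in R^p, maximum likelihood estimate \hat\theta, and prior density \pi, a Laplace expansion of the integrated likelihood gives

\[ \int e^{l_n(\theta)}\,\pi(\theta)\,d\theta \;\approx\; e^{l_n(\hat\theta)}\,(2\pi)^{p/2}\,\big|{-\nabla^2 l_n(\hat\theta)}\big|^{-1/2}\,\pi(\hat\theta), \]

so that

\[ \log \int e^{l_n(\theta)}\,\pi(\theta)\,d\theta \;=\; l_n(\hat\theta) - \frac{p}{2}\log n + O(1). \]

BIC keeps only the first two terms; when p grows with n, the error terms dropped by BIC need not remain bounded, which motivates the O(1)-correct GBIC and the o(1)-correct Laplace approximation described above.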

This is joint work with Professor Jayanta K. Ghosh.