
Title
Visualization analysis in a high-dimensional space of cross-correlations
Speaker
Kerby Shedden, University of Michigan
Abstract
Suppose we have two complementary sets of high-dimensional measurements made on a common set of units. A familiar problem is to consider the correlations that result after reducing each set of measurements to a scalar summary. Canonical Correlation Analysis (CCA) provides a useful framework, but its methods for visualization, regularization, model selection, and formal inference are much less well developed than those for regression. Of particular interest is the relationship among the data reductions that give reasonably high correlations, a question that turns out to be much more interesting than the analogous question in regression analysis. We will present some of our recent work in this area through an analysis of the EPA's new "ToxCast" high-throughput screening dataset. Problems to be discussed include: unbiased estimation of the dominant canonical correlation, exploration of the cross-correlation space in the neighborhood of a scientifically motivated data reduction, dimension reduction of a cross-correlation space, visualizing paths through cross-correlation space, visualizing level sets in cross-correlation space, and contrasting conditional and unconditional correlations.
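
For readers unfamiliar with the setup, the following is a minimal sketch of the basic objects mentioned in the abstract: the correlation produced by one particular pair of scalar reductions, and the dominant sample canonical correlation that bounds it. It is not the speaker's method. The data are simulated, and the function names `canonical_correlations` and `reduction_correlation` are hypothetical names chosen for illustration. The sketch assumes more units than variables and full-rank data matrices.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Sample canonical correlations of X (n x p) and Y (n x q), largest first."""
    # Center each column so that inner products correspond to covariances.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the column spaces of the centered data.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # Singular values of Qx' Qy are the canonical correlations.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    # Clip to guard against values slightly outside [0, 1] from rounding.
    return np.clip(s, 0.0, 1.0)

def reduction_correlation(X, Y, a, b):
    """Correlation between the scalar summaries X @ a and Y @ b."""
    return np.corrcoef(X @ a, Y @ b)[0, 1]

# Simulated data (an assumption for illustration): one shared latent factor z
# drives the first variable in each measurement set.
rng = np.random.default_rng(0)
n, p, q = 200, 5, 4
z = rng.standard_normal(n)
X = rng.standard_normal((n, p))
Y = rng.standard_normal((n, q))
X[:, 0] += z
Y[:, 0] += z

rho = canonical_correlations(X, Y)

# One specific data reduction: keep only the first variable of each set.
a = np.zeros(p); a[0] = 1.0
b = np.zeros(q); b[0] = 1.0
r = reduction_correlation(X, Y, a, b)

print("canonical correlations:", rho)
print("correlation for the chosen reduction:", r)
```

Any pair of linear reductions `(a, b)` yields a sample correlation no larger in absolute value than `rho[0]`. The set of pairs whose correlation is close to that maximum is the kind of region in cross-correlation space that the abstract proposes to explore.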