
Title
The Secondary Analysis of Genetic Data with Applications to Association and Linkage
Speaker
William Stewart, Nationwide Children's Hospital
Abstract
We live in a data-dependent world where complexity has become the norm, and where an unprecedented and increasing amount of computing power lies literally at our fingertips. However, this amazing computational resource has been largely squandered by the mathematical genetics community, which, out of convenience and for survival, tends to favor simplicity over complexity.
Therefore, to better utilize the available computing power in a cost-effective way, we have developed several new statistical methods for the secondary analysis of existing high-throughput genetic data. These programs (EAGLET, Haplodrop, POPFAM, and Hemizyg) all represent a unique blend of computer languages, statistical theory, stochastic processes, and deterministic algorithms. EAGLET increases the power and precision of a dense SNP linkage scan; POPFAM combines family-based and case-control information to detect genetic variants that are associated with disease; and Hemizyg identifies carriers of short, rare deletions.
All of the programs that we develop are freely accessible from the web, with user-friendly documentation and tutorials that are continually updated and enhanced as new software and increased functionality become available.
Bill Stewart received his PhD in Statistics with an emphasis in Statistical Genetics at the University of Washington under the advisement of Dr. Elizabeth A. Thompson. He then began a post-doc with Dr. Mike Boehnke and Dr. Goncalo Abecasis at the University of Michigan. Most recently, as an assistant professor of biostatistics at Columbia University, he directed the statistical analysis of several candidate gene and genome-wide studies (both linkage and association). Throughout his career, he has written and developed software for the improved analysis of genetic data with a keen interest in modeling new and/or computationally challenging data. For example, he has developed improved methods for mapping genetic risk variants for complex disorders (e.g. genes and CNVs related to Alzheimer's disease and bipolar disorder, as well as novel variants influencing hypertension). He also has a general interest in analyzing genetic sequence data, deterministic math models, genetic pathways, and several problems that arise in computational molecular biology. Most of his methods employ a rich variety of deterministic and stochastic algorithms, Monte Carlo sampling techniques, and innovative optimization procedures to estimate genetic quantities or test genetic hypotheses of interest.