
Title
Statistical methods to detect the missing biology and missing heritability of genome-wide association studies
Speaker
Judy Zhong, Sage Bionetworks
Abstract
Genome-wide association studies (GWAS) have achieved great success in identifying common genetic variants associated with common human diseases. However, GWAS do not necessarily lead directly to the gene or genes in a given locus that are associated with disease, and they do not typically inform the broader context in which the disease genes operate, so they provide limited insight into the mechanisms driving disease. Further, the amount of genetic variation explained by GWAS for a given disease is most often substantially less than the heritability estimated for that disease. Here I present two lines of research aimed at finding the "missing biology" and the "missing heritability". The first method uses a permutation approach to leverage information from studies of the genetics of gene expression and identifies biological pathways enriched for expression-associated genetic loci. The second approach evaluates the potential for discovering genes associated with a complex disease trait by deep sequencing of human exons. We used a rigorous population-genetic simulation framework to characterize the allele frequency spectrum of rare variants involved in complex traits and to motivate statistical strategies for identifying such variants. We developed a likelihood-based flexible p-value threshold test that can effectively combine evidence of association over the variants in a gene. This method optimizes the gene-specific threshold with an efficient permutation algorithm, making it robust to the varying properties of individual genes. The proposed method was extensively evaluated using population-genetic simulations and trait simulations inspired by the Women's Health Initiative Sequencing Project. Compared with several existing methods, the proposed test showed higher power and greater robustness across a range of trait properties and in the presence of genotyping errors.
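
To make the idea of a gene-specific p-value threshold tuned by permutation more concrete, here is a minimal sketch in Python. It is not the speaker's likelihood-based test. As a stand-in it uses a generic truncated-product style statistic (the sum of -log p over variants whose p-value falls below a threshold), takes the best threshold from a small grid, and calibrates that choice with a single set of phenotype permutations. The function names, the threshold grid, and the correlation-based per-variant p-value are all assumptions made for illustration.

```python
import math
import random


def variant_pvalues(genotypes, phenotype):
    """Two-sided p-value per variant from a correlation z-score.

    Assumption: a normal approximation (z = r * sqrt(n)) is good enough,
    because the final gene-level p-value is calibrated by permutation.
    """
    n = len(phenotype)
    mean_y = sum(phenotype) / n
    var_y = sum((y - mean_y) ** 2 for y in phenotype)
    pvals = []
    for j in range(len(genotypes[0])):
        g = [row[j] for row in genotypes]
        mean_g = sum(g) / n
        var_g = sum((x - mean_g) ** 2 for x in g)
        if var_g == 0 or var_y == 0:
            pvals.append(1.0)  # no variation, so no evidence of association
            continue
        cov = sum((x - mean_g) * (y - mean_y) for x, y in zip(g, phenotype))
        z = cov / math.sqrt(var_g * var_y) * math.sqrt(n)
        pvals.append(math.erfc(abs(z) / math.sqrt(2)))
    return pvals


def threshold_stats(pvals, thresholds):
    """For each threshold t, sum -log(p) over variants with p <= t."""
    return [
        sum(-math.log(max(p, 1e-300)) for p in pvals if p <= t)
        for t in thresholds
    ]


def flexible_threshold_test(genotypes, phenotype,
                            thresholds=(0.01, 0.05, 0.2, 1.0),
                            n_perm=200, seed=0):
    """Gene-level permutation p-value, optimized over the threshold grid.

    Row 0 holds the observed statistics; the remaining rows come from
    shuffled phenotypes. The same permutations are reused both to rank each
    threshold and to calibrate the minimum over thresholds.
    """
    rng = random.Random(seed)
    rows = [threshold_stats(variant_pvalues(genotypes, phenotype), thresholds)]
    y = list(phenotype)
    for _ in range(n_perm):
        rng.shuffle(y)
        rows.append(threshold_stats(variant_pvalues(genotypes, y), thresholds))

    total = len(rows)
    min_p = [1.0] * total
    for k in range(len(thresholds)):
        col = [row[k] for row in rows]
        for i in range(total):
            # Empirical p-value of row i at threshold k.
            p_ik = sum(1 for s in col if s >= col[i]) / total
            min_p[i] = min(min_p[i], p_ik)

    # How often is a row's best threshold at least as extreme as the observed one?
    return sum(1 for m in min_p if m <= min_p[0]) / total


if __name__ == "__main__":
    rng = random.Random(42)
    # Toy data (hypothetical): 100 individuals, 8 rare variants,
    # with only variant 0 affecting the trait.
    geno = [[1 if rng.random() < 0.05 else 0 for _ in range(8)] for _ in range(100)]
    pheno = [rng.gauss(0.0, 1.0) + 1.5 * row[0] for row in geno]
    print("gene-level permutation p-value:", flexible_threshold_test(geno, pheno))
```

Reusing one set of permutations for both steps avoids a nested permutation loop, which is one plausible reading of "efficient" here. Because the threshold is chosen per gene and the choice itself is permuted, the resulting p-value accounts for the search over thresholds.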