Title: Simple Solutions to Missing Data Problems
Seminar Speaker: Sophia Rabe-Hesketh, Distinguished Professor at the Berkeley School of Education and the Graduate Group of Biostatistics at the University of California, Berkeley.
It is often believed that multiple imputation, or analysis of all available data via maximum likelihood or Bayesian estimation, is the best solution to missing data problems. Such methods are usually accompanied by a generic missing at random (MAR) statement without elaboration. In this talk, I will show that deleting some of the available data may be necessary to avoid bias, an extreme example being listwise deletion. A new contribution is that we can “make” the missingness process ignorable under certain MAR violations by deleting some values for some variables (Rabe-Hesketh & Skrondal, Psychometrika, 2023). This data-deletion approach has connections with ordered factorization (Mohan & Pearl, JASA, 2021). Simulations demonstrate that bias due to violations of MAR assumptions can be mitigated by data deletion and that conditional independence tests can guide the choice of approach. Furthermore, by inspecting missingness patterns in the data (and possibly changing them by deletion), we can replace the generic MAR statement with explicit conditional independence assumptions that are often much weaker.
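
The listwise-deletion point can be illustrated with a small simulation. The sketch below is not taken from the talk or the cited papers: the linear model, the logistic missingness mechanisms, the coefficient values, and the names `complete_case_slope` and `simulate` are all assumptions chosen for illustration. It regresses y on x using complete cases only, under two mechanisms. In the first, x is missing with a probability that depends on x itself, which violates MAR, yet missingness is independent of y given x, so the complete-case slope should stay close to the true value. In the second, x is missing with a probability that depends on the observed y, which satisfies MAR, yet complete-case analysis selects on the outcome and the slope is generally biased.

```python
import numpy as np


def complete_case_slope(x, y, observed):
    """OLS slope of y on x, using only the rows where `observed` is True."""
    xc, yc = x[observed], y[observed]
    design = np.column_stack([np.ones_like(xc), xc])
    coef, *_ = np.linalg.lstsq(design, yc, rcond=None)
    return float(coef[1])


def simulate(n=200_000, seed=0):
    """Compare complete-case slopes under two illustrative missingness mechanisms.

    Assumed data-generating model (hypothetical): y = 1 + 2*x + noise.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = 1.0 + 2.0 * x + rng.normal(size=n)

    # Mechanism A: x is missing with probability depending on x itself.
    # This violates MAR, but missingness is independent of y given x.
    p_miss_a = 1.0 / (1.0 + np.exp(-2.0 * x))
    observed_a = rng.random(n) > p_miss_a

    # Mechanism B: x is missing with probability depending on y.
    # This is MAR (y is always observed), but deletion selects on the outcome.
    p_miss_b = 1.0 / (1.0 + np.exp(-2.0 * y))
    observed_b = rng.random(n) > p_miss_b

    return {
        "x_dependent": complete_case_slope(x, y, observed_a),
        "y_dependent": complete_case_slope(x, y, observed_b),
    }


if __name__ == "__main__":
    for mechanism, slope in simulate().items():
        print(f"{mechanism}: complete-case slope = {slope:.3f} (true slope 2.0)")
```

Under these assumptions, which analysis is appropriate depends on an explicit conditional independence statement about the missingness indicator, not on a generic MAR label.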