
Dissertation Defense: Sean Tomlin
Title: Estimating Causal Effects from Observational Data in Quasi-Experimental Designs
Abstract: Observational studies are essential for evaluating health policies and interventions, especially when randomized trials are not feasible. However, because researchers cannot control how individuals are assigned to treatment or control groups, these studies risk bias from unmeasured confounding. To address this, quasi-experimental methods such as the difference-in-differences (DiD) design are widely used. DiD compares changes in outcomes over time between treated and untreated groups and is often applied to survey or administrative data with repeated measures.
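The basic two-group, two-period DiD comparison described above can be sketched in a few lines. The toy numbers below are purely illustrative (not MEPS data), and the function is the textbook 2x2 estimator, not the multi-period design developed in the dissertation:

```python
# Minimal two-period, two-group difference-in-differences sketch.
# DiD effect = (change in treated group) - (change in control group),
# which removes any trend shared by both groups.

def did_estimate(treated_pre, treated_post, control_pre, control_post):
    def mean(xs):
        return sum(xs) / len(xs)
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change

# Toy outcome data (e.g., annual visit counts) for each group and period.
treated_pre = [2.0, 3.0, 4.0]
treated_post = [5.0, 6.0, 7.0]   # treated rose by 3 on average
control_pre = [2.0, 3.0, 4.0]
control_post = [3.0, 4.0, 5.0]   # controls rose by 1 (the common trend)

print(did_estimate(treated_pre, treated_post, control_pre, control_post))  # → 2.0
```

Under the parallel trends assumption, the control group's change (here, +1) stands in for what the treated group would have experienced without treatment, leaving +2 as the estimated effect.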
This dissertation addresses several modern challenges in using quasi-experimental methods to estimate causal effects. The first part focuses on the use of matching in multi-period DiD studies. Matching individuals on their pre-intervention outcomes is a common practice, but its statistical implications are not fully understood. This work introduces the concept of counterpart statistics for matched DiD designs and proposes a Bayesian model averaging approach to improve robustness. The method is applied to four years of longitudinal data from the Medical Expenditure Panel Survey (MEPS) to estimate the causal effect of gaining health insurance on overall healthcare utilization.
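Matching on pre-intervention outcomes, as studied in this part, can be illustrated with a toy 1:1 nearest-neighbor match on a single pre-period outcome. All names and numbers are hypothetical, and this greedy single-covariate version is far simpler than the matched DiD designs the dissertation analyzes:

```python
# Toy 1:1 greedy nearest-neighbor matching on one pre-intervention outcome.
# A sketch only; real matched DiD designs use multiple pre-periods,
# covariates, and more careful matching algorithms.

def match_on_pre_outcome(treated, controls):
    """Pair each treated unit with the closest unused control.

    treated, controls: lists of (id, pre_outcome) tuples.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(controls)  # control id -> pre-period outcome
    pairs = []
    for tid, y_pre in treated:
        best = min(available, key=lambda cid: abs(available[cid] - y_pre))
        pairs.append((tid, best))
        del available[best]    # each control is used at most once
    return pairs

treated = [("t1", 2.0), ("t2", 5.0)]
controls = [("c1", 1.9), ("c2", 5.2), ("c3", 9.0)]
print(match_on_pre_outcome(treated, controls))  # → [('t1', 'c1'), ('t2', 'c2')]
```

The subtlety the dissertation examines is that conditioning on pre-period outcomes in this way changes the statistical behavior of the subsequent DiD comparison, which motivates the counterpart statistics and model averaging developed there.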
The second part addresses zero-inflated outcomes, which frequently occur in healthcare utilization data where many individuals report no use of certain services (so-called structural zeros). Standard regression models often fail to account for this data structure adequately. Moreover, formal identification of causal effects on both the extensive margin (the likelihood of any use) and the intensive margin (the amount of use among users) under parallel trends assumptions remains underdeveloped. This work introduces hurdle models within a nonlinear DiD framework to estimate these two margins. The approach is illustrated using pooled two-year panels from the MEPS.
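The two margins a hurdle model separates can be sketched directly from a sample of counts. The toy data below are invented; a real hurdle model would fit, for example, a logit for the hurdle and a truncated count model above it, rather than taking raw sample proportions:

```python
# Sketch of the two estimands a hurdle model targets:
# extensive margin P(Y > 0) and intensive margin E[Y | Y > 0].

def two_margins(counts):
    users = [y for y in counts if y > 0]
    extensive = len(users) / len(counts)                    # probability of any use
    intensive = sum(users) / len(users) if users else 0.0   # mean use among users
    return extensive, intensive

visits = [0, 0, 0, 2, 4, 6]  # many zeros, as in utilization data
print(two_margins(visits))   # → (0.5, 4.0)
```

A policy such as gaining insurance can move these two margins differently (e.g., more people seeking any care versus existing users consuming more), which is why estimating them separately within a nonlinear DiD framework is useful.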
The third part examines retrospective causal inference using survey data. Many surveys are designed for descriptive purposes, not causal analysis. Limitations include missing covariates and sampling designs that depend on exposure. This work formalizes common sources of bias and develops a sensitivity analysis method that incorporates survey-bootstrap techniques. The approach is demonstrated using a simulated example.
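The bootstrap component of the sensitivity analysis can be illustrated with a minimal nonparametric bootstrap confidence interval. This toy version resamples observations with equal probability; the survey-bootstrap techniques used in the dissertation additionally respect the sampling design (weights, strata, clusters), which this sketch omits:

```python
import random

# Minimal percentile bootstrap CI for a statistic of a sample.
# Equal-probability resampling only; a survey bootstrap would resample
# in a way that mirrors the survey's weights and stratification.

def bootstrap_ci(sample, stat, n_boot=2000, alpha=0.05, seed=0):
    rng = random.Random(seed)
    reps = sorted(
        stat(rng.choices(sample, k=len(sample))) for _ in range(n_boot)
    )
    lo = reps[int((alpha / 2) * n_boot)]
    hi = reps[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

data = [0, 0, 1, 2, 3, 5, 8]           # toy utilization counts
mean = lambda xs: sum(xs) / len(xs)
print(bootstrap_ci(data, mean))         # prints a (lower, upper) interval
```

In a sensitivity analysis, an interval like this would be recomputed under varying assumptions about the unmeasured bias, showing how strong the bias would have to be to overturn the conclusion.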
Overall, this work clarifies key misconceptions in causal analysis using quasi-experiments and equips practitioners with robust tools to tackle common challenges in health services and outcomes research.
Advisor: Bo Lu