BETS: The dangers of selection bias in early analyses of the coronavirus disease (COVID-19) pandemic
*Please be sure to mute upon entry
Qingyuan Zhao, University of Cambridge, Statistical Laboratory, Department of Pure Mathematics and Statistical Mathematics
The coronavirus disease 2019 (COVID-19) has quickly grown from a regional outbreak in Wuhan, China to a global pandemic. Early estimates of the epidemic growth and incubation period of COVID-19 may have been biased due to sample selection. Using detailed case reports from 14 locations in and outside mainland China, we obtained 378 Wuhan-exported cases who left Wuhan before an abrupt travel quarantine. We developed a generative model we call BETS for four key epidemiological events---Beginning of exposure, End of exposure, time of Transmission, and time of Symptom onset (BETS)---and derived explicit formulas to correct for the sample selection. We gave a detailed illustration of why some early and highly influential analyses of the COVID-19 pandemic were severely biased. All our analyses, regardless of which subsample and model were being used, point to an epidemic doubling time of 2 to 2.5 days during the early outbreak in Wuhan. A Bayesian nonparametric analysis further suggests that about 5% of the symptomatic cases may not develop symptoms within 14 days of infection and that men may be much more likely than women to develop symptoms within 2 days of infection. This is based on joint work with Nianqiao Ju, Sergio Bacallado, and Rajen Shah.