Difference-in-differences with variation in treatment timing
Authors: Andrew Goodman-Bacon
Citation: Goodman-Bacon, Andrew (2021). Difference-in-differences with variation in treatment timing. Journal of Econometrics, 225(2), 254-277.
Abstract: The canonical difference-in-differences (DD) estimator contains two time periods, "pre" and "post", and two groups, "treatment" and "control". Most DD applications, however, exploit variation across groups of units that receive treatment at different times. This paper shows that the two-way fixed effects estimator equals a weighted average of all possible two-group/two-period DD estimators in the data. A causal interpretation of two-way fixed effects DD estimates requires both a parallel trends assumption and treatment effects that are constant over time. I show how to decompose the difference between two specifications, and provide a new analysis of models that include time-varying controls.
Reading Notes
Objective
To show that difference-in-differences (DD) models with variation in treatment timing estimate a weighted average of all possible two-group/two-period (2x2) DD estimators in the data.
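The weighted-average claim can be checked numerically. The sketch below simulates a balanced panel and compares the two-way fixed effects coefficient with a weighted average of the 2x2 DDs. The design (treatment dates, group sizes, outcome process) is hypothetical, and the weights are written in a simplified form (group shares times the variance of treatment within each comparison) that should be checked against the paper before being relied on.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 10
t = np.arange(T)

# Hypothetical design: (first treated period, number of units); inf = never treated
design = [(4.0, 30), (7.0, 20), (np.inf, 50)]
start = np.concatenate([np.full(n_units, s) for s, n_units in design])
D = (t[None, :] >= start[:, None]).astype(float)  # treatment dummy, units x periods

# Simulated outcome: unit effects + common trend + effect growing with exposure + noise
exposure = np.clip(t[None, :] - start[:, None] + 1, 0, None)
y = rng.normal(size=(start.size, 1)) + 0.3 * t + 0.5 * exposure + rng.normal(size=D.shape)


def double_demean(x):
    """Remove unit and time means (valid for a balanced panel)."""
    return x - x.mean(axis=1, keepdims=True) - x.mean(axis=0, keepdims=True) + x.mean()


# Two-way fixed effects coefficient on D via the double-demeaned treatment dummy
D_tilde = double_demean(D)
beta_twfe = (D_tilde * y).sum() / (D_tilde ** 2).sum()


def dd_2x2(treated, control, pre, post):
    """Change in the treated group's mean minus change in the control group's mean."""
    def change(group):
        return y[group][:, post].mean() - y[group][:, pre].mean()
    return change(treated) - change(control)


def n(s):
    """Sample share of the group first treated in period s."""
    return (start == s).mean()


def d(s):
    """Share of periods that group spends treated."""
    return (t >= s).mean()


never = np.isinf(start)
timing = sorted(set(start[~never]))

terms = []  # (label, unnormalised weight, 2x2 DD estimate)
for k in timing:
    # each timing group against the never-treated group
    terms.append((f"{k:.0f} vs never", n(k) * never.mean() * d(k) * (1 - d(k)),
                  dd_2x2(start == k, never, t < k, t >= k)))
for i, k in enumerate(timing):
    for l in timing[i + 1:]:
        mid = (t >= k) & (t < l)
        nn = n(k) * n(l)
        # earlier group as treated, later group as not-yet-treated control
        terms.append((f"{k:.0f} vs not-yet-treated {l:.0f}", nn * (d(k) - d(l)) * (1 - d(k)),
                      dd_2x2(start == k, start == l, t < k, mid)))
        # later group as treated, earlier group as already-treated control
        terms.append((f"{l:.0f} vs already-treated {k:.0f}", nn * d(l) * (d(k) - d(l)),
                      dd_2x2(start == l, start == k, mid, t >= l)))

raw = np.array([w for _, w, _ in terms])
weights = raw / raw.sum()
estimates = np.array([b for _, _, b in terms])
beta_decomposed = weights @ estimates

for (label, _, b), w in zip(terms, weights):
    print(f"{label:32s} weight={w:.3f}  DD={b:.3f}")
print(f"TWFE={beta_twfe:.6f}  weighted 2x2 average={beta_decomposed:.6f}")
```

The last comparison, where an already-treated group serves as the control, is the one that causes trouble when effects change over time.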
Importance
A lot of us use DD estimators to estimate models with different treatment timing without understanding how the method works and when it doesn't.
Background
Single-coefficient two-way fixed effects estimates are biased when treatment effects are not uniform, in which case you need something like an event-study specification. The paper argues that the underlying research design, however, is still a DD with variation in treatment timing.
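A small noise-free simulation illustrates the bias. It is a sketch under assumed values: two hypothetical timing groups, no never-treated units, and a treatment effect that grows by one unit per period of exposure. None of these numbers come from the paper.

```python
import numpy as np

T = 10
t = np.arange(T)

# Hypothetical design: 5 units treated from period 3, 5 units treated from period 7
start = np.repeat([3, 7], 5)
D = (t[None, :] >= start[:, None]).astype(float)

# Treatment effect grows by 1 for every period of exposure (assumed for illustration)
effect = np.clip(t[None, :] - start[:, None] + 1, 0, None)
unit_fe = np.linspace(-1, 1, start.size)[:, None]
time_fe = 0.3 * t[None, :]
y = unit_fe + time_fe + effect  # no noise, so any gap is pure bias


def double_demean(x):
    """Remove unit and time means (valid for a balanced panel)."""
    return x - x.mean(axis=1, keepdims=True) - x.mean(axis=0, keepdims=True) + x.mean()


D_tilde = double_demean(D)
beta_twfe = (D_tilde * y).sum() / (D_tilde ** 2).sum()
true_att = effect[D == 1].mean()  # average effect over all treated unit-periods

print(f"single-coefficient TWFE estimate: {beta_twfe:.2f}")
print(f"average effect on treated observations: {true_att:.2f}")
```

In this design the single coefficient falls well below the average effect on treated observations, because the early group's still-growing effect is differenced out when it acts as a control for the late group.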
Methodology
Under parallel trends and treatment effects that are constant over time, the two-way fixed effects DD estimand is the variance-weighted average treatment effect on the treated (VWATT).
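For reference, here is a sketch of the paper's decomposition of the probability limit of the two-way fixed effects coefficient. The notation is an assumption of these notes and should be checked against the paper: $\sigma$ denotes the probability limits of the decomposition weights, $U$ the never-treated group, $k$ an earlier-treated and $l$ a later-treated group, and $MID(k,l)$ the window in which only $k$ is treated.

```latex
\operatorname*{plim}_{N \to \infty} \hat{\beta}^{DD} = VWATT + VWCT - \Delta ATT

VWATT = \sum_{k \neq U} \sigma_{kU}\, ATT_k\big(POST(k)\big)
      + \sum_{k \neq U} \sum_{l > k} \Big[ \sigma_{kl}^{k}\, ATT_k\big(MID(k,l)\big)
      + \sigma_{kl}^{l}\, ATT_l\big(POST(l)\big) \Big]

\Delta ATT = \sum_{k \neq U} \sum_{l > k} \sigma_{kl}^{l}
      \Big[ ATT_k\big(POST(l)\big) - ATT_k\big(MID(k,l)\big) \Big]
```

VWCT is the analogous variance-weighted sum of differences in counterfactual trends, which is zero under parallel trends. $\Delta ATT$ is zero when each group's treatment effect is constant over time, which leaves VWATT as the estimand.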
To evaluate changes in estimates across specifications, you can use an Oaxaca-Blinder-Kitagawa decomposition to determine how much of the difference is due to changes in the 2x2 DD estimates, changes in the weights, or the interaction of the two.
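A minimal sketch of that decomposition, assuming each specification has already been reduced to a vector of 2x2 DD estimates and a matching vector of weights. The numbers below are hypothetical placeholders, not results from the paper.

```python
import numpy as np

# Hypothetical 2x2 DD estimates and weights under a baseline specification
beta_base = np.array([-4.0, -1.5, -2.5])
s_base = np.array([0.50, 0.30, 0.20])

# Hypothetical values under an alternative specification (same 2x2 comparisons)
beta_alt = np.array([-3.0, -1.0, -2.0])
s_alt = np.array([0.40, 0.35, 0.25])

# Oaxaca-Blinder-Kitagawa split of the change in the overall estimate
due_to_dd = s_base @ (beta_alt - beta_base)              # 2x2 estimates change, weights fixed
due_to_weights = (s_alt - s_base) @ beta_base            # weights change, estimates fixed
interaction = (s_alt - s_base) @ (beta_alt - beta_base)  # both change together

total = s_alt @ beta_alt - s_base @ beta_base

print(f"total change:       {total:.3f}")
print(f"due to 2x2 DDs:     {due_to_dd:.3f}")
print(f"due to weights:     {due_to_weights:.3f}")
print(f"due to interaction: {interaction:.3f}")
```

The three parts add up to the total change by construction, so the split shows whether a new specification moves the estimate by changing the underlying comparisons or by reweighting them.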
The paper also suggests a balance test to check whether the identifying assumption (essentially the parallel trends assumption) holds.
Results
The method is applied to a replication of Stevenson and Wolfers's (2006) paper on divorce and female suicide rates. Over one-third of the identifying variation comes from treatment timing, and the rest comes from treated/control comparisons.
The effect varies over time, so the single-coefficient DD estimate of about -3 suicides per million women understates the effect; the event-study specification gives an estimate of around -5.