I was reading up on Gerber & Green over the holiday, and I saw a lot of warnings about mis-analyzing blocked designs. I think the main concerns are when the treatment assignment is correlated with the outcome (pg 116) (as it should be!) and when the probability of assignment varies by block (pg 76).
One thing I saw heavily emphasized in the text is that blocking is particularly useful in small samples because it requires less covariate adjustment. However, in other designs I usually see the estimators using fixed_effects = blocks, which seems to subvert some of the usefulness of blocking in the first place.
As an experimenter, is this something I should be worried about when using small sample sizes? Or is there a greater point I’m missing?
Is there another built-in way in DD that I should consider for evaluating blocked designs (perhaps the weights argument)? If so, what’s the best way to do this for a simple design?
Hi John Henry,
This is a really important question!
The most straightforward way to analyze a blocked design in which different blocks have different probabilities of assignment is to estimate the average treatment effect in each block separately using difference in means, then aggregate the block-level estimates up to the average treatment effect, weighting each block-level estimate by that block's share of the total sample. As Gerber and Green show, this split-apply-combine approach is numerically equivalent to inverse probability weighting (IPW) each unit, that is, weighting each unit by one over the probability of the condition it actually ended up in. Using the “cond_prob” values returned by randomizr or DeclareDesign, constructing those weights is straightforward.
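To make the equivalence concrete, here is a minimal sketch in base R. The data are made up for illustration, and the column names (block, Z, Y, cond_prob, ipw) are my own choices rather than anything required by the packages. It assumes complete random assignment within each block, so that the share treated in a block is the probability of treatment by design.

```r
# Toy blocked experiment (made-up numbers): block A treats 2 of 4 units,
# block B treats 2 of 6, so the probability of assignment differs by block.
dat <- data.frame(
  block = rep(c("A", "B"), times = c(4, 6)),
  Z     = c(1, 1, 0, 0,  1, 0, 0, 0, 0, 1),
  Y     = c(5, 7, 2, 4,  9, 3, 4, 5, 4, 11)
)

# Probability of treatment in each unit's block (share treated, fixed by design)
p_treat <- ave(dat$Z, dat$block)

# Probability of the condition each unit actually received, and its inverse
dat$cond_prob <- ifelse(dat$Z == 1, p_treat, 1 - p_treat)
dat$ipw <- 1 / dat$cond_prob

# 1. Split-apply-combine: difference in means per block,
#    then average using each block's share of the sample
block_dim <- sapply(split(dat, dat$block), function(d) {
  mean(d$Y[d$Z == 1]) - mean(d$Y[d$Z == 0])
})
block_share <- as.numeric(table(dat$block)) / nrow(dat)
ate_blocked <- sum(block_dim * block_share)

# 2. IPW: weighted difference in means across the whole sample
ate_ipw <- weighted.mean(dat$Y[dat$Z == 1], dat$ipw[dat$Z == 1]) -
  weighted.mean(dat$Y[dat$Z == 0], dat$ipw[dat$Z == 0])

# 3. The same IPW estimate as a weighted least squares coefficient
ate_wls <- unname(coef(lm(Y ~ Z, data = dat, weights = ipw))["Z"])

print(c(blocked = ate_blocked, ipw = ate_ipw, wls = ate_wls))
```

All three numbers agree. In estimatr, the corresponding calls should be difference_in_means(Y ~ Z, blocks = block, data = dat) and lm_robust(Y ~ Z, weights = ipw, data = dat), which also give you appropriate standard errors. Treat those two lines as a pointer to check against the package documentation, not as tested code.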
The alternative is to run an OLS regression of the outcome on treatment plus a series of indicators for blocks (block fixed effects). This isn’t quite unbiased for the ATE because it upweights blocks in which the probability of assignment is close to 50/50: the implicit weight on each block is proportional to its size times p(1 - p), where p is the block's probability of treatment. See Aronow and Samii in AJPS on what regression weights are up to for some intuition on this point.
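A small base R sketch of that weighting, again with made-up data and my own column names. It compares the fixed effects coefficient with a by-hand average of the block-level estimates that uses weights proportional to n * p * (1 - p), and with the sample-share weighting that targets the ATE.

```r
# Same style of toy data: block A has p = 1/2, block B has p = 1/3
dat <- data.frame(
  block = rep(c("A", "B"), times = c(4, 6)),
  Z     = c(1, 1, 0, 0,  1, 0, 0, 0, 0, 1),
  Y     = c(5, 7, 2, 4,  9, 3, 4, 5, 4, 11)
)

# OLS of outcome on treatment plus block indicators (block fixed effects)
fe_fit <- lm(Y ~ Z + factor(block), data = dat)
ate_fe <- unname(coef(fe_fit)["Z"])

# Block-level ingredients: size, probability of treatment, difference in means
by_block <- split(dat, dat$block)
n_b   <- sapply(by_block, nrow)
p_b   <- sapply(by_block, function(d) mean(d$Z))
dim_b <- sapply(by_block, function(d) {
  mean(d$Y[d$Z == 1]) - mean(d$Y[d$Z == 0])
})

# Implicit fixed effects weights: largest when p is near 0.5
w_fe <- n_b * p_b * (1 - p_b)
ate_fe_by_hand <- sum(w_fe * dim_b) / sum(w_fe)

# Sample-share weights, which target the ATE
ate_sample_share <- sum(n_b / sum(n_b) * dim_b)

print(c(fixed_effects = ate_fe,
        by_hand       = ate_fe_by_hand,
        sample_share  = ate_sample_share))
```

The first two numbers match each other and differ from the third. Block A (p = 1/2) gets more weight under fixed effects than its 40% share of the sample, so the fixed effects estimate is pulled toward block A's smaller effect.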
A good practice is to use IPW and also include the block indicators, which are just covariates after all and help to increase precision. There’s a DeclareDesign blog post about all this here:
Hope that helps!