Hi,
I recently discovered your collection of packages and think this is a wonderful project.
I looked through the existing sample designs and did not find any that match a relatively common (yet fraught) concern:
Detecting causal effects with panel data
Currently, most people use the plm package for estimation, which is fantastic in its own right. However, I was wondering whether the DeclareDesign suite can handle this concern, since the combination of fabricatr, estimatr, and DeclareDesign looks very powerful.
For example, I am assessing how elections impact trust in government at the county level with 27 years of data.
In your framework, I can see the following parameters for the setup:
Population: There is a stable population of 3082 counties
Repeated Measurement: There is a balanced panel with data for 27 years per county
Population: Assume this entire population can be split into either Republican or Democrat counties (where assignment stays constant, i.e. each county is considered either Republican or Democrat for all 27 years)
Random Assignment: Assume that an election occurs every four years and is considered a truly “exogenous shock”, i.e. all counties of a given party are randomly assigned to “winning election” (e.g. Dem County - Dem President) or “losing election” (e.g. Rep County - Dem President) [1]
Causal Model to be Tested: Right after winning an election, “trust in government” increases and then slowly falls back to baseline (and vice versa for losing an election).
[1] Of course, incumbency effects would change the probability of being assigned to winning or losing, but presumably this can be added at a later stage.
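To make the setup above concrete, here is a minimal data-generating sketch in plain Python (not DeclareDesign or fabricatr code). The county count, panel length, and four-year cycle come from the description above; everything else is an assumption of mine for illustration: the effect size `EFFECT`, the geometric decay rate `DECAY`, standard-normal baselines and noise, a fair coin flip for the election winner, and year 0 being an election year.

```python
import random

N_COUNTIES = 3082    # stable population of counties (from the setup above)
N_YEARS = 27         # balanced panel length (from the setup above)
ELECTION_EVERY = 4   # an election every four years (from the setup above)
EFFECT = 1.0         # ASSUMED jump in trust right after an election
DECAY = 0.5          # ASSUMED share of the effect that survives each extra year


def simulate_panel(n_counties=N_COUNTIES, n_years=N_YEARS, seed=1):
    """Return one row per county-year with a decaying election effect."""
    rng = random.Random(seed)
    # Party is fixed per county for the whole panel.
    party = [rng.choice(["Dem", "Rep"]) for _ in range(n_counties)]
    # ASSUMED county-specific baseline level of trust.
    baseline = [rng.gauss(0, 1) for _ in range(n_counties)]
    rows = []
    winner = None
    years_since = 0
    for year in range(n_years):
        if year % ELECTION_EVERY == 0:
            # Election treated as an exogenous coin flip (no incumbency effect).
            winner = rng.choice(["Dem", "Rep"])
            years_since = 0
        else:
            years_since += 1
        for county in range(n_counties):
            won = party[county] == winner
            # Winners get a positive shock, losers a negative one;
            # both fade geometrically back towards the baseline.
            shock = (1 if won else -1) * EFFECT * DECAY ** years_since
            trust = baseline[county] + shock + rng.gauss(0, 1)
            rows.append({
                "county": county,
                "year": year,
                "party": party[county],
                "won": won,
                "years_since_election": years_since,
                "trust": trust,
            })
    return rows


panel = simulate_panel(n_counties=5)  # small run for a quick look
print(len(panel), panel[0])
```

Incumbency effects, as in footnote [1], could presumably be added by replacing the fair coin flip with a probability that depends on the previous winner.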
Based on the above, do you think that:
- This kind of setup is easily / natively modeled with the DeclareDesign framework?
- If so, once modeled, am I correct that the fabricatr package could be used for power calculations?
- And could the estimatr package be used for robust estimation of effects?
- Finally, can the estimatr package be used to fit difference-in-differences models under such dependency constraints?
If the above is actually true, would you mind pointing me in the right direction on how to tackle this setup?
I would be happy to share the final outcome as a “Design Template” - since such setups are increasingly common in political science and policy frameworks and hopefully could be useful to others as well.
Some points of concern regarding panel data:
- For accurate inference, standard errors must be clustered at the appropriate level (e.g., County-level or State-level clusters)
- In most panel data, there are time fixed effects (due to external factors, e.g. Hurricane Katrina or the 2007 banking crisis)
- It is necessary to account for cross-sectional dependencies
- It is necessary to account for serial correlation at the county level (as well as at higher levels of the geographical hierarchy, e.g. State, Census Region) and at the block level (e.g. Republican / Democrat counties may experience unique changes over time)
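To illustrate the first two concerns and within-county serial correlation, here is a rough hand-rolled sketch in plain Python: a two-way (unit and year) within transformation on a balanced panel, followed by a single-regressor slope with a cluster-robust standard error (the basic sandwich form, without small-sample correction). This is not how plm or estimatr implement it, and it does not address cross-sectional dependence. The toy data, the names `make_toy_panel`, `two_way_demean`, and `fe_slope_clustered`, the fixed winner sequence, and `TRUE_EFFECT` are all assumptions for illustration.

```python
import random
from collections import defaultdict

TRUE_EFFECT = 2.0  # ASSUMED effect of winning on trust (illustrative only)


def make_toy_panel(n_units=40, n_years=12, noise_sd=1.0, seed=7):
    """Balanced toy panel: unit effects + year shocks + effect of winning."""
    rng = random.Random(seed)
    winners = ["Dem", "Rep", "Dem"]  # ASSUMED winner of each four-year cycle
    unit_fx = [rng.gauss(0, 1) for _ in range(n_units)]
    year_fx = [rng.gauss(0, 1) for _ in range(n_years)]
    panel = []
    for u in range(n_units):
        party = "Dem" if u % 2 == 0 else "Rep"
        for t in range(n_years):
            won = 1.0 if party == winners[(t // 4) % len(winners)] else 0.0
            trust = (unit_fx[u] + year_fx[t] + TRUE_EFFECT * won
                     + rng.gauss(0, noise_sd))
            panel.append({"unit": u, "year": t, "won": won, "trust": trust})
    return panel


def two_way_demean(panel, key):
    """Subtract unit and year means, add back the grand mean (balanced panel)."""
    unit_vals, year_vals = defaultdict(list), defaultdict(list)
    for r in panel:
        unit_vals[r["unit"]].append(r[key])
        year_vals[r["year"]].append(r[key])
    unit_mean = {u: sum(v) / len(v) for u, v in unit_vals.items()}
    year_mean = {t: sum(v) / len(v) for t, v in year_vals.items()}
    grand = sum(r[key] for r in panel) / len(panel)
    return [r[key] - unit_mean[r["unit"]] - year_mean[r["year"]] + grand
            for r in panel]


def fe_slope_clustered(panel, cluster_key="unit"):
    """Two-way fixed-effects slope of trust on won, with clustered SE."""
    x = two_way_demean(panel, "won")
    y = two_way_demean(panel, "trust")
    sxx = sum(xi * xi for xi in x)
    beta = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    # Sum the score x * residual within each cluster, then square and add up.
    score = defaultdict(float)
    for r, xi, yi in zip(panel, x, y):
        score[r[cluster_key]] += xi * (yi - beta * xi)
    variance = sum(s * s for s in score.values()) / sxx ** 2
    return beta, variance ** 0.5


beta, se = fe_slope_clustered(make_toy_panel())
print(round(beta, 3), round(se, 3))
```

Clustering at a higher level (e.g. State) would only require a row key for that level passed as `cluster_key`. Note that the winner must change across cycles, otherwise `won` is absorbed by the county fixed effect and the slope is not identified.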