Gentle introduction to DeclareDesign (by Dorothy Bishop)

Gentle introduction to DeclareDesign

DVM Bishop
GitHub: oscci
Twitter: deevybee
22nd September 2018

Huge thanks to Dorothy Bishop for writing this guide. Dorothy gave the DeclareDesign and DesignLibrary packages a spin and noticed how much of the action goes on behind the scenes. DeclareDesign uses a pipe structure to pass dataframes along from step to step, but this pipe structure is mostly invisible making it very hard to see what DeclareDesign is actually doing. Very kindly, in this guide Dorothy walks you through the pipe structure, showing what happens when a design gets “run” and a diagnosis gets implemented.

For getting set up to use DeclareDesign, see this website.

We’ll be looking at a simplified version of the code that you can find here. For the sake of clarity, I’ve modified the script in parts.

The script analyses the effectiveness of an experimental design that is used to assess an intervention effect by comparing the pre- vs post-test difference score for treated vs untreated cases. For instance, you could think of this as testing the effectiveness of brain training on a memory test. We take a sample of people and measure their memory at time 1. Half of them are then given the brain-training and the others have a control intervention. Memory is measured again at time 2. If the training is effective, we expect the improvement in memory from time 1 to time 2 to be higher for the trained than for the untrained people.

Note that using change scores is not the optimal way to analyse intervention data, but it provides a simple example of how the package works. Later on we shall see how DeclareDesign makes it easy to compare different approaches to analysing trial data, in terms of their efficiency and power.

The user is asked to estimate certain features of the results of the study, so that data can be simulated with those characteristics. This can then be used to evaluate how effective the design is in testing the hypothesis of interest - in this case the treatment effect.

Before we go further, you should include the R packages you need. If you have not already installed these, you will need to do that first. Go to Tools|Install Packages on the menu bar and type the names of the three packages that have ‘library’ commands below. Then run the lines with those commands.


DeclareDesign differentiates four aspects of experimental design that are not always distinguished. The first of these is the Model of the world (M). In our example, the Model for the study requires us to specify a sample size (N), and pre- and post-test scores for those who do and don’t have the intervention.

It makes life simpler if we think of the test scores as normal random deviates - i.e., z-scores - where the initial pre-test values have mean of zero and SD of one.

Regardless of intervention, we expect people’s scores to correlate between time 1 and time 2: i.e., there will some stability in rank ordering of people’s memory skills across the two measurement occasions. You have to provide an estimate of this relationship as a correlation coefficient.

In addition, an intervention effect will be expected just for the treated group: you have to provide an estimate of this in standard deviation units - i.e. as Cohen’s d. In the script this is referred to as the average treatment effect (ate).

In intervention studies, it is very rare for all the participants to remain in the study up to follow-up. A final parameter to be estimated is the attrition rate - i.e. the proportion of cases that will be lost to follow-up.

Let’s start by specifying these sample characteristics for our Model.

N <- 100 # sample size
ate <- 0.25 # average treatment effect
sd_1 <- 1 # SD for group 1
sd_2 <- 1 # SD for group 2
rho <- 0.5 # correlation between time1 and time2
attrition_rate <- .1 # no outcome data for N*attrition_rate subjects

In the next chunk of code, we input these parameters into our model. When we do this using the declare_population() function, we create a data frame which contains simulated data. Usually, when we run scripts from DeclareDesign, we would not see the data frame that is gradually built up as the script proceeds, but to improve understanding of the functions, we will inspect the simulated data for each chunk, looking just at the first few rows that are created.

# M: Model
population <- declare_population(
  N    = N,
  u_t1 = rnorm(N)*sd_1,
  u_t2 = rnorm(N, rho * u_t1, sqrt(1 - rho^2))*sd_2

df1 <- population()
ID u_t1 u_t2
001 0.3722855 -0.4981104
002 -0.4700278 -0.3207423
003 -0.7181537 -2.0456900
004 0.7210037 1.4360184
005 -1.3461157 -1.6070232
006 0.2275517 -0.3872682

You will see that we have a data frame with four columns. There are 100 rows (or as many as you have specified in N), but we just display the first six. The first column is just a subject identifier. We then have the two columns u_t1 and u_t2. You can think of these as the underlying variables from which the observed test scores will be derived.

We specified that there should be a correlation of rho between u_t1 and u_t2. Let’s see what this looks like:

mytitle <- paste0('r = ', round(cor(df1$u_t1, df1$u_t2), 3))
plot(df1$u_t1, df1$u_t2, main = mytitle, col = 2, pch = 16)


It is unlikely that the correlation will be exactly the one you specified. That is because we are taking a sample from a population. You can think of the population as an enormous number of paired observations of u_t1 and u_t2 from which we are just selecting a subset of pairs. The larger our sample, the closer the observed correlation will be to the value of rho that was specified. Unless you set the random number generator to give the same answer every time (see set.seed command), the specific values of u_t1 and u_t2 will change every time you run the script.

Next, we need to define the potential outcomes of the study – i.e., the outcome that unit 1 exhibits when assigned to the treatment versus when they are assigned to the control. We first define the treatment and control potential outcomes at the pre-test phase, (Y_t1). In the chunk below, you’ll notice that our pretest score potential outcomes, Y_t1_Z_0 and Y_t1_Z_1, are identical to u_t1. Although this may seem redundant, there are two good reasons to specify the outcomes in that way. First, specifying the pre-treatment outcomes as identical in the pre-test period makes explicit our assumption that the treatment only affects outcomes in the second period. Thus, we’ve explicitly ruled out anticipation effects.

Second, in general it is helpful to keep separate the original underlying variables that are generated and the observed variables that we will enter into the analysis. This provides the flexibility to have observed variables that are not normally distributed. We could, for example, generate Y_t1 as a binary variable through a probit transformation of u_t1 by changing the declaration of Y_t1 in our population to draw_binary(N = N,latent = u_t1,link = "probit").[1]

Y_t2 is not a simple function of u_t2, because it depends on the intervention. For those who have the intervention, we need to add the average treatment effect, ate, to u_t2 whenever the treatment condition, Z, is specified as 1. We have not yet specified which individuals have treatment, but we create two new columns giving potential outcomes that tell us what would happen if a case were treated or untreated.

Again, we will inspect the first few rows, and see the new columns created by the data frame. We’ll also plot the values for the potential outcomes to observe the treatment effect. Note that the points for the treated condition are simply increased by an amount equivalent to ate. When we come to the data strategy step below, we will allocate cases as treated or not treated, but at the model stage, we are just representing potential outcomes.

potential_outcomes <- declare_potential_outcomes(
  Y_t1_Z_0 = u_t1,
  Y_t1_Z_1 = u_t1,
  Y_t2_Z_0 = u_t2,
  Y_t2_Z_1 = u_t2 + ate)

df1 <- potential_outcomes(df1)
ID u_t1 u_t2 Y_t1_Z_0 Y_t1_Z_1 Y_t2_Z_0 Y_t2_Z_1
001 0.3722855 -0.4981104 0.3722855 0.3722855 -0.4981104 -0.2481104
002 -0.4700278 -0.3207423 -0.4700278 -0.4700278 -0.3207423 -0.0707423
003 -0.7181537 -2.0456900 -0.7181537 -0.7181537 -2.0456900 -1.7956900
004 0.7210037 1.4360184 0.7210037 0.7210037 1.4360184 1.6860184
005 -1.3461157 -1.6070232 -1.3461157 -1.3461157 -1.6070232 -1.3570232
006 0.2275517 -0.3872682 0.2275517 0.2275517 -0.3872682 -0.1372682

We now move to the Inquiry part of the model. Here we just have to specify the true underlying quantity we want to estimate. In this case, we specify two. The first is the average treatment effect in the post-test period (ATE), defined as the true mean post-test difference between the treatment and control potential outcomes of the dependent variable, Y. We also define the true underlying difference-in-differences (DiD) as an estimand, i.e., the true average difference in pre-post differences between treatment and control. In fact, because potential outcomes are identical in the pre-test period, these estimands are equivalent to one another. If we incorporated a violation of the no-anticipation effects assumption, for example, the estimands would no longer be equivalent. Specifying our inquiry thus helps clarify the assumptions on which our inferences are based.

# I: Inquiry
estimand <- declare_estimand(ATE = mean(Y_t2_Z_1 - Y_t2_Z_0),
                             DiD = mean((Y_t2_Z_1 - Y_t1_Z_1) - (Y_t2_Z_0 - Y_t1_Z_0)))
estimand_label estimand
ATE 0.25
DiD 0.25

The next step is termed the Data Strategy. At this step we determine how individuals are assigned to groups, and we also take into account any sample attrition. In effect, we determine which cases will feature in which groups in the analysis.

By looking at how df1 changes at each step of the script, we can see what is being achieved.

First, we declare how cases are assigned to treatments (Z). The variable Z_cond_prob is the conditional probability that a given case is in the treatment group, given it was assigned to treatment (and vice versa for control).[2] The default is for 50% to be assigned to each of two groups. In effect, the script acts to toss a coin that determines assignment of each case to Z = 1 or Z = 0.

# D: Data Strategy
assignment <- declare_assignment(prob = .5)

df1 <- assignment(df1)
ID u_t1 u_t2 Y_t1_Z_0 Y_t1_Z_1 Y_t2_Z_0 Y_t2_Z_1 Z Z_cond_prob
001 0.3722855 -0.4981104 0.3722855 0.3722855 -0.4981104 -0.2481104 0 0.5
002 -0.4700278 -0.3207423 -0.4700278 -0.4700278 -0.3207423 -0.0707423 1 0.5
003 -0.7181537 -2.0456900 -0.7181537 -0.7181537 -2.0456900 -1.7956900 1 0.5
004 0.7210037 1.4360184 0.7210037 0.7210037 1.4360184 1.6860184 1 0.5
005 -1.3461157 -1.6070232 -1.3461157 -1.3461157 -1.6070232 -1.3570232 0 0.5
006 0.2275517 -0.3872682 0.2275517 0.2275517 -0.3872682 -0.1372682 0 0.5

Next, we take into account the attrition rate. New columns are added to the simulation: R specifies whether or not the case is lost to follow-up, according to the attrition rate we specified. Since attrition is set to only 10%, you probably will only see cases where R = 1 in the brief part of the data frame that is shown below, but 10% of cases will have R set to zero. (If you wish you can inspect all values by typing df1$R at the console).

report <- declare_assignment(prob = (1 - attrition_rate),
                             assignment_variable = R)

df1 <- report(df1)
ID u_t1 u_t2 Y_t1_Z_0 Y_t1_Z_1 Y_t2_Z_0 Y_t2_Z_1 Z Z_cond_prob R R_cond_prob
001 0.3722855 -0.4981104 0.3722855 0.3722855 -0.4981104 -0.2481104 0 0.5 1 0.9
002 -0.4700278 -0.3207423 -0.4700278 -0.4700278 -0.3207423 -0.0707423 1 0.5 1 0.9
003 -0.7181537 -2.0456900 -0.7181537 -0.7181537 -2.0456900 -1.7956900 1 0.5 1 0.9
004 0.7210037 1.4360184 0.7210037 0.7210037 1.4360184 1.6860184 1 0.5 1 0.9
005 -1.3461157 -1.6070232 -1.3461157 -1.3461157 -1.6070232 -1.3570232 0 0.5 1 0.9
006 0.2275517 -0.3872682 0.2275517 0.2275517 -0.3872682 -0.1372682 0 0.5 1 0.9

Next, we need to create the actual values of Y_t1 and Y_t2 that apply with this assignment - for cases where Z is 0 we use the values of Y_t1_Z_0 and Y_t2_Z_0, and for cases where Z is 1 we use the value of Y_t1_Z_1 and Y_t2_Z_1. The declare_reveal function automatically reveals the potential outcomes corresponding to a given assignment of treatment.

reveal_t1 <- declare_reveal(Y_t1) 
reveal_t2 <- declare_reveal(Y_t2) 

df1 <- reveal_t1(df1)
df1 <- reveal_t2(df1)

ID u_t1 u_t2 Y_t1_Z_0 Y_t1_Z_1 Y_t2_Z_0 Y_t2_Z_1 Z Z_cond_prob R R_cond_prob Y_t1 Y_t2
001 0.3722855 -0.4981104 0.3722855 0.3722855 -0.4981104 -0.2481104 0 0.5 1 0.9 0.3722855 -0.4981104
002 -0.4700278 -0.3207423 -0.4700278 -0.4700278 -0.3207423 -0.0707423 1 0.5 1 0.9 -0.4700278 -0.0707423
003 -0.7181537 -2.0456900 -0.7181537 -0.7181537 -2.0456900 -1.7956900 1 0.5 1 0.9 -0.7181537 -1.7956900
004 0.7210037 1.4360184 0.7210037 0.7210037 1.4360184 1.6860184 1 0.5 1 0.9 0.7210037 1.6860184
005 -1.3461157 -1.6070232 -1.3461157 -1.3461157 -1.6070232 -1.3570232 0 0.5 1 0.9 -1.3461157 -1.6070232
006 0.2275517 -0.3872682 0.2275517 0.2275517 -0.3872682 -0.1372682 0 0.5 1 0.9 0.2275517 -0.3872682

Now we compute the difference score for each individual, which is the statistic we are using in our analysis. This is computed as the difference between Y at time 1 and Y at time 2. The beeswarm plot below shows the scores for all cases in the two groups. Our specified attrition rate will determine the proportion of cases lost to follow-up, shown in the plot as black dots. The mean effect size of treatment is shown both for the complete sample and the available sample after excluding those lost to attrition. Note this is unlikely to be exactly equal to the average treatment effect (ate), because the estimate is based on a sample, whereas ate is defined in terms of the population.

manipulation <- declare_step(difference = (Y_t2 - Y_t1), handler = fabricate)  
df1 <- manipulation(df1)

ID u_t1 u_t2 Y_t1_Z_0 Y_t1_Z_1 Y_t2_Z_0 Y_t2_Z_1 Z Z_cond_prob R R_cond_prob Y_t1 Y_t2 difference
001 0.3722855 -0.4981104 0.3722855 0.3722855 -0.4981104 -0.2481104 0 0.5 1 0.9 0.3722855 -0.4981104 -0.8703959
002 -0.4700278 -0.3207423 -0.4700278 -0.4700278 -0.3207423 -0.0707423 1 0.5 1 0.9 -0.4700278 -0.0707423 0.3992855
003 -0.7181537 -2.0456900 -0.7181537 -0.7181537 -2.0456900 -1.7956900 1 0.5 1 0.9 -0.7181537 -1.7956900 -1.0775362
004 0.7210037 1.4360184 0.7210037 0.7210037 1.4360184 1.6860184 1 0.5 1 0.9 0.7210037 1.6860184 0.9650147
005 -1.3461157 -1.6070232 -1.3461157 -1.3461157 -1.6070232 -1.3570232 0 0.5 1 0.9 -1.3461157 -1.6070232 -0.2609075
006 0.2275517 -0.3872682 0.2275517 0.2275517 -0.3872682 -0.1372682 0 0.5 1 0.9 0.2275517 -0.3872682 -0.6148200
mydiff <- round(mean(df1$difference[df1$Z==1])-mean(df1$difference[df1$Z==0]),3)
mytemp <- filter(df1, R==1)
mean1 <- mean(mytemp$difference[mytemp$Z==1])
mean0 <- mean(mytemp$difference[mytemp$Z==0])
mydiff2 <- round((mean1-mean0),3)
mytitle <- paste0('Average effect without attrition= ', mydiff,'\nAverage effect with attrition=',mydiff2)
beeswarm(df1$difference~df1$Z,xlab='Treatment',pwcol=(df1$R+1),pch=16,main=mytitle,ylab='Memory score difference')

segments(.8, mean0, x1=1.2,y1 = mean0,lty=1,col=1,lwd=2) #add line for mean
segments(1.8, mean1, x1=2.2,y1 = mean1,lty=1,col=1,lwd=2)
text(1.5,-1,'Black dots are\n cases lost to \nfollow-up\n(R = 0)')


The final stage is the Answer strategy. This specifies the statistical procedure used to evaluate the inquiry, using the data simulated from the model. In the chunk below, we can see the output from the Answer strategy. You can if you wish view the computed variable mylm, which shows that the output from estimated_effect(df1) is identical to the result obtained by running the code for the robust regression analysis on df1, after excluding those with values of R = 0 (i.e., the cases lost to follow-up).

# A: Answer Strategy
# Use difference score between pretest and posttest
estimated_effect <- declare_estimator(
  difference ~ Z,
  model = lm_robust,
  estimand = c("ATE", "DiD"),
  subset = R == 1,
  label = "Change score"

kable(estimated_effect(df1)) # print the results of the regression
estimator_label term estimate std.error statistic p.value conf.low conf.high df outcome estimand_label
Change score Z 0.4070568 0.1904039 2.137859 0.0353006 0.028669 0.7854446 88 difference ATE
Change score Z 0.4070568 0.1904039 2.137859 0.0353006 0.028669 0.7854446 88 difference DiD
mylm <- lm_robust(formula = difference ~ Z, data = subset(df1,R == 1))

Evaluating the design

We’ve worked through the analysis building up simulated data for one run, but the principal purpose of this exercise is to evaluate the design by repeatedly running the simulation so that the distribution of statistics can be obtained.

To do this, you specify the design by creating a list of all the processes that were involved above, joined by +.

The diagnose_design() function then computes various diagnostic statistics for that design. The default number of simulations is 500. This chunk of code takes time to run, with the amount of time dependent on the number of simulations.

We can then see the diagnostic statistics for the design. In this script we’ve assigned the diagnostic information to a variable, diagnosis1, so we can display it differently, but usually you’d just have this displayed automatically on the console, using the command diagnose_design(pretest_posttest_design).

# Design
pretest_posttest_design <- population + potential_outcomes + estimand +
  assignment + reveal_t1 + reveal_t2 + report + manipulation +

diagnosis1 <- diagnose_design(pretest_posttest_design)
Design Label Estimand Label Estimator Label Term N Sims Bias RMSE Power Coverage Mean Estimate SD Estimate Mean Se Type S Rate Mean Estimand
pretest_posttest_design ATE Change score Z 500 0.02 0.22 0.25 0.93 0.27 0.22 0.21 0.01 0.25
(0.01) (0.01) (0.02) (0.01) (0.01) (0.01) (0.00) (0.01) (0.00)
pretest_posttest_design DiD Change score Z 500 0.02 0.22 0.25 0.93 0.27 0.22 0.21 0.01 0.25
(0.01) (0.01) (0.02) (0.01) (0.01) (0.01) (0.00) (0.01) (0.00)

With the design in hand, we can also use it to quickly generate data, estimates, and estimands. You can try running the code below multiple times. You’ll get different answers because the experiment is being simulated from start to finish each time.

ID u_t1 u_t2 Y_t1_Z_0 Y_t1_Z_1 Y_t2_Z_0 Y_t2_Z_1 Z Z_cond_prob Y_t1 Y_t2 R R_cond_prob difference
001 -0.3536787 0.8582277 -0.3536787 -0.3536787 0.8582277 1.1082277 0 0.5 -0.3536787 0.8582277 1 0.9 1.2119065
002 0.4119265 -0.5881840 0.4119265 0.4119265 -0.5881840 -0.3381840 0 0.5 0.4119265 -0.5881840 1 0.9 -1.0001106
003 -1.1111596 -0.2637439 -1.1111596 -1.1111596 -0.2637439 -0.0137439 1 0.5 -1.1111596 -0.0137439 1 0.9 1.0974157
004 -0.1743786 -0.2856948 -0.1743786 -0.1743786 -0.2856948 -0.0356948 1 0.5 -0.1743786 -0.0356948 1 0.9 0.1386838
005 0.9590143 0.7943895 0.9590143 0.9590143 0.7943895 1.0443895 1 0.5 0.9590143 1.0443895 1 0.9 0.0853752
006 0.7166857 -0.1869223 0.7166857 0.7166857 -0.1869223 0.0630777 1 0.5 0.7166857 0.0630777 0 0.1 -0.6536080
estimator_label term estimate std.error statistic p.value conf.low conf.high df outcome estimand_label
Change score Z 0.495138 0.2046965 2.418889 0.0176274 0.0883468 0.9019292 88 difference ATE
Change score Z 0.495138 0.2046965 2.418889 0.0176274 0.0883468 0.9019292 88 difference DiD
estimand_label estimand
ATE 0.25
DiD 0.25

Interpreting diagnostic information

To understand the output, it helps to read the working paper.

I’ve transposed the output and added some explanation in the chunk below.

#Add a column for explanation
myoutput$explanation[5]<-'Expected difference between estimate and estimand'
myoutput$explanation[7]<-'Root mean squared error'
myoutput$explanation[9]<-'Probability of rejecting null hypothesis of no effect'
myoutput$explanation[11]<-'Probability that  conf. interval contains estimand'
myoutput$explanation[19]<-'Probability that a significant estimate has incorrect sign'
Estimate explanation
design_label pretest_posttest_design .
estimand_label ATE .
estimator_label Change score .
term Z .
bias 0.02072121 Expected difference between estimate and estimand
se(bias) 0.009920455 .
rmse 0.2217726 Root mean squared error
se(rmse) 0.007138123 .
power 0.254 Probability of rejecting null hypothesis of no effect
se(power) 0.01851247 .
coverage 0.932 Probability that conf. interval contains estimand
se(coverage) 0.01172167 .
mean_estimate 0.2707212 .
se(mean_estimate) 0.009920455 .
sd_estimate 0.2210236 .
se(sd_estimate) 0.00714128 .
mean_se 0.2101448 .
se(mean_se) 0.0006704137 .
type_s_rate 0.007874016 Probability that a significant estimate has incorrect sign
se(type_s_rate) 0.007321023 .
mean_estimand 0.25 .
se(mean_estimand) 0 .
n_sims 500 .

Design library

The goal of this explanation is to give you some understanding of the processes involved in DeclareDesign; this will make it easier for you to modify scripts for specific purposes. In practice, however, for the main types of common design, you can run the same processes with minimal scripting by using the functions in the DesignLibrary package, which simplifies the process of using DeclareDesign.

In effect, you just have to specify the key parameters, and the rest is done for you by the pretest_posttest_designer() function. For example, if you wanted to see what would happen to power if the attrition rate increased to 20%, you could look at diagnose_design(pretest_posttest_designer(attrition_rate = .20)). Or you could compare power under three different assumptions about how well your baseline predicts the endline using diagnose_design(expand_design(pretest_posttest_designer,rho = c(.2,.5,.8))).

Note that with the pretest_posttest_designer() function, you get diagnoses for three possible ways of running the analysis. The first is ‘Change score’, which is the same as we have used above - you compare the pretest-posttest difference for the two treatment conditions. An alternative approach is to compare the groups on the posttest score, treating the pretest score as a covariate: this is labelled ‘Condition on pretest’. The final approach is to just use posttest scores, ignoring pretest data: this is ‘Posttest only’.

The Design Library is very helpful for understanding strengths and weaknesses of different approaches.

mydes <- pretest_posttest_designer(N = 100, ate = 0.25, sd_1 = 1, sd_2 = 1,
                                   rho = 0.5, attrition_rate = 0.1)
mydeslib <- diagnose_design(mydes)

X1 X2 X3
design_label mydes mydes mydes
estimand_label ATE ATE ATE
estimator_label Change score Condition on pretest Posttest only
term Z Z Z
bias 0.005895015 0.001104015 -0.004828581
se(bias) 0.008142163 0.008124608 0.010158817
rmse 0.2058560 0.1850635 0.2137697
se(rmse) 0.006757644 0.005390701 0.006427614
power 0.220 0.262 0.248
se(power) 0.01881289 0.02170804 0.01898218
coverage 0.95 0.95 0.93
se(coverage) 0.009686312 0.009628640 0.011327610
mean_estimate 0.2558950 0.2511040 0.2451714
se(mean_estimate) 0.008142163 0.008124608 0.010158817
sd_estimate 0.2059776 0.1852456 0.2139292
se(sd_estimate) 0.006735176 0.005396290 0.006391084
mean_se 0.2122948 0.1844499 0.1997280
se(mean_se) 0.0007206330 0.0006652485 0.0007557927
type_s_rate 0 0 0
se(type_s_rate) 0 0 0
mean_estimand 0.25 0.25 0.25
se(mean_estimand) 0 0 0
n_sims 500 500 500

Session information

Session information is included for technical reasons: it can help diagnose issues if a script does not run in a given environment.

## R version 3.5.1 (2018-07-02)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS High Sierra 10.13.5
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## other attached packages:
##  [1] bindrcpp_0.2.2       knitr_1.20           beeswarm_0.2.3      
##  [4] DesignLibrary_0.1.1  forcats_0.3.0        stringr_1.3.1       
##  [7] dplyr_0.7.6          purrr_0.2.5          readr_1.1.1         
## [10] tidyr_0.8.1          tibble_1.4.2         ggplot2_3.0.0       
## [13] tidyverse_1.2.1      DeclareDesign_0.11.0 estimatr_0.11.0     
## [16] fabricatr_0.5.0      randomizr_0.16.1    
## loaded via a namespace (and not attached):
##  [1] tidyselect_0.2.4   listenv_0.7.0      haven_1.1.2.9000  
##  [4] lattice_0.20-35    colorspace_1.3-2   htmltools_0.3.6   
##  [7] yaml_2.2.0         rlang_0.2.1        pillar_1.3.0      
## [10] glue_1.3.0         withr_2.1.2        modelr_0.1.2      
## [13] readxl_1.1.0       bindr_0.1.1        plyr_1.8.4        
## [16] munsell_0.5.0      gtable_0.2.0       cellranger_1.1.0  
## [19] rvest_0.3.2        future_1.9.0       codetools_0.2-15  
## [22] evaluate_0.11      parallel_3.5.1     highr_0.7         
## [25] broom_0.5.0        Rcpp_0.12.18       scales_0.5.0      
## [28] backports_1.1.2    jsonlite_1.5       hms_0.4.2         
## [31] digest_0.6.15      stringi_1.2.4      grid_3.5.1        
## [34] rprojroot_1.3-2    cli_1.0.0          tools_3.5.1       
## [37] magrittr_1.5       lazyeval_0.2.1     Formula_1.2-3     
## [40] future.apply_1.0.0 crayon_1.3.4       pkgconfig_2.0.1   
## [43] xml2_1.2.0         lubridate_1.7.4    assertthat_0.2.0  
## [46] rmarkdown_1.10     httr_1.3.1         rstudioapi_0.7    
## [49] globals_0.12.1     R6_2.2.2           nlme_3.1-137      
## [52] compiler_3.5.1
  1. Note that, generally speaking, non-linear transformations of u_t1
    and u_t2 might lead to correlations between Y_t1 and Y_t2 that
    are not equal to rho in expectation.

  2. For example, if the probability of assignment was .8, then units
    assigned to the control would have Z_cond_prob = .2 and those
    assigned to treatment would have Z_cond_prob = .8. This is useful
    for designs where you want to weight observations by their
    assignment probabilities.

1 Like

A couple of quick questions:

  1. What is the purpose of handler = fabricate in the declare_step function?

  2. Does the declare_estimator support any model that uses a formula? Is there a full list of models that work with the function?

  3. Can you expand on the relationship between the declare_step and declare_estimand?

  4. Offhand, what are other common estimands besides the d.i.m and d.i.d?

Hi JohnHenry! Thanks for the questions.

  1. What is the purpose of handler = fabricate in the declare_step function?
    All functions in DD create a function that takes data and either returns data or a summary of data. Many of these are really common steps, like assignment, that have lots of shortcuts for users (like, prob = .5).

declare_step is our generic function for creating a step that takes data, does a thing, and returns data. In this case, the function that “handles” the arguments you provide to declare_step is fabricatr::fabricate: it’s going to be just like if you wrote fabricate(difference = (Y_t2 - Y_t1),data = data). Here, we’re creating a variable, called difference, that is used as an outcome – it’s not an estimand (note that Y_t2 is the observed outcome at time 2, and Y_t1 is the observed outcome at time 1).

  1. Does the declare_estimator support any model that uses a formula? Is there a full list of models that work with the function?

declare_estimator will produce nice and correct output for any model that has a tidy method associated with it – see here: (list is non-exhaustive as some packages have tidy methods built in them). see ?DeclareDesign::declare_estimator for how to give models to declare_estimator without tidy methods. In principle, it can handle anything that you can make a data.frame out of (see, for example, the crazy estimators in this post:

  1. Can you expand on the relationship between the declare_step and declare_estimand ?

declare_step takes data, does stuff to it, and returns data. declare_estimand takes data, and usually returns a summary of unobserved potential outcomes (e.g., the average of individual’s treatment effects in the sample, or SATE). Here, it probably looks like they’re similar because difference is an outcome variable. For more on tradeoffs using change-scores (differences) as outcomes versus controlling for baseline, see:

  1. Offhand, what are other common estimands besides the d.i.m and d.i.d?

First, I’m not sure I’d think of difference-in-means as a common estimand but rather as a common estimator – usually DIM is trying to shoot at the ATE, using the fact that the expectation of the difference in potential outcomes is the same as the difference in the expectations of the potential outcomes: DIM = E[Y(1)] - E[Y(0)] = E[Y(1)-Y(0)] = ATE (in an experiment where a few assumptions are met).

So, ATE is a common estimand – but there’s a difference about where you define it: Sample ATEs are different from Population ATEs, and may be very different depending on how the sample and population are defined.

Then there’s the big class of conditional ATEs, such as the ATE conditional on being a “complier”. See the literature on principal stratification, and relatedly, the notion of a “local” average treatment effect evoked in the literature on IV:

More exotic estimands can be found, such as the difference in modes or medians when you have categorical variables:

We discuss some even crazier estimands in the DD paper:
Such as a function f* that you try to estimate (p28).

Hope this helps!

1 Like

thank you for this tutorial. I found it very helpul, since I still don’t completely get what parameters and steps are necessery if I want to build my own design.
Are you planning on sharing further gentle intros?

Hi I wrote a pretty gentle intro I’d like to get feedback on, maybe it would be helpful for you:

1 Like

@johnhenry The diagram you made showing all the functions is fantastic, it would work nicely in a cheat-sheet / reference card format as well.