I have been using DeclareDesign to set up a multi-armed design where:
- the DV has a negative binomial distribution
- glm.nb from MASS is used for the estimator.
My main question is how best to make comparisons between arms in my estimators?
Following the patterns in the example for multi-armed experiments, I have been using the subset argument to glm.nb to compare specific arms to other arms.
On my OSX machine with the latest R and DeclareDesign, I am able to define the following estimator (for example, to compare the third arm to the first):
declare_estimator(formula = Y ~ Z == 2, estimand="ate_Y_2_0",
subset = data$Z %in% c(0,2) ,
model = glm.nb, label="Y_2_0.negbin")
However, when I run this code on an Ubuntu 14.04 machine with exactly the same R version, DeclareDesign version, and dplyr version, I get the following error:
ERROR while rich displaying an object: Error in data$Z: object of type āclosureā is not subsettable
According to this StackOverflow thread, the error could be a consequence of DeclareDesign having the convention of using data
as the variable name for data. Iāve tried a few different ways of defining the subset criteria, and all of them produce errors on the Ubuntu machine.
The purpose of this question: my main goal is to specify a multi-armed experiment that involves a negative binomial estimator, with code that runs reliably on different machines. Iām still learning DeclareDesign, and there may be a better way to set this up. If a particular approach will get me to my goal without substantial debugging, I would be very happy. I have included full details, just in case a solution does require debugging.
Full code example (This code runs on OSX but not on Ubuntu)
#Note: I know the estimand that uses effect.a.b.irr is incorrectly calculated. I was working to test some potential corrections on a faster machine when I encountered this problem
#Note: I have reduced the n.size to speed up the debugging
library(MASS)
library(DeclareDesign)
# Return the difference in mu associated with an incidence rate ratio
# from a negative binomial model. This difference can then be used to
# simulate a negative binomial distribution for the effect of a given irr
#
#` @param mu The baseline mu in question
#` @param irr The incidence rate ratio in question
mu.diff.from.mu.irr <- function(mu, irr){
mu*(irr-1)
}
effect.a.irr <- 1.15
effect.b.irr <- 1.1
baseline.mu <- 100
baseline.theta = 0.155
effect.a.b.irr <- (baseline.mu + mu.diff.from.mu.irr(mu.baseline,effect.b.irr)) / (baseline.mu + mu.diff.from.mu.irr(baseline.mu,effect.a.irr))
n.size = 10000
design <-
declare_population(
N = n.size,
group = draw_binary(N, prob=0.5),
u = baseline.mu,
theta = baseline.theta
) +
declare_potential_outcomes(
Y_Z_0 = rnegbin(n = N,mu=u ,theta=theta),
Y_Z_1 = rnegbin(n=N,mu= u + mu.diff.from.mu.irr(u,effect.a.irr), theta=theta),
Y_Z_2 = rnegbin(n=N,mu= u + mu.diff.from.mu.irr(u,effect.b.irr), theta=theta)
) +
declare_assignment(num_arms = 3,
conditions = c("0", "1", "2"),
assignment_variable = Z) +
declare_estimand(ate_Y_1_0 = log(effect.a.irr)) +
declare_estimand(ate_Y_2_0 = log(effect.b.irr)) +
declare_estimand(ate_Y_2_1 = log(effect.a.b.irr)) +
declare_reveal(assignment_variables = Z) +
declare_estimator(formula = Y ~ Z, estimand="ate_Y_1_0", subset = data$Z %in% c(0,1), model = glm.nb, label="Y_1_0.negbin") +
declare_estimator(formula = Y ~ Z == 2, estimand="ate_Y_2_0", subset = data$Z %in% c(0,2) , model = glm.nb, label="Y_2_0.negbin") +
declare_estimator(formula = Y ~ Z == 2, estimand="ate_Y_2_1", subset = data$Z %in% c(1,2) , model = glm.nb, label="Y_2_1.negbin")
Appendix: About My System Configurations
The only different package is repr 0.17 on OSX vs repr 0.18 on Ubuntu
Ubuntu Machine:
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.4 LTS
Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] repr_0.18 DeclareDesign_0.12.0 estimatr_0.14
[4] fabricatr_0.6.0 randomizr_0.16.1 skimr_1.0.3
[7] forcats_0.3.0 stringr_1.3.1 purrr_0.2.5
[10] readr_1.2.1 tidyr_0.8.2 tibble_1.4.2
[13] tidyverse_1.2.1 rlang_0.3.0.1 ggplot2_3.1.0
[16] MASS_7.3-51.1 dplyr_0.7.8
OSX Machine:
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS 10.14.1
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] repr_0.17 DeclareDesign_0.12.0 estimatr_0.14
[4] fabricatr_0.6.0 randomizr_0.16.1 skimr_1.0.3
[7] forcats_0.3.0 stringr_1.3.1 purrr_0.2.5
[10] readr_1.2.1 tidyr_0.8.2 tibble_1.4.2
[13] tidyverse_1.2.1 rlang_0.3.0.1 ggplot2_3.1.0
[16] MASS_7.3-51.1 dplyr_0.7.8