DeclareDesign Community

Conditions code

I have what is probably a simple problem that I can’t figure out. I am randomly assigning half of all units within a block (district) to receive treatment in 1 of 2 conditions: either in the k-2 band or the 3-5 band. The following code accomplishes this just fine, but the resulting mfa_tx variable contains the values ‘1’ and ‘2’ instead of the labels given in the conditions argument.

post_blk_rand <- within(pre_rand_data, {
  mfa_tx <- block_ra(blocks = district, block_m = block_count, conditions = c("k-2 mfa", "3-5 mfa"))
  id_var <- 1:nrow(pre_rand_data)
})

However, if I don’t use the block_m argument but instead simply use prob=.5, the resulting field appears with the labels I have defined. Can someone help me understand why I can’t generate the file with the condition labels I want?

Thanks,

Jason (new to the group)

Hi Jason, welcome!

I’m not getting the same results:

blocks <- rep(c("A", "B","C"), times = c(50, 100, 200))
Z <- block_ra(blocks = blocks, block_m = c(20, 30, 40), conditions = c("control", "treatment"))
table(Z, blocks)

           blocks
Z             A   B   C
  control    30  70 160
  treatment  20  30  40

Z <- block_ra(blocks = blocks, prob = 0.5, conditions = c("control", "treatment"))
table(Z, blocks)

           blocks
Z             A   B   C
  control    25  50 100
  treatment  25  50 100

Is it possible that the conditions argument was left out when you switched to prob?

Hi Alex,
Thanks for the quick reply. The version I ran using prob=.5 works as intended, as your example does. But when I switch to using the block_m argument, it assigns ‘1’ and ‘2’ to the treatment variable instead of the labels.

Is it possible that it might be due to some feature of my data?

Here’s the code I use to generate the block_count vector, and my call to block_ra:
#using the block_m argument, which gives the number of units in each block to assign to tx
#first generate vector for block_m containing count of units in each block
block_count <- pre_rand_data %>% count(district)
#divide by 2 to split evenly
block_count$n2 <- ceiling(block_count$n / 2)
block_count <- pull(block_count, n2)
block_count

#now ra within block using split counts
post_blk_rand <- within(pre_rand_data, {
  mfa_tx <- block_ra(blocks = district, block_m = block_count, conditions = c("k-2 mfa", "3-5 mfa"))
  id_var <- 1:nrow(pre_rand_data)
})
post_blk_rand

Argh, I’m still not able to reproduce your bug!


library(tidyverse)
library(randomizr)
pre_rand_data <- tibble(district = rep(c("A", "B","C"), times = c(50, 100, 200)))

block_count <- pre_rand_data %>% count(district)
#divide by 2 to split evenly
block_count$n2 <- ceiling(block_count$n / 2)
block_count <- pull(block_count, n2)
block_count

#now ra within block using split counts
post_blk_rand <-
  within(pre_rand_data, {
    mfa_tx <-
      block_ra(
        blocks = district,
        block_m = block_count,
        conditions = c("k - 2 mf", "3 - 5 mf")
      )
    id_var <- 1:nrow(pre_rand_data)
  })
head(post_blk_rand)

for me yields

> head(post_blk_rand)
# A tibble: 6 x 3
  district id_var mfa_tx  
  <chr>     <int> <fct>   
1 A             1 3 - 5 mf
2 A             2 k - 2 mf
3 A             3 k - 2 mf
4 A             4 k - 2 mf
5 A             5 3 - 5 mf
6 A             6 3 - 5 mf

I’m using randomizr version 0.18.0, which is the most recent on CRAN. Sorry you’re having trouble!

It’s possible that there’s something about your data that’s causing the problem. Can you reproduce your problem with the fake data I made above?

Well, this is most frustrating, but I appreciate your help. When I run your code it works perfectly. Here’s a link to the Excel file with the data I’m working with:
file link

#import excel
#my_data <- read_excel("my_file.xlsx")
#read.xlsx version uses different package
pre_rand_data <- read_excel(file.choose())
#pre_rand_data <- read.xlsx((file.choose()), sheetName="Sheet1")
pre_rand_data

#using the block_m argument, which gives the number of units in each block to assign to tx
#first generate vector for block_m containing count of units in each block
block_count <- pre_rand_data %>% count(district)
#divide by 2 to split evenly
block_count$n2 <- ceiling(block_count$n / 2)
block_count <- pull(block_count, n2)
block_count

#now ra within block using split counts
post_blk_rand <- within(pre_rand_data, {
  mfa_tx <- block_ra(blocks = district, block_m = block_count, conditions = c("k-2 mfa", "3-5 mfa"))
  id_var <- 1:nrow(pre_rand_data)
})
post_blk_rand

Hi Alex,

Well, I think I figured out what’s causing the issue (though I don’t know the intricacies of why). For the project I’m working on, we are splitting schools by grade bands: k-2 and 3-5. We are aiming to randomly assign schools to either (a) receive treatment in their k-2 band (which means their 3-5 band serves as a control) or (b) receive treatment in their 3-5 band (which means their k-2 band serves as a control).

We’re still recruiting, and at the moment we have a single school in its own district. So I modified your small simulated data to reflect the 4 districts that we have, where district “roe48” has only a single school:

pre_rand_data <- tibble(district = rep(c("cps", "roe47", "roe48", "roe49"), times = c(30, 27, 1, 2)))

When I use block_ra with prob=.5, it labels the conditions just fine. But when I use the block_m argument, it gives me the “1” and “2” output I referenced in my original post. My best guess is that something in the internal logic behind block_m doesn’t like that singleton, but that’s a naive guess.
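For anyone who wants to try this, here is a minimal reproduction sketch of the comparison described above. It assumes randomizr is installed; the names cond_labels, z_prob, and z_m are made up for illustration. On a randomizr version that includes the later fix, both calls should return the labels, so the "1"/"2" output is only expected on the affected version.

```r
library(randomizr)

# Four districts; "roe48" is the singleton block
district <- rep(c("cps", "roe47", "roe48", "roe49"), times = c(30, 27, 1, 2))

# Units to treat per block, in sorted block order (half of each block, rounded up)
block_count <- as.vector(ceiling(table(district) / 2))

# Illustrative name for the two condition labels
cond_labels <- c("k-2 mfa", "3-5 mfa")

# Same design specified two ways
z_prob <- block_ra(blocks = district, prob = 0.5, conditions = cond_labels)
z_m <- block_ra(blocks = district, block_m = block_count, conditions = cond_labels)

# Compare the values each call produces
unique(as.character(z_prob))
unique(as.character(z_m))
```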

For now, I simply used an ordered function and applied the labels after the fact. Thanks a bunch for helping me out. At least now I understand why it wasn’t working!
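A relabeling step along these lines might look like the sketch below. It uses base R's factor() rather than whatever function was actually used, and it assumes that code 1 corresponds to the first entry of conditions and code 2 to the second, which is worth verifying before relying on it. The vector mfa_tx_raw is a made-up stand-in for the numeric output.

```r
# Hypothetical stand-in for the unlabeled 1/2 assignment vector
mfa_tx_raw <- c(1, 2, 2, 1, 2)

# Attach labels after the fact
# (assumes 1 = first condition, 2 = second condition)
mfa_tx <- factor(mfa_tx_raw, levels = c(1, 2), labels = c("k-2 mfa", "3-5 mfa"))

table(mfa_tx)
```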


@Jason_Schoeneberger - thank you for helping us isolate and find the bug. This will get fixed in the next version. Perhaps an even easier workaround for now is to make conditions a factor:

conditions = factor(c("k-2 mfa", "3-5 mfa"))
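Put into the full call on the four-district example, the factor workaround might look like this sketch. It assumes randomizr is installed and uses a plain data.frame in place of the tibble to keep dependencies down. Note that factor() sorts its levels alphabetically, so tables may list "3-5 mfa" before "k-2 mfa".

```r
library(randomizr)

# Same four districts as above, including the singleton "roe48"
pre_rand_data <- data.frame(
  district = rep(c("cps", "roe47", "roe48", "roe49"), times = c(30, 27, 1, 2))
)

# Units to treat per block, in sorted block order
block_count <- as.vector(ceiling(table(pre_rand_data$district) / 2))

post_blk_rand <- within(pre_rand_data, {
  mfa_tx <- block_ra(
    blocks = district,
    block_m = block_count,
    conditions = factor(c("k-2 mfa", "3-5 mfa"))  # workaround: pass a factor
  )
})

# Cross-tabulate assignment by district
table(post_blk_rand$mfa_tx, post_blk_rand$district)
```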

@Alex_Coppock - I’ve fixed this bug in the following PR - please review and merge at your convenience - https://github.com/DeclareDesign/randomizr/pull/83 - I believe this bug also happens when conditions is logical, so I’ve adjusted that as well.

v/r

Neal


Wonderful, thanks to you both for finding and fixing this. #opensource FTW!

Well awesome…glad I could help!