DeclareDesign

Measurement model for the dependent variable


#1

Are there any examples of how to create a measurement model for the dependent variable?
Specifically, I’m measuring a latent variable using, say, 10 Likert items. These are supposed to be predicted in a multilevel model. I know I can add noise to the relationship, but how can I, e.g., turn the latent variable into a Likert item?

Is something like this the intended way? It seems to work, but it seems to mix two steps.

declare_estimator(
    draw_likert(Y) ~ midcycle,
    estimand = estimands_regression,
    model = lm,
    term = TRUE
  )

#2

Generally, I would shim a step ahead of the estimator to discretize the latent variable, e.g.:

... +
  declare_population(Y_likert = draw_likert(Y)) +
  declare_estimator(
    Y_likert ~ midcycle,
    estimand = estimands_regression,
    model = lm,
    term = TRUE
  )

Can you post your full design?


#3

Edit: Sorry, just saw the response to my other question. Ok, I can declare everything in declare_population, but then I sort of lose out on many of the benefits of DeclareDesign, don’t I? That’s what I was doing before.
At least, it seems everything gets more jumbled.

Old

But I’ve specified Y in declare_potential_outcomes. declare_population comes after, right?

Simplified full design

estimands_regression <- declare_estimand(
    `midcycle` = mean(Y_midcycle_1 - Y_midcycle_0),
    term = TRUE,
    label = "Regression_Estimands"
  )
design <-
  # simulate data
  declare_population(
    obs = add_level(N = 100,
                     midcycle = draw_binary(N, prob = 0.2),
                     noise = rnorm(N)
    )
  ) +
  # simulate real relationship
  declare_potential_outcomes(Y ~ 0.5 * midcycle + noise) +

  declare_assignment(m = 50) +
    # simulate how we estimate relationship
  declare_estimator(
    draw_likert(Y) ~ midcycle,
    estimand = estimands_regression,
    model = lm,
    term = TRUE
  )

design

#4

Here is a fixed design, with some comments on the above:

design <-
  # simulate data
  declare_population(
    obs = add_level(N = 100,
                    noise = rnorm(N)
    )
  ) +
  # simulate real relationship
  declare_potential_outcomes(Y ~ draw_ordered(0.5 * midcycle + noise, breaks = -3:3), assignment_variables = "midcycle") +
  estimands_regression +
  declare_assignment(prob = .2, assignment_variable = "midcycle") +
  # simulate how we estimate relationship
  # Technically this can be created automatically
  #declare_reveal(outcome_variables = "Y", assignment_variables = c("midcycle")) +
  declare_estimator(
    Y ~ midcycle,
    estimand = estimands_regression,
    model = lm,
    term = TRUE
  )

In your design, the potential outcomes (PO) and assignment steps both used the default assignment variable Z, and Z did not appear in the PO formula. Those two steps were therefore merely creating the unused variables Y_Z_1, Y_Z_0, and Z; they weren’t doing anything helpful anyway.

In my edits, I moved midcycle to the assignment step, and things seem to work. I also added draw_ordered to the PO step (draw_likert didn’t work for me because it created text?), which seems like a more natural place for it, and I explicitly set the assignment variable on both steps to midcycle.
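
For anyone who wants to see what the discretization step amounts to, here is a minimal base-R sketch of the same idea, outside DeclareDesign. It is not the fabricatr implementation of draw_ordered; the seed, the variable names (latent, cuts, y_ordered), and the padding of the cut points with -Inf/Inf are my own assumptions for illustration. It cuts the latent outcome at -3:3 and returns integer codes, which lm can use directly.

```r
# Sketch only: a base-R stand-in for the discretization step,
# not the fabricatr implementation of draw_ordered().
set.seed(42)                      # arbitrary seed, for reproducibility
n <- 100

midcycle <- rbinom(n, 1, 0.2)     # binary predictor, as in the thread
noise    <- rnorm(n)
latent   <- 0.5 * midcycle + noise  # latent outcome on a continuous scale

# Cut points from the thread (-3:3), padded with -Inf/Inf (an assumption)
# so that every latent value falls into exactly one bin.
cuts <- c(-Inf, -3:3, Inf)

# findInterval() returns the index of the bin each value falls in,
# giving integer codes 1..8 rather than text labels.
y_ordered <- findInterval(latent, cuts)

# The discretized outcome is numeric, so it can go straight into lm().
fit <- lm(y_ordered ~ midcycle)
coef(fit)[["midcycle"]]
```

Because the codes are numeric, the estimator formula can stay as Y ~ midcycle; a discretizer that returns labels as text would need converting to numbers first.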