cem Factor Treatments in Logits

Factor Treatments in Logits

Open lotitomaria opened this issue 2 years ago • 2 comments

I'm writing because my co-authors and I are using the cem package in R, attempting to fit a factor variable treatment with 5 categories to a dichotomous dependent variable. (Thank you, by the way, for this amazing resource!) The cem() function runs fine, but the att() function returns an error message that we wanted to ask you about. The message reads:

Error: variable 'LatentClass' was fitted with type "factor" but type "numeric" was supplied
In addition: Warning messages:
1: In eval(family$initialize) : non-integer #successes in a binomial glm!
2: In model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) :
  variable 'LatentClass' is not a factor

We've confirmed many times with str() that our treatment is indeed a factor. Then we found this post on GitHub claiming that this sort of model specification can't work (https://github.com/IQSS/cem/issues/2):

"When you create a cem object using a factor variable as a treatment, attempting to use att to run a logistic regression on that object fails. It looks like this is happening because of line 372 (using the GitHub formatting) of the att command, tmp.data[, obj$treatment] <- 0. Assigning 0 to the factor variable changes the variable to a numeric, and then the prd <- predict(out, tmp.data, type = "response") command on the following line fails because the treatment variable is the wrong type. To fix this, you might want to change the assignment on line 258 to assign the reference level of the factor variable if the treatment variable is a factor. Alternatively, you could just coerce everything to numeric, or throw a warning if you try to run att with a factor treatment."

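To make the mechanism described in that post concrete, here is a minimal base-R sketch. It does not use cem at all: the data frame `df`, the column `trt` (standing in for LatentClass), and the objects `bad` and `ok` are hypothetical names, and a plain glm() stands in for whatever att() fits internally. It only illustrates how overwriting a factor column with 0 changes its type and trips predict(), and how assigning the reference level as a factor avoids that.

```r
# Simulated stand-in data; 'trt' plays the role of LatentClass (assumption)
set.seed(1)
df <- data.frame(
  y   = rbinom(200, 1, 0.5),
  trt = factor(sample(c("1", "2", "3"), 200, replace = TRUE))
)
fit <- glm(y ~ trt, data = df, family = binomial)

# Overwriting the whole column with 0 turns the factor into a numeric
bad <- df
bad[, "trt"] <- 0
class(bad$trt)

# predict() then rejects the new data because the column type changed;
# the error message is captured instead of stopping the script
msg <- tryCatch(
  suppressWarnings(predict(fit, bad, type = "response")),
  error = function(e) conditionMessage(e)
)
msg

# Assigning the reference level as a factor keeps the type intact
ok <- df
ok$trt <- factor(rep(levels(df$trt)[1], nrow(ok)), levels = levels(df$trt))
prd <- predict(fit, ok, type = "response")
head(prd)
```

If this matches what att() does, the error we see would come from the type change rather than from anything wrong with our data.
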
Is it true that cem cannot handle factor variable treatments in a logistic model? If so, do you have a recommended course of action?

Stata also seems to struggle with factor treatments that have more than two categories: the cem command does not generate the weight variable (cem_weights). We've confirmed that when we transform the treatment variable to binary, we get the appropriate cem_weights and can run the analysis. Below is some pasted code in R and Stata:

In R:

str(matching.df)
#Coarsen fatalities
fat.grp <- list(c("0","1"), c("2", "3"), c("4"), c("5","6"))
#Coarsen Polyarchy
hist(matching.df$s_polyarchy)
polycut <- c(0 , .2, .45, .8, 1)

#matching.df$LatentClass = as.numeric(as.character(matching.df$LatentClass))
str(matching.df)
summary(matching.df)
mat <- cem(treatment = "LatentClass", data = matching.df, grouping = list(fatalities_range=fat.grp), cutpoints = list(s_polyarchy = polycut), eval.imbalance = TRUE, drop = "Enable", baseline.group = "3")
mat
results <- att(mat, Enable ~ LatentClass, data = matching.df, model = "logistic")

In Stata:

*Stata can't run this command with the multilevel treatment:
imbalance indiscrim fatalities_range camp_size ab_internat s_polyarchy, treatment(LatentClass)

*Stata doesn't generate weights with the multilevel treatment:
recode fatalities_range (0 1 = 1) (2 3 = 2) (4 = 3) (5 6 = 4), generate(fatalities)
cem indiscrim fatalities (#0) camp_size ab_internat s_polyarchy (0 , .2, .45, .8, 1), treatment(LatentClass)
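
For reference, the binary recoding mentioned above could be set up in R roughly as follows. This is only a sketch on simulated data: the data frame `toy`, the indicator column `treated`, and the list `pairwise` are hypothetical names, and only LatentClass, Enable, and the baseline level "3" come from our actual code. It builds one two-group data set per non-baseline class, each with a 0/1 treatment indicator.

```r
# Simulated stand-in for matching.df (assumption: 5 classes, binary outcome)
set.seed(42)
toy <- data.frame(
  LatentClass = factor(sample(1:5, 100, replace = TRUE), levels = 1:5),
  Enable      = rbinom(100, 1, 0.4)
)
baseline <- "3"

# One data set per non-baseline class: baseline rows get 0, that class gets 1
other_levels <- setdiff(levels(toy$LatentClass), baseline)
pairwise <- lapply(other_levels, function(lv) {
  sub <- toy[toy$LatentClass %in% c(baseline, lv), ]
  sub$treated <- as.integer(sub$LatentClass == lv)
  sub
})
names(pairwise) <- other_levels

# Group sizes per comparison (rows: treated = 0 / 1)
sapply(pairwise, function(d) table(factor(d$treated, levels = 0:1)))
```

Each element of `pairwise` has a binary `treated` column that could be used as the treatment in place of LatentClass. Note that matching each pair separately gives a different matched sample per comparison, which may or may not be acceptable for our analysis.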

Aug 30 '22 20:08 lotitomaria
