pymer4
Strange issue related to variable names and lmer
I am experiencing an issue:
Traceback (most recent call last):
File "E:/Users/USER/PycharmProjects/PROJECT/sentiment_anal.py", line 297, in <module>
do_lmer(df_all)
File "E:/Users/USER/PycharmProjects/PROJECT/sentiment_anal.py", line 196, in do_lmer
summary = mod.fit(REML=True)
File "C:\Users\USER\Anaconda3\lib\site-packages\pymer4\models\Lmer.py", line 539, in fit
out_summary, out_rownames = estimates_func(self.model_obj)
File "C:\Users\USER\Anaconda3\lib\site-packages\rpy2\robjects\functions.py", line 198, in __call__
return (super(SignatureTranslatedFunction, self)
File "C:\Users\USER\Anaconda3\lib\site-packages\rpy2\robjects\functions.py", line 125, in __call__
res = super(Function, self).__call__(*new_args, **new_kwargs)
File "C:\Users\paulc\Anaconda3\lib\site-packages\rpy2\rinterface_lib\conversion.py", line 45, in _
cdata = function(*args, **kwargs)
File "C:\Users\paulc\Anaconda3\lib\site-packages\rpy2\rinterface.py", line 680, in __call__
raise embedded.RRuntimeError(_rinterface._geterrmessage())
**rpy2.rinterface_lib.embedded.RRuntimeError: Error in data.frame(..., check.names = FALSE) :
arguments imply differing number of rows: 2, 0**
The error is raised by this piece of code:
from pymer4.models import Lmer
formula = 'M_comp ~ 1 + entropy_sign + (1 + entropy_sign | subr)'
mod = Lmer(formula, data=df_vars)
summary = mod.fit(REML=True)
Strangely, the issue seems to come from the names of the dataframe's columns. I provide two pictures below that show what I mean. Renaming one of the columns from "entropy_sign" to "PE" makes the error disappear. I think I also roughly pinpointed the source of the error by looking at this piece of code in Lmer.py:
rstring = (
"""
function(model){
out.coef <- data.frame(unclass(summary(model))$coefficients)
out.ci <- data.frame(confint(model,method='"""
+ conf_int
+ """',nsim="""
+ str(n_boot)
+ """))
n <- c(rownames(out.ci))
idx <- max(grep('sig',n))
print('idx = ')
print(idx)
# There is some error here where when I analyze "entropy_sign" i need to add idx = 4 (it is 6 otherwise)
out.ci <- out.ci[-seq(1:idx),]
out <- cbind(out.coef,out.ci)
list(out,rownames(out))
}
"""
)
Here are the examples (screenshots not preserved):
Does not work: the column is named "entropy_sign".
Works: the column is renamed to "PE".
This issue isn't really hampering me anymore, since I can just change the variable names, but it may be of interest. If this is a genuine bug, I don't know enough R or rpy2 to fix it myself and contribute a fix.
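One possible explanation, sketched in Python rather than R: the pattern 'sig' in the grep call also matches the substring "sig" inside "entropy_sign", so the index of the last match lands on the final row and every row of the confidence-interval table is dropped. The row names below are an assumption about what confint returns for this model (lme4-style .sig01, .sig02, .sig03 and .sigma rows followed by the fixed effects); they are not taken from the report. The helper names are hypothetical, and the stricter pattern is only an illustration, not a confirmed fix.

```python
import re

# Assumed row names of the confint() output for this model. These are an
# assumption for illustration, not copied from the original report.
ci_rownames = [".sig01", ".sig02", ".sig03", ".sigma", "(Intercept)", "entropy_sign"]


def last_match_index(pattern, names):
    """Mimic R's max(grep(pattern, names)): 1-based index of the last match."""
    hits = [i for i, name in enumerate(names, start=1) if re.search(pattern, name)]
    return max(hits)


def drop_leading_rows(names, idx):
    """Mimic out.ci[-seq(1:idx), ]: drop the first idx rows."""
    return names[idx:]


# The loose pattern also matches "entropy_sign", so the last match is row 6.
loose_idx = last_match_index("sig", ci_rownames)
# Anchoring on the leading ".sig" only matches the variance-component rows.
strict_idx = last_match_index(r"^\.sig", ci_rownames)

print(loose_idx, drop_leading_rows(ci_rownames, loose_idx))
print(strict_idx, drop_leading_rows(ci_rownames, strict_idx))
```

Under these assumptions the loose pattern leaves zero rows to bind against the two coefficient rows, which would be consistent with the "differing number of rows: 2, 0" message, while the anchored pattern leaves the two fixed-effect rows. A column named "PE" contains no "sig", which would explain why renaming avoids the error.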
Hmm, bizarre. I haven't run into this before, and I don't seem to be able to reproduce it just by changing the names of the variables in the included sample data (i.e. adding an `_`). It seems to be happening on the R side of things too. @paulcbogdan can you share platform details, i.e. what OS, Python version, R version, pymer4 version, etc.?
If you installed via conda, then `conda list` should print package versions out.
Also, have you tried running the model directly in R? Or using the `confint` function from R directly? I'm trying to figure out if the issue is with the conversion using `rpy2` or within R itself.
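For the Python side of the details requested above, a small script along these lines could gather them in one go. This is only a sketch: the function name `collect_versions` and the default package list are hypothetical choices, and it does not report the R version, which would need to be checked from R itself.

```python
import platform
import sys
from importlib import metadata


def collect_versions(packages=("pymer4", "rpy2", "pandas")):
    """Return OS, Python, and installed package versions as a dict of strings."""
    info = {
        "os": platform.platform(),
        "python": sys.version.split()[0],
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report missing packages instead of raising.
            info[name] = "not installed"
    return info


for key, value in collect_versions().items():
    print(f"{key}: {value}")
```

The output can be pasted straight into a reply alongside the `conda list` output.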
Closing due to time lapse. Feel free to reopen if encountered again.