fixest
Saving fixest objects
Is there a way to save a fixest object as an .RDS file so that I can use it later with all the possible fitstat options?
Say I want to save this fixest object:
library(fixest)
library(haven)
df = read_dta("http://dss.princeton.edu/training/Panel101.dta")
outcome_vars = c("y", "y_bin")
treatment_vars = c("x1")
controls = c("country", "year")
mod = feols(
  .[outcome_vars] ~ .[treatment_vars] | .[controls],
  data = df,
  cluster = ~country,
  panel.id = ~country + year,
  fixef.rm = "none"
)
etable(list(mod$y, mod$y_bin), fitstat = ~ . + my)
# This reports the regression results as expected
saveRDS(mod, "test.RDS")
When I try to access it later, I can't use most fitstat options, e.g., my:
rm(list = ls()) # or new session
newmod = readRDS("test.RDS")
fixest::etable(list(newmod$y, newmod$y_bin), fitstat = ~ . + my)
# This gives the following error:
# Error in model.matrix.fixest(x, type = "lhs") :
# The argument 'data' must be a data.frame or a matrix.
Side Note: It seems to work as expected if I load the same dataset again in the new session, like this:
rm(list = ls()) # or new session
df = haven::read_dta("http://dss.princeton.edu/training/Panel101.dta")
newmod = readRDS("test.RDS")
fixest::etable(list(newmod$y, newmod$y_bin), fitstat = ~ . + my)
Thanks for your help! I'm a huge fan of the package.
Hi, I am not 100% sure what fixest is doing in detail, but I am fairly certain that you have basically answered your own question: fixest does not store all input objects in the model object. If every object of class fixest had to carry its input data set along, one could run out of memory pretty quickly. Instead, fixest stores information on the model call and its environment in object$call and object$call_env, and then fetches the input data from that environment whenever it is needed. This implies that loading the data before reading the .rds file should be safe, provided the data are in exactly the same shape as when the fixest model was estimated.
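As a rough illustration of the lookup described in this comment, the sketch below fits a small model on the built-in mtcars data (a stand-in, not the dataset from the original post) and inspects the stored call. The names dat and mod are hypothetical, and the call_env slot is taken from the description above rather than from documented API, so it may differ across fixest versions.

```r
library(fixest)

# Stand-in data; `dat` is a hypothetical name used only for this illustration
dat = mtcars
mod = feols(mpg ~ wt | cyl, data = dat)

# The model keeps the call, including the *name* of the data argument,
# not a copy of the data itself
data_name = deparse(mod$call$data)
print(data_name)

# The environment associated with the call (assumed slot; may be absent
# in some fixest versions, in which case this prints "NULL")
print(class(mod$call_env))
```

If the data are later looked up by that name, an object with the same name and shape has to exist when statistics that need the data are computed.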
Yes, I figured it was something like that.
Is there a way to have the fixest object include those auxiliary objects so it can be saved and reused?
Thanks so much!
I don't think there is inbuilt functionality, but I might be mistaken. One simple workaround would be to assign the fixest object and the associated data to a list, and to save that list as an .rds file?
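A minimal sketch of that workaround, assuming (as reported in the side note of the original post) that the reloaded model works once the data exist again under the original name. The mtcars data, the names dat, bundle and bundle_path, and the temporary file location are all illustrative choices, not part of the original example.

```r
library(fixest)

# Stand-in data and model; replace with your own data and formula
dat = mtcars
mod = feols(mpg ~ wt | cyl, data = dat)

# Save the model and its data together in a single list
bundle_path = file.path(tempdir(), "model_bundle.rds")
saveRDS(list(model = mod, data = dat), bundle_path)

# Later (e.g. in a new session): restore both pieces
bundle = readRDS(bundle_path)
newmod = bundle$model

# Re-create the data under the same name used in the original call,
# so that it can be found when a statistic needs it
dat = bundle$data

# Fit statistics that need the data, such as `my`, can then be requested
etable(newmod, fitstat = ~ . + my)
```

The cost is that the file holds a full copy of the data, so this is mostly practical for small data sets.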
Hi everyone: this is currently not possible. It is similar to #340. There are hacks to do it, but they are not straightforward (one such hack is described in #340).
There is work under way to solve this problem. It will be there for sure, but not before Jan/Feb, sorry!
hi! I got the same issue, is there a way I can take this issue and solve it in a PR before next monday?
Hi @lrberge
I was testing some options, one could be to pass the training data in the exported object, like this https://github.com/pachadotdev/eflm/blob/main/R/eglm.R#L202, and then compute fit statistics by calling fit$data as default. What do you think?
I started to mimic some glm() behaviour, but with the difference that the user needs to specify the option to put the training dataset in the returned object
https://github.com/pachadotdev/fixest2/commit/973e5eca3ae544ccef9032af5b02d04b3e3880b8
this is not yet ready, when it works well, I'll put the changes in a new branch and send a PR
Hi @pachadotdev, please don't :-) I've started to work this out a while ago and I made some major overhauls linked to this issue. There's no need for the PR. I kind of have a "research" semester starting in February so I'll finish this business at that time.
sure, I'll email you
Hi, with a huge delay, note that there's the data.save argument which, if TRUE, will lead to consistent results as in the initial post.
Note though that it creates a copy of the full original data set, so it's handy only for small data sets.
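A short sketch of how the data.save argument might be used, assuming a fixest version recent enough to provide it. The mtcars data, the names dat and model_path, and the temporary file location are illustrative only and are not taken from the original post.

```r
library(fixest)

# Stand-in data; replace with your own
dat = mtcars

# data.save = TRUE asks fixest to keep a copy of the data inside the object
mod = feols(mpg ~ wt | cyl, data = dat, data.save = TRUE)

model_path = file.path(tempdir(), "model_with_data.rds")
saveRDS(mod, model_path)

# Simulate a fresh session: the original data and model are gone
rm(dat, mod)

# The reloaded object should not need the original data to be in memory
newmod = readRDS(model_path)
my_stat = fitstat(newmod, ~ my)
print(my_stat)
```

Because the full data set is stored in the object, the saved file grows with the data, which is why this is suggested only for small data sets.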
dear @lrberge
sorry the delay, i had a big surgery and i'm typing with 1 hand
glad to see that some parts of my old pr are somehow reflected here, this is amazing
On the OP, there's a massive overhaul on how fitstats work enabling an effective save of an estimation at the smallest size. But it's still WIP.