Hello all,

Please see attached issue from the brms tracker. https://github.com/paul-buerkner/brms/issues/1473

The TLDR:

When using brms's variational method, brms gathers the available variables from the metadata field of read_cmdstan_csv
Among these, is lp_approx__
Brms then calls csfit <- cmdstanr::read_cmdstan_csv(files = files, variables = variables, sampler_diagnostics = sampler_diagnostics, format = NULL) within its function: read_csv_as_stanfit
An error occurs: Error: Can't find the following variable(s) in the output: lp_approx__

I have not dug into the underlying cause, but I suspect the actual csv contains something like lp_p__ and lp_q__, and not lp_approx__, and an error is throw when lp_approx__ does not yet exist (it has not been renamed from lp_q__) ?

Versions

brms_2.19.0 cmdstanr_0.5.3 Ubuntu 20.04 R 4.2.2

Mar 29 '23 23:03 stephensrmmartin

Ah ha, yes this seems to be the issue.

Browse[2]> 
debug: res <- matching_variables(variables, repair_variable_names(metadata$variables))
Browse[2]> 
debug: if (length(res$not_found)) {
    stop("Can't find the following variable(s) in the output: ", 
        paste(res$not_found, collapse = ", "), call. = FALSE)
}

In this case, variables contains:

Browse[2]> variables
 [1] "lp__"               "lp_approx__"        "b_logalpha"        
 [4] "b_personEffect"     "sd_1"               "sd_2"              
 [7] "sd_3"               "r_1_personEffect_1" "r_2_personEffect_1"
[10] "r_2_personEffect_2" "r_3_difficulty_1"   "lprior"            
[13] "cor_2"

Within metadata$variables:

Browse[2]> head(metadata$variables)
[1] "lp__"             "log_p__"          "log_g__"          "b_logalpha.1"    
[5] "b_personEffect.1" "sd_1.1"

Which causes the not_found to trip:

Browse[2]> res$not_found
[1] "lp_approx__"

Mar 29 '23 23:03 stephensrmmartin

Ok, confirmed this I believe.

On line 211 of https://github.com/stan-dev/cmdstanr/blob/master/R/csv.R#L211

it checks for the existence of the provided variables within the metadata$variables of the read csv file. At this stage, lp_approx__ does not exist, it is lp_g__. Therefore it fails.

Later in the file, lp_g__ is renamed to lp_approx__. https://github.com/stan-dev/cmdstanr/blob/master/R/csv.R#L306

This leads to an inconsistency: If you use read_cmdstan_csv, and save the variables into a vector, it will include lp_approx__. If you feed that same vector into read_cmdstan_csv, it will fail, because it does not see lp_approx__ in the metadata prior to the renaming.

I would suggest either adding an exception into the metadata check for possible lp variables, or moving the metadata renaming up prior to the variable check.

Mar 30 '23 00:03 stephensrmmartin

I just created a quick fix on a fork: https://github.com/stephensrmmartin/cmdstanr/commit/9e0cda554df3a7fc6d0433df2e2571ae1c42bb5c

I imagine there's a better way of handling that, but I didn't want to overengineer a quick fix for my personal usecase. I imagine a better approach could be:

Adding an optional "ignore_variables" or "pass_variables" argument to the utils::matching_variables function, which would be a more generic way of ignoring mismatches for anything in that list.
Moving the renaming logic before the check.

Edit: Note something else is broken; this fix is incomplete. I'll hack on it and get back once I have a functioning patch.

Edit2: I tried another approach (https://github.com/stephensrmmartin/cmdstanr/blob/hotfix/lp_approx__not_found_alt1/R/csv.R#L200), which avoids the same error.

However, either approach to fixing the lp_approx__ not found leads to downstream errors in brms. I suspect the downstream bug is due to how I am resolving it, but I don't know a better way to resolve it. The downstream bug winds up being because there are (P) variables in the samples, but (P+2) variables in the names of variables; consequently, when brms re-orders the samples data.frame, there's an indexing error (undefined columns selected). The two additional variables are lp__ and lp_approx__.

To me, it seems like there is a desync somewhere between what is a variable and what is a column name. Alternatively put, there's an inconsistency in listed variables (being non-diagnostic, but tracked columns in samples) and in model variables (things the user explicitly list as a variable to be tracked), such that one source says there are P variables and another says there are P+2 variables, and that perhaps this has changed at some point in the last year.

I have no idea where or how that would occur, or whether this is a brms or a cmdstanr bug, but it seems to break VI in brms.

Mar 30 '23 00:03 stephensrmmartin

Update, this appears to be a bug introduced in brms at commit: https://github.com/paul-buerkner/brms/commit/34203cc9bd34f7171a20671ad963440f91765cd4

VI in Brms 2.17 works with the latest cmdstanr and cmdstan; I bisected it to brms 2.19, and that commit is where VI breaks.

However, I suspect this could be considered a bug of either cmdstanr or brms, since brms is simply trying to grab the variable names from cmdstanr.

Mar 31 '23 23:03 stephensrmmartin

cmdstanr cmdstanr copied to clipboard

Error: Can't find the following variable(s) in the output: lp_approx__

Versions

cmdstanr
cmdstanr copied to clipboard