stan icon indicating copy to clipboard operation
stan copied to clipboard

Don't require two Stan programs to use standalone GQs (add a flag to services to turn off GQ block)

Open jgabry opened this issue 5 years ago • 4 comments

Summary:

This came up when @rok-cesnovar and I were discussing the PR to add standalone generated quantities to CmdStanR:

Using standalone generated quantities requires the user to keep two versions of their stan program, one without GQs (for MCMC) and one with GQs (for standalone GQs). This seems both unnecessary and error-prone (having to make edits in two places).

I propose we add a flag in the services that would run the Stan program without generated quantities.

Current Version:

v2.23.0

jgabry avatar Jun 30 '20 18:06 jgabry

the problem is this: class stan/services/util/mcmc_writer.hpp has function write_sample_params - relevant part is here: https://github.com/stan-dev/stan/blob/53fd0144fdcb769e0eda44e968c3672af4f2dd90/src/stan/services/util/mcmc_writer.hpp#L97-L112

is always writes out parameters, transformed parameters, and generated quantities all in one go - note args hardcoded to true, true (line 111)

you would need to add 1 or 2 args to set of sampler service methods and insure that these are passed along to the generate_transitions method which calles write_array

relevant files are:

src/stan/util/run_sampler.hpp
src/stan/util/run_adaptive_sampler.hpp
src/stan/util/generate_transitions.hpp
src/stan/sample/fixed_param.hpp

(writing this is giving me serious deja vu)

mitzimorris avatar Nov 03 '20 18:11 mitzimorris

this use case would be a good one to use when working through proposals for better/more fine-grained I/O.

mitzimorris avatar Nov 03 '20 19:11 mitzimorris

Thanks @mitzimorris. So the idea would then be to run generate quantities but not write them out? Or am I misunderstanding (quite possible)? I was thinking that we could just ignore the whole block so they don't get computed at all.

jgabry avatar Nov 03 '20 22:11 jgabry

So the idea would then be to run generate quantities but not write them out?

the generated C++ code (i.e., stan.hpp) doesn't exactly line up with the named block structure of a Stan program, hence the confusion here - write_array writes parameter and transformed parameter values as well as running generated quantities block code and writing results.

you can pass in a flag which tells write_array to return instead of running generated quantities block.

let's make this concrete - here's the example bernoulli.stan code with addition of ppc checks in generated quantities block:

data { 
  int<lower=0> N; 
  int<lower=0,upper=1> y[N];
} 
parameters {
  real<lower=0,upper=1> theta;
} 
model {
  theta ~ beta(1,1);
  y ~ bernoulli(theta);
}
generated quantities {
  int y_rep[N];
  for (n in 1:N)
    y_rep[n] = bernoulli_rng(theta);
}

here's the generated write_array function which will:

a) transform theta back to constrained parameter space, add to vector vars__ - lines 24-31

b) (do the same transformed parameters - in the ex, nothing to do) - not executed is both boolean flags are false

c) compute generated quantities and write them out - lines 39-48 - not executed if 2nd boolean flag is false

     1	  template <typename RNG>
     2	  inline void write_array(RNG& base_rng__, std::vector<double>& params_r__,
     3	                          std::vector<int>& params_i__,
     4	                          std::vector<double>& vars__,
     5	                          bool emit_transformed_parameters__ = true,
     6	                          bool emit_generated_quantities__ = true,
     7	                          std::ostream* pstream__ = nullptr) const {
     8	    using local_scalar_t__ = double;
     9	    vars__.resize(0);
    10	    stan::io::reader<local_scalar_t__> in__(params_r__, params_i__);
    11	    static const char* function__ = "bernoulli_ppc_model_namespace::write_array";
    12	(void) function__;  // suppress unused var warning
    13	
    14	    (void) function__;  // suppress unused var warning
    15	
    16	    double lp__ = 0.0;
    17	    (void) lp__;  // dummy to suppress unused var warning
    18	    stan::math::accumulator<double> lp_accum__;
    19	    local_scalar_t__ DUMMY_VAR__(std::numeric_limits<double>::quiet_NaN());
    20	    (void) DUMMY_VAR__;  // suppress unused var warning
    21	
    22	    
    23	    try {
    24	      double theta;
    25	      theta = std::numeric_limits<double>::quiet_NaN();
    26	      
    27	      current_statement__ = 1;
    28	      theta = in__.scalar();
    29	      current_statement__ = 1;
    30	      theta = stan::math::lub_constrain(theta, 0, 1);
    31	      vars__.emplace_back(theta);
    32	      if (logical_negation((primitive_value(emit_transformed_parameters__) ||
    33	            primitive_value(emit_generated_quantities__)))) {
    34	        return ;
    35	      } 
    36	      if (logical_negation(emit_generated_quantities__)) {
    37	        return ;
    38	      } 
    39	      std::vector<int> y_rep;
    40	      y_rep = std::vector<int>(N, std::numeric_limits<int>::min());
    41	      
    42	      current_statement__ = 4;
    43	      for (int n = 1; n <= N; ++n) {
    44	        current_statement__ = 3;
    45	        assign(y_rep, cons_list(index_uni(n), nil_index_list()),
    46	          bernoulli_rng(theta, base_rng__), "assigning variable y_rep");}
    47	      for (int sym1__ = 1; sym1__ <= N; ++sym1__) {
    48	        vars__.emplace_back(y_rep[(sym1__ - 1)]);}
    49	    } catch (const std::exception& e) {
    50	      stan::lang::rethrow_located(e, locations_array__[current_statement__]);
    51	      // Next line prevents compiler griping about no return
    52	      throw std::runtime_error("*** IF YOU SEE THIS, PLEASE REPORT A BUG ***"); 
    53	    }
    54	    } // write_array() 

mitzimorris avatar Nov 04 '20 16:11 mitzimorris