stan
stan copied to clipboard
Feature request: Add timestamps to sample progress
When running Stan, I like to follow the sampling output progress; for example:
...
Chain [1] Iteration: 500 / 1500 [ 33%] (Warmup)
Chain [1] Iteration: 501 / 1500 [ 33%] (Sampling)
...
It would nice if such sampling came with timestamps so that I could get a definitive sense for how long things are taking. Augmenting the above output, perhaps something like:
...
(Wed Sep 21 10:27:52 2011) Chain [1] Iteration: 500 / 1500 [ 33%] (Warmup)
(Wed Sep 21 12:18:45 2011) Chain [1] Iteration: 501 / 1500 [ 33%] (Sampling)
...
Does that seem reasonable? (I am new to the Stan development community, so if not I understand.)
Might the update come somewhere in here(?):
https://github.com/stan-dev/stan/blob/d4c6ae376bba1b754b7e7cdbe9269dce93958065/src/stan/services/util/generate_transitions.hpp#L58-L65
I like this idea, but we're very cautious about changing any of the output formats because things (like CmdStanPy's progress bars) are set up to parse the output and they might do so in very brittle ways.
Makes sense. Thanks for considering!
I'm reopening this issue. @mitzimorris: This is just changing things in the messages that get overwritten on top of each other, not the output format of CmdStan. Presumably nothing is parsing that transient output.
I have really liked the JAX output which gives you elapsed time and estimated time remaining. And it's super cool---it just need a simple annotation around a loop, here @scan_tqdm(num_draws) which is parameterized by the number of iterations in the loop.
@scan_tqdm(num_draws)
def reducer(theta, draw):
theta_star, accept = transition(keys[draw], logpdf, theta, d,
grid_size=grid_size, step_size=step_size)
return theta_star, (theta_star, accept)
_, draw_accepts = jax.lax.scan(reducer, theta, jnp.arange(num_draws))
Then when you run, it looks like this:
Running for 100,000 iterations: 5%|█ | 5000/100000 [00:16<05:04, 311.70it/s]
This says it's run 5000 of 100,000 iterations, which took 16 seconds at 311 iterations/second and is expected to run for another 5 minutes and 4 seconds.
I believe CmdStanPy does parse the transient output to create transient progress bars. Even so, this is possible as long as it's not the default behavior. You could add a new command line argument that enables the timestamps.
I don't think CmdStan has the kinds of transient/overwriting progress bars the way that CmdStanPy does. I tried that initially with CmdStan but couldn't manage to make it portable.
Can't we just change CmdStanPy or is someone or something parsing its progress bar output?
We can definitely change cmdstanpy if needed
I have a little code from another project to get lower resolution timestamps. Though imo the date format is nicer for a logger. I like how jax does the total time so far. I kind of worry about the estimate to completion since we can go very slow in the beginning and much faster later.
inline void local_time(const time_t* timer, struct tm* buf) noexcept {
#ifdef _WIN32
// Windows switches the order of the arguments?
localtime_s(buf, timer);
#else
localtime_r(timer, buf);
#endif
}
/* Get the current time with microseconds */
inline std::string time_mi() noexcept {
auto now = std::chrono::system_clock::now();
time_t epoch = std::chrono::system_clock::to_time_t(now);
struct tm tms{};
::riccati::local_time(&epoch, &tms);
auto fractional_seconds = now - std::chrono::system_clock::from_time_t(epoch);
int micros = std::chrono::duration_cast<std::chrono::microseconds>(fractional_seconds).count();
// Format the time string
char buf[sizeof "[9999-12-31 29:59:59.999999]"];
size_t nb = strftime(buf, sizeof(buf), "[%Y-%m-%d %H:%M:%S", &tms);
nb += snprintf(&buf[nb], sizeof(buf) - nb, ".%06d]", micros);
// Return the formatted string
return std::string(buf, nb);
}