drake Simulator's real time rate API

I've been using a simple pattern, e.g.: https://github.com/RussTedrake/underactuated/blob/547b39b41/src/acrobot/balancing_lqr.py#L83 to run multiple simulations.

I only just now realized that the realtime rate was doing nothing after the first trial (since the initialization time wasn't being reset).

minimal request: add Simulator::ResetStatistics to the python bindings. I don't think it's there yet? potential request: seems like we should somehow realize/warn when this is failing silently. and perhaps even have the initialization time get reset more cleverly?

EDIT(eric): Updated link to use commit.

Feb 26 '19 16:02 RussTedrake

IIRC the Simulator API expectation is that you must call Simulator::Initialize again, between trials. If so, that seems like a fine place to reset the time (and indeed, it already does so).

Feb 26 '19 17:02 jwnimmer-tri

i see. this works:

    context.set_time(0.)
    simulator.Initialize()

but this (which I had tried) does not:

    simulator.Initialize()
    context.set_time(0.)

Feb 26 '19 17:02 RussTedrake

Yes, the manual Initialize should immediately precede the StepTo.

Feb 26 '19 17:02 jwnimmer-tri

I'm trying to figure out what would be the right fix for this, which I feel didn't work as the user (Russ) expected. The current documentation for Initialize() has this note:

Note If you make a change to the Context or to Simulator options between StepTo() calls you should consider whether to call Initialize() before resuming; StepTo() will not do that automatically for you. Whether to do so depends on whether you want the above initialization operations performed. For example, if you changed the time you will likely want the time-triggered events to be recalculated in case one is due at the new starting time.

It doesn't specifically mention the real time rate, but in any case it's not obvious to me that a user would know to go look in the Simulator::Initialize() documentation anyway. Is there a way to make the behavior better so it would just work? It would be possible to note whether the Context had been changed since last StepTo() and invoke Initialize() automatically but I'm not sure that's the right behavior in other scenarios (might be?). Thoughts?

/cc @edrumwri

Feb 26 '19 21:02 sherm1

@sherm1 and I discussed. Here are our thoughts:

Simulator::StepTo(.) should warn the user (message to stderr or an exception) when time has gone backwards, because Initialize(.) should almost always be called in such cases. We decided against silently doing this operation automatically because certain side effects (like publishing through initialization events) could get triggered.

We could fix the real-time rate a couple of ways. We could introduce a forgetting factor into the real-time rate (I outlined such an approach previously, and it worked well, but we had no real impetus to push it through the PR process). Or, the real-time rate limiter could detect when time had regressed and silently correct itself (much less dangerous here than in the case above).

Assuming that @RussTedrake and @jwnimmer-tri have no disagreements with the above, we can create separate issues for both of these and get fixes in place soon.

Feb 26 '19 21:02 edrumwri

I think data-dependent printing to the console is one of the least desirable solutions to any kind of problem, including this one. (Warn-once prints about deprecated codepaths are sad, but I can live with them.) You end up needing to rate-limit the message, so that the user doesn't overwhelm either their log tolerance or their CPU throughput, in case they hit the logging condition too often.

If we are able to recalibrate the realtime factor baseline when the user manually rewinds the context time themselves, how is that not the superior answer?

Edit: Re-reading, I guess the proposal was to do both? I'm not sure why we should print a warning though, if we've already fixed the realtime baseline?

Feb 26 '19 21:02 jwnimmer-tri

I'm not in favor of printing a message because I don't think we have a mechanism in Drake to guarantee that the user will see it. Do we have any precedent for that?

Feb 26 '19 21:02 sherm1

It's a separate issue that I noticed when @RussTedrake noticed that one sequence worked and one didn't.

I'd be just fine with throwing an exception here. I can't imagine a case where someone would want to reset the time and not Initialize(.), but they could catch the exception if they really, really need to do that.

Feb 26 '19 21:02 edrumwri

I'm not in favor of printing a message because I don't think we have a mechanism in Drake to guarantee that the user will see it. Do we have any precedent for that?

I added a at-most-once warning printout at

https://github.com/RobotLocomotion/drake/blob/6a8b847e3c2556156d38ee76d84ed5a3ea577759/manipulation/kuka_iiwa/iiwa_command_receiver.cc#L119-L121

because there was a public API with no C++ symbols, but I wanted to warn the user that it was deprecated. I'm not sure if we have any/many more examples of such printing. In a pinch I think it's better than nothing, but I agree we shouldn't count on the user seeing it.

Feb 26 '19 21:02 jwnimmer-tri

I'm not sure why we should print a warning though, if we've already fixed the realtime baseline?

It's a separate issue that I noticed when @RussTedrake noticed that one sequence worked and one didn't.

I'd be just fine with throwing an exception here. I can't imagine a case where someone would want to reset the time and not Initialize(.), but they could catch the exception if they really, really need to do that.

We can't rely on catching exceptions (since we're not allowed to do that in Drake itself anyway).

I think for that larger issue, I'll continue to maintain that Initialize should be burned to the ground, and thus will never need by-hand curation by a user again. I think I've been clear from the start that Initialize is a terrible idea. StepTo should just work, always. It has everything it needs to know. There is no reason the user has to provide it any hints.

Feb 26 '19 21:02 jwnimmer-tri

The user can catch the exception, no? StepTo(.) always lives in user-level code.

Feb 26 '19 22:02 edrumwri

Some users may also have a coding convention that they cannot catch exceptions, in which case "the user may catch the exception if they want to" is not an admissible compromise.

Anyway, it would also be weird to have StepTo both take the step and then throw. (Otherwise, how could the user make progress, even by calling StepTo + catch?)

Feb 26 '19 22:02 jwnimmer-tri

Fair enough. I don't know how we can warn in any reasonable manner that forces the user to be aware of the message.

Feb 26 '19 22:02 edrumwri

Just to check, is this still an issue with AdvanceTo?

Or, really, is there any action left? (Came across this during component triaging)

May 02 '20 18:05 EricCousineau-TRI

We should still pybind ResetStatistics() but likely calling Initialize() is sufficient in most cases. Probably the best fix would be real time monitoring that is limited to a recent time window (as @edrumwri suggested) rather than one going back to an arbitrary start time.

May 02 '20 20:05 sherm1

The ResetStatistics function was bound in pydrake as of #13343. This is no longer a Python-specific issue.

Nov 12 '21 00:11 jwnimmer-tri

Proposal: if PauseIfTooFast notices that the context time has moved backwards, then it will reset (i.e., re-calibrate) the initial_simtime_ and initial_realtime_ automatically. No need for warnings nor within-framework triggers.

Aug 01 '22 15:08 jwnimmer-tri

drake drake copied to clipboard

Simulator's real time rate API

drake
drake copied to clipboard