Live information not updated if the observer failed mid-run

Open vnmabus opened this issue 6 years ago • 2 comments

I have noticed that if an observer fails mid-run (for example, if the Mongo database becomes temporarily unavailable), the heartbeats stop. If the problem is resolved by the time the experiment finishes, the result is saved and the run is marked as completed, but the live information that is updated as part of the heartbeat event never receives a final update, so it is incomplete.

Ideally, I would expect Sacred to retry the heartbeat event after some time has passed, so that if the problem was transient the live information continues updating. If that is not desirable for whatever reason, I would at least expect Sacred to update this information as part of the final save procedure.
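As a rough sketch of both behaviours (retrying after a delay, and a final flush on completion), a wrapper around an observer could look something like the following. This is not Sacred's code: `ReplayingObserver`, `retry_after` and `clock` are hypothetical names made up for illustration, and the only assumption about the wrapped object is that it exposes `heartbeat_event` and `completed_event` methods, whose arguments are passed through untouched.

```python
import time


class ReplayingObserver:
    """Hypothetical wrapper (not part of Sacred) around an observer-like object.

    It remembers the most recent heartbeat, backs off after a failure,
    and replays any undelivered heartbeat before the final save.
    """

    def __init__(self, wrapped, retry_after=60.0, clock=time.monotonic):
        self.wrapped = wrapped
        self.retry_after = retry_after  # seconds to wait after a failure
        self.clock = clock              # injectable so the delay can be tested
        self.last_beat = None           # arguments of the latest heartbeat
        self.pending = False            # True while last_beat is undelivered
        self.paused_until = 0.0

    def heartbeat_event(self, *args, **kwargs):
        # Always keep the newest live information, even while backing off.
        self.last_beat = (args, kwargs)
        self.pending = True
        if self.clock() < self.paused_until:
            return
        self._deliver()

    def _deliver(self):
        args, kwargs = self.last_beat
        try:
            self.wrapped.heartbeat_event(*args, **kwargs)
            self.pending = False
        except Exception:
            # Swallow the error so the caller keeps sending heartbeats,
            # and try again once retry_after seconds have passed.
            self.paused_until = self.clock() + self.retry_after

    def completed_event(self, *args, **kwargs):
        # Final flush: push the last live information before completing.
        if self.pending:
            self._deliver()
        self.wrapped.completed_event(*args, **kwargs)
```

Because the wrapper swallows the exception, whatever calls `heartbeat_event` never sees the failure, which is a design choice that hides errors and would probably need logging in practice. If the final flush itself fails, the sketch still forwards `completed_event`, so a backend that is still down at the end behaves as it does today.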

vnmabus, Jun 13 '18 07:06

Good point, thank you for bringing it up! We should definitely make sure the complete information is stored on final save. Resuming the heartbeat after some time also sounds like a good idea, though that will require a bit more thought, especially because of error handling, which should be improved anyway (see #314).

Qwlouse, Jul 01 '18 18:07

I'm also experiencing this issue. Has there been any progress on it?

Gonzalo933, Feb 25 '19 09:02