
handle failed MongoObserver

Open Mastercorp opened this issue 5 years ago • 7 comments

I am having some problems with the MongoObserver. While running my program, I lost the internet connection for some time. As a result, MongoObserver failed at some point during the run (a 2-3 hour run). Furthermore, if no internet connection is present at the beginning, the program does not even start.

It would be nice if a local copy of the information that should be sent to MongoDB were saved when MongoObserver fails to connect and/or no internet connection is present. Additionally, a MongoObserver.push("filedump") method could take the saved dump and push it to the server when called.
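To make the request concrete, here is a minimal sketch of the "dump locally, push later" idea using only the standard library. It is not Sacred code: the `FallbackBuffer` class, its `record` and `push` methods, and the `send` callable are hypothetical names chosen for illustration, and the sketch assumes a connection failure surfaces as `ConnectionError`.

```python
import json
import tempfile
from pathlib import Path


class FallbackBuffer:
    """Hypothetical helper (not part of Sacred): spill events to a local file on failure."""

    def __init__(self, send, dump_path):
        self.send = send  # assumed: a callable that raises ConnectionError when offline
        self.dump_path = Path(dump_path)

    def record(self, event):
        """Try to send the event; on a connection error, append it to the local dump."""
        try:
            self.send(event)
        except ConnectionError:
            with self.dump_path.open("a") as f:
                f.write(json.dumps(event) + "\n")

    def push(self):
        """Replay the saved dump in order; keep whatever still cannot be sent."""
        if not self.dump_path.exists():
            return 0
        lines = self.dump_path.read_text().splitlines()
        events = [json.loads(line) for line in lines if line]
        remaining = []
        sent = 0
        for i, event in enumerate(events):
            try:
                self.send(event)
                sent += 1
            except ConnectionError:
                remaining = events[i:]  # still offline: keep this event and the rest
                break
        self.dump_path.write_text("".join(json.dumps(e) + "\n" for e in remaining))
        return sent
```

Writing one JSON object per line keeps the dump append-only while the run is in progress, so a crash in the middle of a run loses at most the event being written.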

Mastercorp avatar Mar 02 '19 20:03 Mastercorp

I've had days-long runs drop out on me (db on same host as experiment!) and get truncated to only the first few hours.

yet-another-account avatar Mar 08 '19 05:03 yet-another-account

I think this is related to #317.

vnmabus avatar Mar 08 '19 09:03 vnmabus

#471 provides a queue-based Mongo observer that should work around connectivity issues. It is probably not yet perfect, but I would be happy for anyone to try it out and give feedback. Usage should be as simple as

from sacred.observers.mongo import QueuedMongoObserver
...
ex.observers.append(
    QueuedMongoObserver.create(**usual_mongo_options)
)
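For readers unfamiliar with the pattern, the following is a rough standard-library sketch of how a queue with retries can tolerate dropped connections. It is not the implementation from #471: `RetryingQueue`, `deliver`, and `retry_interval` are hypothetical names, and the sketch assumes a failed delivery raises `ConnectionError`.

```python
import threading
from collections import deque


class RetryingQueue:
    """Hypothetical sketch: buffer events in memory and retry delivery in the background."""

    def __init__(self, deliver, retry_interval=10.0):
        self._deliver = deliver  # assumed: raises ConnectionError when the database is unreachable
        self._pending = deque()
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._retry_interval = retry_interval
        self._worker = None

    def put(self, event):
        """Called by the experiment; returns immediately without touching the network."""
        with self._lock:
            self._pending.append(event)

    def flush(self):
        """Deliver pending events in order; stop at the first failure.

        Returns True if the queue was emptied, False if an event is still waiting.
        Intended to be called from a single consumer at a time.
        """
        while True:
            with self._lock:
                if not self._pending:
                    return True
                event = self._pending[0]
            try:
                self._deliver(event)  # done outside the lock so put() is never blocked
            except ConnectionError:
                return False
            with self._lock:
                self._pending.popleft()

    def start(self):
        """Start a background thread that calls flush() every retry_interval seconds."""
        def loop():
            while not self._stop.wait(self._retry_interval):
                self.flush()

        self._worker = threading.Thread(target=loop, daemon=True)
        self._worker.start()

    def stop(self):
        """Stop the background thread and make one final delivery attempt."""
        self._stop.set()
        if self._worker is not None:
            self._worker.join()
        return self.flush()
```

Because events leave the queue only after a successful delivery, a temporary outage delays them rather than dropping them, at the cost of holding them in memory until the connection returns.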

JarnoRFB avatar Jun 03 '19 09:06 JarnoRFB

@JarnoRFB Is it possible to add this feature to all observers by default? If it's able to recover from a connection interrupt, this would be a good feature to have.

Guptajakala avatar Jan 15 '20 15:01 Guptajakala

It is in fact implemented in a way that can be used with all observers. Refer to this section of the documentation for more information: https://sacred.readthedocs.io/en/stable/observers.html#queue-observer

JarnoRFB avatar Jan 15 '20 20:01 JarnoRFB

@JarnoRFB Every time I use the queue observer, it doesn't send the metrics properly. In Omniboard, I see the status become dead. Is there anything I'm doing wrong?

Guptajakala avatar Feb 10 '20 14:02 Guptajakala

@JarnoRFB Never mind, I've figured out my problem. It now works great!

Guptajakala avatar Feb 10 '20 15:02 Guptajakala