
Save log data on disk in addition to online logging

Open ArcaneEmergence opened this issue 3 years ago • 2 comments

Dear Comet ML Team,

Thank you for offering such a great product. I am using it daily for my research and recommended it to my colleagues.

I am working on a cluster, and the internet connection sometimes drops for whatever reason. In that case it would be great to have a backup of the data logged while disconnected, instead of losing it completely. When I try to work around this by running online and offline simultaneously, one run just ends the other instance.

Describe the solution you'd like

A flag that enables saving the data to the local disk in addition to logging it to Comet ML online, e.g. experiment = Experiment(save_to_dir="./comet"). By default, save_to_dir=None would save nothing to disk, as is currently the case. After the run has ended, it should be possible to upload the saved data to the same run online.
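As a rough illustration of the requested behavior (not part of the Comet SDK), here is a minimal sketch of a wrapper that writes every metric to disk before forwarding it online. The class name DiskMirrorLogger, the file name metrics.jsonl, and the replay method are hypothetical. The sketch assumes only that the wrapped object exposes log_metric(name, value, step=...) and that it raises an exception when an upload fails, which may not hold for the real SDK.

```python
import json
import time
from pathlib import Path


class DiskMirrorLogger:
    """Hypothetical wrapper: save each metric locally, then forward it online."""

    def __init__(self, remote, save_to_dir="./comet"):
        # `remote` is any object with log_metric(name, value, step=...).
        self.remote = remote
        self.path = Path(save_to_dir) / "metrics.jsonl"
        self.path.parent.mkdir(parents=True, exist_ok=True)

    def log_metric(self, name, value, step=None):
        record = {"name": name, "value": value, "step": step, "time": time.time()}
        # Write to disk first so the record survives a failed upload.
        with self.path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(record) + "\n")
        try:
            self.remote.log_metric(name, value, step=step)
        except Exception:
            # Assumption: a dropped connection surfaces as an exception.
            # Training continues; the record is already on disk.
            pass

    def replay(self, remote):
        """Re-send every saved record and return how many were sent."""
        count = 0
        with self.path.open(encoding="utf-8") as fh:
            for line in fh:
                record = json.loads(line)
                remote.log_metric(record["name"], record["value"], step=record["step"])
                count += 1
        return count
```

Note that replay re-sends everything in the file, including records that were already uploaded successfully, so a real implementation would need to track which records still need to be sent.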

Describe alternatives you've considered

Allow Experiment and OfflineExperiment to run simultaneously as a workaround.

PS: I am working mostly with PyTorch Lightning, so an integration with Lightning would be great, though I guess it's not in your hands.

ArcaneEmergence avatar Feb 23 '22 17:02 ArcaneEmergence

Thank you for the kind words @DeepOzean. Currently, we do not support running OfflineExperiment and Experiment simultaneously. Let me see if I can create a workaround for your use case. I will update you here if I make any progress.

Also, we do have an integration with PyTorch Lightning! Here is the documentation to help you get started: https://pytorch-lightning.readthedocs.io/en/latest/api/pytorch_lightning.loggers.comet.html

Please let me know if you have any more questions!

DN6 avatar Feb 24 '22 14:02 DN6

Hi Dhruv, is there an update on the workaround? Thanks again for your time!

ArcaneEmergence avatar Mar 03 '22 10:03 ArcaneEmergence

I believe allowing Experiment and OfflineExperiment to run simultaneously would be helpful, since I sometimes exceed the rate limit and some experiments can't be logged correctly.

logichen avatar Mar 11 '23 13:03 logichen

Comet now uses an offline fallback mechanism. For more information see: https://research.dev.comet.com/docs/v2/api-and-sdk/python-sdk/advanced/fallback-to-offline/

dsblank avatar Oct 13 '23 12:10 dsblank