quaterion

Properly restore encoders defined and saved in notebooks

Open monatis opened this issue 3 years ago • 7 comments

Needs attention and discussion. It's particularly important when users work in environments such as Colab.

monatis, Feb 24 '22 09:02

Instead of just saving the names of the module and the class to import them, what about using dill to pickle the class directly?
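To illustrate why storing only the module and class names breaks for notebook-defined classes, here is a minimal sketch using only the standard library. It is not Quaterion's actual save/restore code. `describe`, `restore`, `NotebookEncoder`, and the module name `colab_session` are hypothetical names made up for this example.

```python
import importlib
from json import JSONEncoder


def describe(cls):
    """Record only where a class lives, not its code."""
    return {"module": cls.__module__, "class": cls.__qualname__}


def restore(ref):
    """Re-import the class from the recorded module and class names."""
    module = importlib.import_module(ref["module"])
    return getattr(module, ref["class"])


# A class that lives in an installed module restores fine.
assert restore(describe(JSONEncoder)) is JSONEncoder

# Simulate a class defined in a notebook cell: its module is not a file
# that another process can import. "colab_session" is a made-up name.
NotebookEncoder = type("NotebookEncoder", (), {"__module__": "colab_session"})
ref = describe(NotebookEncoder)

try:
    restore(ref)
    restored = True
except ModuleNotFoundError:
    # Outside the notebook there is no such module to import from.
    restored = False
```

The name-based reference restores only classes whose module can be imported where the model is loaded, which is the gap that pickling the class itself would try to close.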

WDYT? @generall and @joein

monatis, Mar 11 '22 03:03

Just to clarify the behavior we want to achieve:

  • Define all classes, encoders, and trainable models in Colab
  • Use save_servable after training
  • Be able to restore from the same Colab notebook afterwards

Is that correct?

generall, Mar 11 '22 08:03

Not from the same notebook, actually. My initial idea was to define encoders in a notebook, e.g., on Colab, train models, save the servable, and then use it elsewhere outside the notebook. But it does not seem to be that easy either way.

monatis, Mar 11 '22 09:03

Another idea might be to give users a simple utility that creates boilerplate, e.g., quaterion new project-name could generate a basic template with dependencies defined, encoders.py, training.py, inference.py, notebook.ipynb, etc. This may help users structure their projects correctly and run experiments and inference quickly.

This may sound like overkill, but documenting the correct project structure, emphasizing its importance, and answering questions about related problems in the future may be much more difficult.
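A rough sketch of what such a scaffolding utility could do, using only the standard library. No `quaterion new` command is confirmed to exist. `new_project`, `TEMPLATE_FILES`, and the placeholder file contents are assumptions for illustration, with file names taken from the list above.

```python
import tempfile
from pathlib import Path

# Hypothetical template: file names come from the proposal, contents are placeholders.
TEMPLATE_FILES = {
    "requirements.txt": "quaterion\n",
    "encoders.py": "# Define encoder classes here so they stay importable.\n",
    "training.py": "# Training entry point.\n",
    "inference.py": "# Load the saved servable model and run inference.\n",
    "notebook.ipynb": '{"cells": [], "metadata": {}, "nbformat": 4, "nbformat_minor": 5}\n',
}


def new_project(name, root="."):
    """Create a project directory with placeholder files and return its path."""
    project = Path(root) / name
    # Refuse to overwrite an existing project directory.
    project.mkdir(parents=True, exist_ok=False)
    for filename, content in TEMPLATE_FILES.items():
        (project / filename).write_text(content, encoding="utf-8")
    return project


# Usage: scaffold into a temporary directory and list what was created.
with tempfile.TemporaryDirectory() as tmp:
    project_dir = new_project("project-name", root=tmp)
    created = sorted(p.name for p in project_dir.iterdir())
```

Keeping encoders in `encoders.py` rather than in notebook cells means their classes belong to an importable module, so the notebook can import them like any other script.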

monatis, Mar 11 '22 09:03

A cookie-cutter template is a good idea, actually. I like it.

generall, Mar 11 '22 10:03

It would also be a good competitive advantage over similar projects, and easily reproducible projects may help accelerate adoption.

Raising a separate issue for this.

monatis, Mar 11 '22 10:03

I guess we can do something with cell magics for this issue.

Other alternatives, such as class serialization, are neither reliable nor safe.
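One way the cell-magic idea might work, sketched here with only the standard library: write the cell's source to a real .py file and import it from there, so the class belongs to an importable module instead of the notebook session. `cell_to_module`, `my_encoders`, and `MyEncoder` are hypothetical names. In a notebook, a function like this could presumably be wrapped with IPython's `register_cell_magic`, but that wiring is left out as an assumption.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path


def cell_to_module(module_name, cell_source, directory):
    """Write cell source to <directory>/<module_name>.py and import it."""
    path = Path(directory) / f"{module_name}.py"
    path.write_text(cell_source, encoding="utf-8")
    spec = importlib.util.spec_from_file_location(module_name, path)
    module = importlib.util.module_from_spec(spec)
    # Register the module so later imports by name resolve to it.
    sys.modules[module_name] = module
    spec.loader.exec_module(module)
    return module


# Source text standing in for the body of a notebook cell.
cell = "class MyEncoder:\n    dim = 8\n"

with tempfile.TemporaryDirectory() as tmp:
    mod = cell_to_module("my_encoders", cell, tmp)
    # The class now reports a real module name instead of "__main__".
    encoder_module = mod.MyEncoder.__module__
    file_written = (Path(tmp) / "my_encoders.py").exists()
```

The written file can then be shipped alongside the saved model, so restoring by module and class name works outside the notebook without pickling any code.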

monatis, Jun 08 '22 17:06