mljet

Replace default pickle serializer with joblib

Open qnbhd opened this issue 3 years ago • 0 comments

We currently use pickle as the serializer for machine learning models.

Two alternatives are worth considering:

  • joblib - a serializer built on top of pickle and optimized for Python objects that hold large NumPy arrays, which makes it faster for big model dumps; it also supports compressed output.
  • dill - a serializer that extends pickle to cover more object types, including lambda functions.

Since we serialize machine learning models, which typically carry large numeric arrays, joblib is the most suitable choice.

Steps to accomplish this task:

  • [ ] Add joblib to the requirements of all backend templates.
  • [ ] Replace pickle calls in the backend server files.
  • [ ] Replace serialization in the project builder.
  • [ ] Update CLI commands.
  • [ ] Check for other pickle usages.
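
As a rough illustration of the "replace pickle calls" steps, the call sites could go through one small helper module instead of importing pickle directly. This is only a sketch: `dump_model` and `load_model` are hypothetical names, not existing mljet functions, and the pickle fallback is an assumption that keeps the snippet working in environments where joblib is not installed yet.

```python
import pickle
from pathlib import Path

try:
    import joblib  # third-party; would be added to the backend requirements
except ImportError:  # assumption: fall back to pickle if joblib is missing
    joblib = None


def dump_model(model, path, compress=3):
    """Write `model` to `path`, preferring joblib over pickle.

    `compress` is passed to joblib.dump (0 disables compression).
    """
    path = Path(path)
    if joblib is not None:
        joblib.dump(model, str(path), compress=compress)
    else:
        with path.open("wb") as fh:
            pickle.dump(model, fh)
    return path


def load_model(path):
    """Read a model previously written by `dump_model`."""
    path = Path(path)
    if joblib is not None:
        return joblib.load(str(path))
    with path.open("rb") as fh:
        return pickle.load(fh)
```

With a helper like this, the remaining `pickle.dump` / `pickle.load` calls found in the last step can be swapped one at a time without touching the serializer choice again.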

Note that we should still support importing existing model dumps produced with other serializers, especially through the CLI. A separate issue will be created for this.
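
One possible shape for importing dumps made with different serializers is a small registry keyed by serializer name, which a CLI option could select. The sketch below is an assumption about how this might look, not a decided design: `LOADERS` and `load_dump` are hypothetical names, and joblib and dill are imported lazily so that only the serializer actually requested has to be installed.

```python
import pickle


def _pickle_load(path):
    with open(path, "rb") as fh:
        return pickle.load(fh)


def _joblib_load(path):
    import joblib  # imported lazily; third-party

    return joblib.load(path)


def _dill_load(path):
    import dill  # imported lazily; third-party

    with open(path, "rb") as fh:
        return dill.load(fh)


# Hypothetical registry: serializer name -> function that loads a dump file.
LOADERS = {
    "pickle": _pickle_load,
    "joblib": _joblib_load,
    "dill": _dill_load,
}


def load_dump(path, serializer="joblib"):
    """Load a model dump using the serializer it was written with."""
    try:
        loader = LOADERS[serializer]
    except KeyError:
        raise ValueError(
            f"unknown serializer {serializer!r}; expected one of {sorted(LOADERS)}"
        ) from None
    return loader(path)
```

All three formats are pickle-based, so loading a dump can execute arbitrary code; whichever serializer is selected, only dumps from trusted sources should be imported.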

CC: @pacifikus

qnbhd, Nov 25 '22 10:11