
MLEM-loaded model performs consistently worse

rocco-fortuna opened this issue 2 years ago · 4 comments

I have a PyTorch text classification model whose architecture I cannot disclose. When the model is loaded with its native library, it consistently performs slightly better than the same model saved and then reloaded with MLEM. As discussed on Discord with @aguschin:

It's a PyTorch sequence classification model. I ran the eval four times each on:

  1. the original model
  2. the mlem_model saved and loaded with:

```python
# load the model with the PyTorch model class
model = MyModel.from_pretrained('./model_path')

# save with MLEM
from mlem.api import save
save(model, "./checkpoints/v070_mlem")

# load with MLEM
from mlem.api import load
mlem_model = load("./checkpoints/v070_mlem")
```

I ran the eval 4 times each on 5k samples, getting the following accuracies:

  1. original:
  • 0.7868
  • 0.7874
  • 0.7844
  • 0.7864
  2. mlem_model:
  • 0.7778
  • 0.7830
  • 0.7808
  • 0.7816

So almost the same, but consistently lower by about 0.6% on average.
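Since the same 5k samples give slightly different accuracies across runs, evaluation noise and a genuine save/load difference need to be separated. One way is to compare the weights of the original and reloaded models directly. A minimal sketch with a toy model standing in for the undisclosed classifier (on the MLEM side one would load with `mlem.api.load` instead of `torch.load`):

```python
import torch
import torch.nn as nn

# toy stand-in for the undisclosed classifier architecture
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))

# round-trip through torch.save / torch.load
torch.save(model, "roundtrip.pt")
# weights_only=False is needed on newer PyTorch, where it defaults to True
reloaded = torch.load("roundtrip.pt", weights_only=False)

# bitwise comparison of every weight tensor: if any differ, the
# save/load path (not evaluation noise) changed the model
for name, tensor in model.state_dict().items():
    assert torch.equal(tensor, reloaded.state_dict()[name]), name
print("state_dicts identical")
```

If all tensors match bitwise but accuracy still differs, the discrepancy lies in the evaluation pipeline rather than in serialization.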

rocco-fortuna · Mar 13 '23 13:03

@aguschin mentioned:

I think we can [try] one of the PyTorch examples to see if this can be reproduced there. If that doesn't help us, we can try to dig deeper into the specifics.

Let me know if you need any additional info.

rocco-fortuna · Mar 13 '23 13:03

@mike0sv do you have any ideas why this could be the case?

aguschin · Mar 14 '23 08:03

Under the hood, MLEM saves and loads models with torch.save and torch.load (or torch.jit.save and torch.jit.load). We do nothing else with the model. Can you confirm that this logic is to blame by running something like this:

```python
# load the model with the PyTorch model class
model = MyModel.from_pretrained('./model_path')

# save with plain torch
torch.save(model, "...")

# load with plain torch
model = torch.load("...")
```

and running the evaluation? If isinstance(model, torch.jit.ScriptModule) is true for your model, use torch.jit.save and torch.jit.load instead.
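The suggested round-trip check, sketched with a toy model (the real classifier and checkpoint paths are placeholders). Calling model.eval() first rules out dropout as a source of run-to-run variance:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# toy model in place of the real one
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
model.eval()  # make inference deterministic (disables dropout etc.)

x = torch.randn(4, 8)
torch.save(model, "check.pt")
# weights_only=False is required on newer PyTorch for full-model pickles
reloaded = torch.load("check.pt", weights_only=False)
reloaded.eval()

# the reloaded model should reproduce the original outputs exactly
with torch.no_grad():
    same = torch.allclose(model(x), reloaded(x))
print("outputs match:", same)
```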

mike0sv · Mar 14 '23 13:03

That yielded:

  • 0.7832
  • 0.7862
  • 0.7888
  • 0.7892

Consistent with the original model's performance.

rocco-fortuna · Mar 16 '23 20:03