bug: Bart speedup only 1.6x

Open sinking-point opened this issue 2 years ago • 0 comments

Description

I was immensely impressed by the 7.3x speedup demonstrated in the T5 tutorial. I could only reproduce 4.0x on my machine, but that is still pretty good.

However, applying the same approach to Bart only gives me a 1.6x speedup.

I don't think this is due to the difference in model size, as t5-base (which is larger than bart-base) is still sped up 3.8x.

Steps to reproduce

Use the T5 notebook, but replace:

't5-small' with 'facebook/bart-base'

optimize_model(model.encoder) with optimize_model(model.model.encoder)

optimize_model(model.decoder) with optimize_model(model.model.decoder)
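
For reference, the substitutions above amount to something like the following sketch. It only illustrates the attribute paths involved: `SUBMODULE_PATHS`, `resolve` and `optimize_submodules` are hypothetical names, and `optimize_fn` stands in for the notebook's `optimize_model`, passed in as a parameter so that no import path is assumed. The stand-in object mimics the attribute layout only; no real model is loaded.

```python
from functools import reduce
from types import SimpleNamespace

# Attribute paths to the encoder/decoder, taken from the steps above.
# (Hypothetical lookup table, for illustration only.)
SUBMODULE_PATHS = {
    "t5-small": ("encoder", "decoder"),
    "facebook/bart-base": ("model.encoder", "model.decoder"),
}


def resolve(obj, dotted_path):
    """Follow a dotted attribute path such as 'model.encoder'."""
    return reduce(getattr, dotted_path.split("."), obj)


def optimize_submodules(model, model_name, optimize_fn):
    """Call optimize_fn on the encoder and then the decoder of `model`."""
    for path in SUBMODULE_PATHS[model_name]:
        optimize_fn(resolve(model, path))


# Stand-in with the same nesting as the Bart model used in the steps above.
fake_bart = SimpleNamespace(
    model=SimpleNamespace(encoder="bart-encoder", decoder="bart-decoder")
)
seen = []
optimize_submodules(fake_bart, "facebook/bart-base", seen.append)
print(seen)
```

In the notebook itself, `optimize_fn` would be `optimize_model` and `model` would be the loaded 'facebook/bart-base' model.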

Expected Behavior

A speedup comparable to that seen with T5.

Actual Behavior

A comparatively small 1.6x speedup.
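
To make the comparison concrete, here is a minimal sketch of how a speedup figure like the ones above could be computed. It assumes that "speedup" means baseline latency divided by optimized latency. This is not the notebook's own benchmarking code, and `median_latency` and `speedup` are hypothetical helper names.

```python
import statistics
import time


def median_latency(fn, warmup=3, runs=10):
    """Median wall-clock seconds per call of fn(), measured after warm-up calls."""
    for _ in range(warmup):
        fn()
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def speedup(baseline_seconds, optimized_seconds):
    """Ratio of baseline latency to optimized latency (assumed definition)."""
    return baseline_seconds / optimized_seconds


# Toy workloads standing in for the baseline and optimized generation calls.
baseline = median_latency(lambda: sum(range(200_000)))
optimized = median_latency(lambda: sum(range(100_000)))
print(f"speedup: {speedup(baseline, optimized):.1f}x")
```

When timing GPU work, `fn` should wait for the device to finish before returning (for example by calling `torch.cuda.synchronize()`), otherwise only the kernel launch time is measured.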

Your environment

The Docker container from the README.

Self-service

[X] I would be willing to help fix this bug myself.

Code of Conduct

[X] I agree to follow this project's Code of Conduct

Jul 13 '23 10:07 sinking-point