[bloom] Add alibi cache for fixed inputs
Feature request
Add alibi cache for BLOOM
Motivation
Many training scenarios use a fixed input length; in that case, re-creating the alibi tensor, which is deterministic and identical on every forward pass, is wasted work. A similar scenario arises when inputs are grouped into bins and padded to each bin's fixed length.
I propose we add a small cache, even a single-entry one, to speed up these scenarios.
A 1-value cache would store the tensor on the BloomModel object and rebuild it only when the tensor's length needs to change.
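A minimal sketch of the idea, using hypothetical names (the real model builds the tensor with transformers' `build_alibi_tensor` helper and torch tensors; plain lists stand in here to keep the sketch self-contained). The cache keeps a single entry and rebuilds only when the (num_heads, seq_len) key changes:

```python
class AlibiCache:
    """1-value cache: rebuild the alibi bias only when its shape changes.

    Hypothetical sketch; in BloomModel the cached value would be the
    torch tensor produced by build_alibi_tensor.
    """

    def __init__(self):
        self._key = None
        self._value = None
        self.builds = 0  # instrumentation, to show cache hits

    def get(self, num_heads, seq_len):
        key = (num_heads, seq_len)
        if key != self._key:
            self._value = build_alibi(num_heads, seq_len)
            self._key = key
            self.builds += 1
        return self._value


def alibi_slopes(num_heads):
    """Per-head ALiBi slopes 2^(-8*i/n), i = 1..n (power-of-two head count)."""
    base = 2 ** (-8.0 / num_heads)
    return [base ** (i + 1) for i in range(num_heads)]


def build_alibi(num_heads, seq_len):
    """Dense [num_heads, seq_len] linear bias (stand-in for the real tensor)."""
    return [[slope * j for j in range(seq_len)] for slope in alibi_slopes(num_heads)]
```

With fixed-length inputs, repeated calls with the same shape hit the cache and skip the rebuild entirely; only a bin-size change triggers a new build.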
@younesbelkada
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.