fix: remove layer_past storing in DeepSpeedTransformerInference
What Changed?
- This PR fixes #1925
- Removes `layer_past` storing in `transformer_inference`; a sketch of the change is shown after the "Why?" list below
Why?
- `layer_past` should not be stored in the model. It should be given by the input.
- `layer_past` is never released. If the model is called repeatedly, `layer_past` grows constantly. This is not the expected behavior.
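
The sketch below is a minimal illustration of the two patterns described above, not DeepSpeed's actual `DeepSpeedTransformerInference` code; the class names are hypothetical. The first module caches `layer_past` on itself, so the cache grows on every call and is never released; the second takes `layer_past` as an input and returns the new past, leaving ownership with the caller.

```python
import torch
import torch.nn as nn


class LeakyAttention(nn.Module):
    """Anti-pattern: caches layer_past on the module itself."""

    def __init__(self):
        super().__init__()
        self.layer_past = None  # never released between calls

    def forward(self, k: torch.Tensor, v: torch.Tensor):
        if self.layer_past is not None:
            past_k, past_v = self.layer_past
            # Concatenate along the sequence dimension.
            k = torch.cat([past_k, k], dim=1)
            v = torch.cat([past_v, v], dim=1)
        self.layer_past = (k, v)  # grows on every call, across all requests
        return k, v


class StatelessAttention(nn.Module):
    """Fix: layer_past is given by the input and returned, never stored."""

    def forward(self, k: torch.Tensor, v: torch.Tensor, layer_past=None):
        if layer_past is not None:
            past_k, past_v = layer_past
            k = torch.cat([past_k, k], dim=1)
            v = torch.cat([past_v, v], dim=1)
        # The caller decides whether to keep this as the new past.
        return k, v
```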
Can one of the admins verify this patch?
I wrapped the model with DeepSpeed after first loading it with HuggingFace Transformers, and then I hit:
`Floating point exception (core dumped)`
This occurs with some sentences as input, not with all sentences.
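
For context, this is a minimal sketch of the setup described in this comment, assuming a GPT-2 checkpoint and kernel injection; the model name and `init_inference` arguments are assumptions, not taken from the report.

```python
import torch
import deepspeed
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model with HuggingFace Transformers first.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Then wrap it with DeepSpeed inference kernels.
model = deepspeed.init_inference(
    model,
    dtype=torch.float16,
    replace_with_kernel_inject=True,
)

# The crash reportedly depends on the input sentence.
inputs = tokenizer("some sentence", return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0]))
```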
Hi @codertimo - Do you believe this PR is still useful? I'm not able to resolve the conflicts on your fork with our master branch. If so, could you resolve these conflicts, and we can review. Apologies for the delay on reviewing in the first place.