DeepSpeed
[BUG] DeepSpeed Inference - T5 Model
Describe the bug
I used DeepSpeed inference like below:
model = (
    T5ForConditionalGeneration.from_pretrained(
        "paust/pko-t5-large",
    ).half().eval().to(torch.cuda.current_device())
)
model = deepspeed.init_inference(
    model,
    mp_size=8,
    dtype=torch.float,
    injection_policy={T5Block: ('SelfAttention.o', 'EncDecAttention.o', 'DenseReluDense.wo')},
)
This works well (I use 8x A100 80GB GPUs). But when I change the model size (large => base), it does not work:
model = (
    T5ForConditionalGeneration.from_pretrained(
        "paust/pko-t5-base",
    ).half().eval().to(torch.cuda.current_device())
)
model = deepspeed.init_inference(
    model,
    mp_size=8,
    dtype=torch.float,
    injection_policy={T5Block: ('SelfAttention.o', 'EncDecAttention.o', 'DenseReluDense.wo')},
)
It fails with the error message below:
File "../../lib/python3.9/site-packages/transformers/models/t5/modeling_t5.py", line 559, in forward
scores += position_bias_masked
RuntimeError: The size of tensor a (384) must match the size of tensor b (256) at non-singleton dimension 3
I set max_length to 256, and I checked that the size of `scores` is 1.5 times the max_length I set. Can you tell me why?
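For context, here is a minimal sketch of the broadcasting rule that produces this RuntimeError in `scores += position_bias_masked`. The concrete shapes (batch size, number of heads) are hypothetical and only illustrate the reported mismatch at dimension 3 (384 vs 256):

```python
def broadcast_compatible(shape_a, shape_b):
    # NumPy/PyTorch rule: two dimensions are compatible if they are
    # equal or one of them is 1 (checked right-aligned; here the
    # shapes have equal rank, so a plain zip suffices).
    return all(a == b or a == 1 or b == 1 for a, b in zip(shape_a, shape_b))

scores_shape = (1, 12, 384, 384)  # hypothetical: (batch, heads, q_len, k_len)
bias_shape   = (1, 12, 384, 256)  # position bias built for max_length=256

# Dimension 3 holds 384 on one side and 256 on the other, so the
# in-place addition cannot broadcast and PyTorch raises the error.
print(broadcast_compatible(scores_shape, bias_shape))  # False
print(384 / 256)  # 1.5, matching the observed ratio to max_length
```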