
Padding for training and inference

Open Reason-Wang opened this issue 9 months ago • 5 comments

Is LLaMA 2 trained in batches? If so, why is there no pad token? I want to fine-tune the model and then run inference in batches. One suggestion is to pad on the left, and I know I should pad on the left for inference, but should I also pad on the left during fine-tuning? (See the sketch after this comment.)

Reason-Wang · Sep 14 '23 16:09
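
A minimal sketch of one common workaround, assuming the Hugging Face transformers tokenizer API: since the LLaMA 2 tokenizer ships without a pad token, many users reuse the EOS token as padding and set the padding side explicitly before batching. The model name and example prompts below are placeholders, not something prescribed by this issue.

```python
from transformers import AutoTokenizer

# Placeholder checkpoint; substitute the model you are fine-tuning.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

# LLaMA 2 has no pad token by default, so assign one before batching.
# Reusing EOS as the pad token is a common workaround, not an official recommendation.
tokenizer.pad_token = tokenizer.eos_token

# For batched generation, left padding keeps each prompt's last token
# adjacent to the tokens the model generates next.
tokenizer.padding_side = "left"

batch = tokenizer(
    ["Hello, how are you?", "What is the capital of France?"],
    padding=True,
    return_tensors="pt",
)
print(batch["input_ids"].shape, batch["attention_mask"].shape)
```

For fine-tuning, the same pad token can be used with right padding (the usual default) as long as padded positions are masked out of the loss; the left-padding requirement mainly matters for batched generation.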