`max_batch_size` argument in `ModelArgs`
I'm just curious what the max_batch_size argument does in ModelArgs: https://github.com/pytorch/torchtitan/blob/d2a4904f58accc683c17c66a360026cb3c8109af/torchtitan/models/llama/model.py#L32
A quick search suggests that it isn't actually used anywhere else in the codebase, so I'm wondering whether it might be superfluous.
I think this argument currently serves as a placeholder and may be used in the future. What do you think? @lessw2020 @tianyu-l
I think it was copied from the original reference Llama implementation, which was meant for inference (code), where `max_batch_size` was used to size the KV cache.
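For context, the pattern in question looks roughly like the sketch below (a simplified, hypothetical `KVCache` class, not the actual reference code): the inference implementation preallocates fixed-size key/value buffers per attention layer, so `max_batch_size` bounds how many sequences can be decoded at once. Since torchtitan is a training codebase with no KV cache, the argument has nothing to size.

```python
import torch

# Simplified sketch of why an inference codebase needs max_batch_size:
# KV-cache buffers are allocated up front with that dimension, and each
# decoding step writes new keys/values into a slice of them.
class KVCache:
    def __init__(self, max_batch_size: int, max_seq_len: int,
                 n_kv_heads: int, head_dim: int):
        # Buffers are sized once; a batch larger than max_batch_size won't fit.
        self.cache_k = torch.zeros(max_batch_size, max_seq_len, n_kv_heads, head_dim)
        self.cache_v = torch.zeros(max_batch_size, max_seq_len, n_kv_heads, head_dim)

    def update(self, start_pos: int, xk: torch.Tensor, xv: torch.Tensor):
        # Write this step's keys/values at start_pos, then return everything
        # cached so far for use in attention.
        bsz, seqlen = xk.shape[0], xk.shape[1]
        self.cache_k[:bsz, start_pos:start_pos + seqlen] = xk
        self.cache_v[:bsz, start_pos:start_pos + seqlen] = xv
        return (self.cache_k[:bsz, :start_pos + seqlen],
                self.cache_v[:bsz, :start_pos + seqlen])

cache = KVCache(max_batch_size=4, max_seq_len=16, n_kv_heads=2, head_dim=8)
keys, values = cache.update(start_pos=0, xk=torch.randn(2, 3, 2, 8),
                            xv=torch.randn(2, 3, 2, 8))
```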
We should probably remove it.
#585