
Input token length question

kaizizzzzzz opened this issue 1 year ago · 2 comments

I am a little confused. In transformer-based LLM inference (or training), do we always set the input length to the max input length and pad with zeros, or do we dynamically use the actual input length and reduce the computation?

kaizizzzzzz avatar Apr 21 '24 15:04 kaizizzzzzz

This issue is about that: https://github.com/karpathy/llm.c/issues/146. Right now we always forward B * T tokens in a single, fixed batch configuration that never changes. In principle you can dynamically lower the B,T dimensions to save computation, but it is tricky and requires thought and tests.

karpathy avatar Apr 21 '24 17:04 karpathy

Thx!

kaizizzzzzz avatar Apr 24 '24 03:04 kaizizzzzzz