Luka Govedič
Just saw this in a [comment](https://github.com/vllm-project/vllm/pull/17280#issuecomment-3013759134):
> Furthermore the very first request after starting up vLLM takes 30-60 seconds. Feels like PTX being compiled or something. This only happens on...
This might be the cause of the other issue I filed (llama4 on Blackwell), but this issue is llama4 AND llama3 on Hopper.
It's not letting me upload the file on GitHub, so here's a [Google Drive link](https://drive.google.com/file/d/1szYpuBoQnZ0Xo4SuBbSaTG38k7KImUyS/view?usp=drive_link). Let me know if that works.