
A large-scale simulation framework for LLM inference

34 vidur issues

Does Vidur currently support speculative decoding?

Hi, thanks for open-sourcing this project! I have a couple of questions: 1) Regarding CPU overheads (e.g. scheduling, tokenization, etc.) - while they're mentioned in the documentation, from reading the...

We're attempting to reproduce the simulation results and observed that, when comparing against **vLLM 0.9.1** benchmarks, the P50 latency differs by **700%**. May I ask if vLLM v1 is supported...

Hi Vidur Team, we are researching auto-scaling solutions for large models and have found your simulator to be highly valuable for our work! However, the simulator currently only supports static...