A large-scale simulation framework for LLM inference

Results: 34 vidur issues

Hello Vidur, Thank you for sharing your work. While reading the code and documentation, I encountered some questions related to the *Profiling Communication Operators* mentioned in the paper. In the...

Hello, I'm struggling to understand a couple of things about the simulator, given there's no documentation around it. The simulation runtime is determined based on the requests (either trace files...

Thank you for the helpful profiling data for the simulation. The "send_recv" metric for pipeline parallelism indicates cross-node communication, but I fail to find documentation on the specific network device...

Currently Vidur only supports dense models with TP/PP. Do you plan to support EP in the future?

Dear maintainers, Recently I attempted to use the profiling workflow in the vidur project and collect profiling data on AWS EC2 instances. I experimented with the P5 48X which has 8X...

Does the `num_prefill_tokens` for round n correspond solely to the prompt of that round, or does it include all prompts from rounds 1 to n and the outputs from rounds...

- The embedding_dim is already enforced to be a multiple of the num_q_heads (which in turn is enforced to be a multiple of the tensor_parallel_size). Removing the redundant check. -...

I added some notes on how the internals of the simulator are organized. I find them helpful in understanding how things work...