Nitin Kedia
Hi @lhpp1314, the `Supported Models` section in the project [README](https://github.com/microsoft/vidur/tree/main?tab=readme-ov-file#supported-models) has the updated link for profiling. Please use the central readme to find links to other documentation.
Hi @JasonZhang517, it is certainly possible to export the actual memory used by requests instead of the reserved memory. For scheduling algorithms that do not use dynamic memory allocation...
@AgrawalAmey I believe the question is about the amount of KV-cache memory actually used by a token, not just the memory reserved for it.
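The used-vs-reserved distinction can be sketched with a small calculation. This is a hedged illustration, not Vidur's implementation: the function names, the block size, and the model dimensions below are all illustrative assumptions.

```python
# Illustrative sketch of per-token KV-cache memory vs. block-reserved memory.
# All names and parameter values here are hypothetical, not from Vidur's code.

def kv_cache_bytes_per_token(num_layers, num_kv_heads, head_dim, dtype_bytes=2):
    """Bytes of KV cache one token occupies: a K and a V vector per layer."""
    return 2 * num_layers * num_kv_heads * head_dim * dtype_bytes

def reserved_bytes(num_tokens, block_size, per_token_bytes):
    """Block allocators reserve whole blocks, so reservation rounds up."""
    num_blocks = -(-num_tokens // block_size)  # ceiling division
    return num_blocks * block_size * per_token_bytes

# Example: a 32-layer model with 32 KV heads of dim 128, fp16 weights.
per_token = kv_cache_bytes_per_token(num_layers=32, num_kv_heads=32, head_dim=128)
used = 100 * per_token                          # memory 100 tokens actually use
reserved = reserved_bytes(100, 16, per_token)   # memory reserved in 16-token blocks
print(per_token, used, reserved)
```

The gap between `used` and `reserved` is the internal fragmentation that a reservation-only metric would hide.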
Hi @EheinWang and @ozcanmiraay, we have documentation on how to run Vidur Search aka config explorer at [docs/config_explorer.md](https://github.com/microsoft/vidur/blob/main/docs/config_explorer.md). Please check it out.
@ozcanmiraay, for models other than the Llama3 ones, `scheduler_config_batch_size_cap = 128` and `request_length_generator_config_max_tokens = 4096` are the maximums. For Llama3, the maximums are 512 and 16k respectively. Some more details regarding...
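Assuming these config keys map onto CLI flags in the usual underscore-joined style (the exact flag names should be verified against the simulator's `--help` output), a Llama3 run at the stated maximums might look like:

```shell
# Hypothetical invocation; flag names mirror the config keys above and
# should be checked against `python -m vidur.main --help`.
python -m vidur.main \
  --scheduler_config_batch_size_cap 512 \
  --request_length_generator_config_max_tokens 16384
```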
Hi @spliii, create a new class for the model you want to add in [model_config.py](https://github.com/microsoft/vidur/blob/main/vidur/config/model_config.py). We previously used YAML files for configs but have since shifted to dataclasses...
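As a hedged sketch of what such a dataclass might look like: the class and field names below are illustrative assumptions and should be matched against the existing classes in `vidur/config/model_config.py`.

```python
# Hypothetical sketch of a dataclass-based model config; field names are
# illustrative, not copied from vidur/config/model_config.py.
from dataclasses import dataclass


@dataclass
class BaseModelConfig:
    num_layers: int
    num_q_heads: int
    num_kv_heads: int
    embedding_dim: int


@dataclass
class MyNewModelConfig(BaseModelConfig):
    # Defaults for the new model; override any field at construction time.
    num_layers: int = 32
    num_q_heads: int = 32
    num_kv_heads: int = 8
    embedding_dim: int = 4096


cfg = MyNewModelConfig()
print(cfg.num_kv_heads)
```

Subclassing a shared base keeps all model configs type-checked and discoverable, which is the usual motivation for moving from YAML to dataclasses.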
Hi @rajeshitshoulders, @akaashrp, and @ozcanmiraay, can you please try removing the `jupyterlab` dependency from `environment.yml`? You'll need to create a new mamba environment with the changed `environment.yml` file. `IPython` is used...
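After editing `environment.yml`, the environment can be recreated along these lines. The environment name `vidur` is an assumption; use whatever the `name:` field in your `environment.yml` says.

```shell
# Remove the old environment and rebuild it from the edited environment.yml.
# "vidur" is a placeholder for the name: field in your environment.yml.
mamba env remove --name vidur
mamba env create --file environment.yml
mamba activate vidur
```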
Hi @rajeshitshoulders, @vladandrew, @Yogaht, and @akaashrp, Vidur now uses `seaborn` (based on `matplotlib`) instead of `plotly` (which uses `kaleido`). This removes the dependency on `kaleido`, so the original error should not...
Hi @yl3469, I spent quite a bit of time on this PR because the core issue it aims to solve (extrapolating to, say, higher context lengths) makes Vidur...
@sicario001 Please look at the comments I have left and also accept the Contributor License Agreement (without which the PR cannot be merged).