
A large-scale simulation framework for LLM inference

Results: 34 vidur issues

**Issue** `MetricsStore` uses 1-indexing for `replica_id` in several places even though `replica_id`s are 0-indexed. **Fix** Update the 1-indexed usage of `replica_id` in `_replica_memory_usage`, `_replica_busy_time`, and `_replica_mfu` inside `MetricsStore` to 0-indexed.
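To make the off-by-one concrete, here is a minimal, hypothetical sketch (the class body below is illustrative, not Vidur's actual `MetricsStore` implementation) of why a 1-indexed lookup fails once `replica_id`s are assigned 0-indexed:

```python
class MetricsStore:
    """Toy stand-in for a per-replica metrics store."""

    def __init__(self, num_replicas: int):
        # replica_ids are 0-indexed, matching how the scheduler assigns them.
        self._busy_time = {replica_id: 0.0 for replica_id in range(num_replicas)}

    def add_busy_time(self, replica_id: int, seconds: float) -> None:
        # A 1-indexed variant would look up `replica_id + 1` here and raise
        # KeyError for the last replica; use the 0-indexed id directly.
        self._busy_time[replica_id] += seconds


store = MetricsStore(num_replicas=2)
store.add_busy_time(0, 1.5)
store.add_busy_time(1, 0.5)
print(store._busy_time)  # → {0: 1.5, 1: 0.5}
```

With 1-indexed keys, `add_busy_time(1, ...)` on a 2-replica store would silently update the wrong replica's counter or fail outright, which is exactly the class of bug the proposed fix removes.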

Hi Vidur Team! I am interested in the profiling phase, but I found something that confuses me. In `vidur/execution_time_predictor/sklearn_execution_time_predictor.py:494`:

```python
if self._replica_config.num_pipeline_stages > 1:
    send_recv_df = self._load_send_recv_df(self._send_recv_input_file)
    send_recv_df = self._get_send_recv_df_with_derived_features(send_recv_df)
    models["send_recv"]...
```

Hello Vidur, Thank you for sharing your work. While reading the code, I encountered a question. I am analyzing the profiling part of the code. The profiling is divided into...

The current default random forest predictor overfits the training set and cannot generalize its batch execution time predictions to unseen token counts or batch sizes...
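This failure mode is inherent to tree ensembles: they predict piecewise-constant values and cannot extrapolate beyond the target range seen in training. A toy illustration (synthetic data, not Vidur's actual training pipeline) using scikit-learn:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic profile: execution time grows linearly with token count.
X = np.arange(1, 101, dtype=float).reshape(-1, 1)  # tokens 1..100
y = 0.1 * X.ravel()                                # true time = 0.1 * tokens

model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Query a token count far outside the training range. The forest's
# prediction is clamped near the largest training target (~10.0),
# nowhere near the true value of 100.0.
pred = model.predict(np.array([[1000.0]]))[0]
```

Swapping in a model with an explicit functional form (e.g. polynomial regression over token counts), or profiling a wider range of token/batch configurations, are the usual remedies for this kind of extrapolation gap.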

Hi team, I’m working on adapting Vidur, the LLM inference system simulator, to vLLM. Currently, Vidur’s profiling is based on Sarathi-Serve, but I’d like to explore how to make it...

Hi, could you please help resolve the issue below? I have already installed Sarathi-Serve, but I still hit this error: `python vidur/profiling/attention/main.py --models codellama/CodeLlama-34b-Instruct-hf --num_gpus 4` Traceback...

Running `python /app/software1/vidur/vidur/profiling/collectives/main.py --num_workers_per_node_combinations 2 --collective send_recv` gives: `2025-02-14 07:59:21,469 INFO worker.py:1821 -- Started a local Ray instance.` `0%| | 0/994 [00:00`

`python -m vidur.main --replica_config_device a100 --replica_config_model_name meta-llama/Meta-Llama-3-8B --cluster_config_num_replicas 1 --replica_config_tensor_parallel_size 1 --replica_config_num_pipeline_stages 1 --request_generator_config_type synthetic --synthetic_request_generator_config_num_requests...`

I am studying how Vidur determines bottlenecks, and I noticed that both `TFFTViolationLowMaxBatchSizeCase` and `TFFTViolationLowMemoryCase` involve checking `batch_size_obs`. It seems that `batch_size_obs` should be a specific batch size value, but...