
get_parameter(model_config, "max_attention_window_size", int) not support list

Open Alireza3242 opened this issue 1 year ago • 0 comments

System Info

a100

Who can help?

@ncomly-nvidia

Information

  • [X] The official example scripts
  • [ ] My own modified scripts

Tasks

  • [X] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • [ ] My own task or dataset (give details below)

Reproduction

In all_models/inflight_batcher_llm/tensorrt_llm/1/model.py, line 422, we have:

get_parameter(model_config, "max_attention_window_size", int),

But I want to set max_attention_window_size as a list, with one max_attention_window_size value per layer. I also want to set this list in all_models/inflight_batcher_llm/tensorrt_llm/config.pbtxt:

parameters: {
  key: "max_attention_window_size"
  value: {
    string_value: "${max_attention_window_size}"
  }
}

I need this feature for Gemma 2, which uses alternating window sizes across its layers:

max_attention_window_size = [8192, 4096]*21
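As a minimal sketch of what I mean, the parameter read in model.py could accept either a single int or a comma-separated list in the config.pbtxt string value. The helper name below is hypothetical, not the actual backend API:

```python
# Hypothetical sketch: parse the "max_attention_window_size" string from
# config.pbtxt as either a single int (current behavior) or a
# comma-separated list of per-layer window sizes.
def parse_max_attention_window(value: str):
    """Return an int for a scalar value, or a list of ints for a list."""
    value = value.strip()
    if "," in value:
        # Per-layer window sizes, e.g. "8192,4096" repeated per layer pair
        return [int(v) for v in value.split(",")]
    return int(value)

# Example: alternating window sizes as in the Gemma 2 case above
windows = parse_max_attention_window("8192,4096")
```

With something like this, the existing single-int configurations would keep working unchanged, while list values become expressible through the same string parameter.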

Expected behavior

max_attention_window_size should accept a list (one window size per layer).

actual behavior

A list is not supported; only a single int is accepted.

additional notes

None.

Alireza3242 · Aug 05 '24 08:08