TensorRT-LLM

AttributeError: 'PluginConfig' object has no attribute '_streamingllm'. Did you mean: 'streamingllm'?

Open · BooHwang opened this issue 1 year ago · 6 comments

System Info

  • CPU: X86
  • GPU: NVIDIA L20
  • Python packages:
    • tensorrt 10.3.0
    • tensorrt-cu12 10.3.0
    • tensorrt-cu12-bindings 10.3.0
    • tensorrt-cu12-libs 10.3.0
    • tensorrt-llm 0.13.0.dev2024081300

Who can help?

No response

Information

  • [X] The official example scripts
  • [ ] My own modified scripts

Tasks

  • [ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • [ ] My own task or dataset (give details below)

Reproduction

Step 1:

pip install -r requirements.txt
apt-get update
apt-get install git-lfs
rm -rf chatglm*

git clone https://huggingface.co/THUDM/chatglm3-6b-base chatglm3_6b_base

Step 2 (this step succeeds):

python3 convert_checkpoint.py --model_dir chatglm3_6b_base --output_dir trt_ckpt/chatglm3_6b_base/fp16/1-gpu

Step 3 (the error occurs at this step):

trtllm-build --checkpoint_dir trt_ckpt/chatglm3_6b_base/fp16/1-gpu \
        --gemm_plugin float16 \
        --output_dir trt_engines/chatglm3_6b_base/fp16/1-gpu

Expected behavior

Following the official example, the build should complete successfully. Instead, an error occurs at this step:

trtllm-build --checkpoint_dir trt_ckpt/ --gemm_plugin float16 --output_dir trt_engines

Actual behavior

[TensorRT-LLM] TensorRT-LLM version: 0.13.0.dev2024081300
[08/15/2024-06:04:26] [TRT-LLM] [I] Compute capability: (8, 9)
[08/15/2024-06:04:26] [TRT-LLM] [I] SM count: 92
[08/15/2024-06:04:26] [TRT-LLM] [I] SM clock: 2520 MHz
[08/15/2024-06:04:26] [TRT-LLM] [I] int4 TFLOPS: 474
[08/15/2024-06:04:26] [TRT-LLM] [I] int8 TFLOPS: 237
[08/15/2024-06:04:26] [TRT-LLM] [I] fp8 TFLOPS: 237
[08/15/2024-06:04:26] [TRT-LLM] [I] float16 TFLOPS: 118
[08/15/2024-06:04:26] [TRT-LLM] [I] bfloat16 TFLOPS: 118
[08/15/2024-06:04:26] [TRT-LLM] [I] float32 TFLOPS: 59
[08/15/2024-06:04:26] [TRT-LLM] [I] Total Memory: 44 GiB
[08/15/2024-06:04:26] [TRT-LLM] [I] Memory clock: 9001 MHz
[08/15/2024-06:04:26] [TRT-LLM] [I] Memory bus width: 384
[08/15/2024-06:04:26] [TRT-LLM] [I] Memory bandwidth: 864 GB/s
[08/15/2024-06:04:26] [TRT-LLM] [I] PCIe speed: 16000 Mbps
[08/15/2024-06:04:26] [TRT-LLM] [I] PCIe link width: 16
[08/15/2024-06:04:26] [TRT-LLM] [I] PCIe bandwidth: 32 GB/s
Traceback (most recent call last):
  File "/root/anaconda3/envs/trt_llm/bin/trtllm-build", line 8, in <module>
    sys.exit(main())
  File "/root/anaconda3/envs/trt_llm/lib/python3.10/site-packages/tensorrt_llm/commands/build.py", line 528, in main
    parallel_build(model_config, ckpt_dir, build_config, args.output_dir,
  File "/root/anaconda3/envs/trt_llm/lib/python3.10/site-packages/tensorrt_llm/commands/build.py", line 394, in parallel_build
    passed = build_and_save(rank, rank % workers, ckpt_dir,
  File "/root/anaconda3/envs/trt_llm/lib/python3.10/site-packages/tensorrt_llm/commands/build.py", line 361, in build_and_save
    engine = build_model(build_config,
  File "/root/anaconda3/envs/trt_llm/lib/python3.10/site-packages/tensorrt_llm/commands/build.py", line 303, in build_model
    assert not build_config.plugin_config.streamingllm or architecture == "LlamaForCausalLM", \
  File "/root/anaconda3/envs/trt_llm/lib/python3.10/site-packages/tensorrt_llm/plugin/plugin.py", line 79, in prop
    field_value = getattr(self, storage_name)
AttributeError: 'PluginConfig' object has no attribute '_streamingllm'. Did you mean: 'streamingllm'?
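
For context, here is a minimal, hypothetical reproduction of this failure mode. ConfigSketch is an invented name, only the attribute names are taken from the traceback above, and this is not the actual plugin.py source. A dataclass field declared with init=False and no default is never assigned on the instance, so a property that reads the private storage attribute raises exactly this AttributeError:

from dataclasses import dataclass, field

@dataclass
class ConfigSketch:
    # Excluded from the generated __init__ and given no default, so the
    # instance attribute "_streamingllm" is never created.
    _streamingllm: bool = field(init=False)

    @property
    def streamingllm(self):
        # Mirrors the generic getter pattern shown in the traceback:
        # read the private storage attribute by name.
        return getattr(self, "_streamingllm")

cfg = ConfigSketch()
cfg.streamingllm  # AttributeError: 'ConfigSketch' object has no attribute '_streamingllm'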


BooHwang · Aug 15 '24 06:08

Could you try the workaround referenced here? https://github.com/NVIDIA/TensorRT-LLM/issues/1968#issuecomment-2252750163

Kefeng-Duan · Aug 15 '24 09:08

I fixed this issue by editing plugin/plugin.py and changing all the dataclass fields in PluginConfig to have init=True
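
A hedged sketch of the kind of edit described above, shown on the same hypothetical ConfigSketch stand-in rather than the real plugin.py (field names assumed from the traceback). With init=True and a default, the generated __init__ assigns the instance attribute, so the property getter can find it:

from dataclasses import dataclass, field

@dataclass
class ConfigSketch:
    # init=True plus a default value means the generated __init__ now
    # sets the instance attribute, so the property below no longer raises.
    _streamingllm: bool = field(default=False, init=True)

    @property
    def streamingllm(self):
        return getattr(self, "_streamingllm")

cfg = ConfigSketch()
cfg.streamingllm  # False; the attribute now exists on the instance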

aninrusimha · Aug 15 '24 22:08

Could you try the workaround referenced here? #1968 (comment)

I had tried that before reporting this issue; it did not work for me.

BooHwang · Aug 16 '24 03:08

I fixed this issue by editing plugin/plugin.py and changing all the dataclass fields in PluginConfig to have init=True

It works for me, but the engine still needs to be verified for correctness. Thanks for your help.

BooHwang · Aug 16 '24 04:08

@BooHwang Sorry, could you try passing --streamingllm enable when building the engine? See https://github.com/NVIDIA/TensorRT-LLM/blob/main/examples/llama/README.md#run-llama-with-streamingllm

Kefeng-Duan · Aug 19 '24 08:08

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 15 days.

github-actions[bot] · Oct 05 '24 02:10

I fixed this issue by editing plugin/plugin.py and changing all the dataclass fields in PluginConfig to have init=True

It works, thanks.

zhangyu68 · Dec 13 '24 12:12

This is quite an old issue, but for reference: I've confirmed that the two commands below now run without triggering the reported error.

# convert
python ${TRTLLM_DIR}/examples/models/core/glm-4-9b/convert_checkpoint.py \
        --model_dir chatglm3_6b_base \
        --output_dir chatglm3_6b_fp16_1_gpu

# build
trtllm-build --checkpoint_dir chatglm3_6b_fp16_1_gpu \
        --gemm_plugin float16 \
        --output_dir chatglm3_6b_fp16_1_gpu_built

I'll close this ticket, but feel free to open another issue if needed~ :)

karljang · Sep 22 '25 18:09