ChatGLM3 6B Multi-batch Failed with Error
System Info
- CPU: INTEL RPL
- GPU Name: NVIDIA RTX 4090
- TensorRT-LLM: tensorrt_llm==0.11.0.dev2024060400
- Container Used: Yes and reproduced in Conda as well
- Driver Version: 555.42.02
- CUDA Version: 12.5
- OS: Ubuntu 24.04
- Docker Img: nvidia/cuda:12.5.0-devel-ubuntu22.04
Who can help?
@hijkzzz
Information
- [X] The official example scripts
- [ ] My own modified scripts
Tasks
- [X] An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)
Reproduction
```shell
python3 benchmarks/python/benchmark.py --engine_dir /trt_engines/chatglm3_6b/float16/1-gpu/ --dtype float16 --batch_size 16 --input_output_len "1024,512"
```
Expected behavior
The benchmark passes and produces output.
Actual behavior
```text
[TensorRT-LLM] TensorRT-LLM version: 0.11.0.dev2024061100
Allocated 770.13 MiB for execution context memory.
/usr/local/lib/python3.10/dist-packages/torch/nested/__init__.py:166: UserWarning: The PyTorch API of nested tensors is in prototype stage and will change in the near future. (Triggered internally at ../aten/src/ATen/NestedTensorImpl.cpp:178.)
  return _nested.nested_tensor(
[06/13/2024-03:26:19] [TRT] [E] 3: [executionContext.cpp::setInputShape::2068] Error Code 3: API Usage Error (Parameter check failed at: runtime/api/executionContext.cpp::setInputShape::2068, condition: satisfyProfile Runtime dimension does not satisfy any optimization profile.)
[06/13/2024-03:26:19] [TRT] [E] 3: [executionContext.cpp::setInputShape::2068] Error Code 3: API Usage Error (Parameter check failed at: runtime/api/executionContext.cpp::setInputShape::2068, condition: satisfyProfile Runtime dimension does not satisfy any optimization profile.)
[06/13/2024-03:26:19] [TRT] [E] 3: [executionContext.cpp::resolveSlots::2842] Error Code 3: API Usage Error (Parameter check failed at: runtime/api/executionContext.cpp::resolveSlots::2842, condition: allInputDimensionsSpecified(routine) )
Traceback (most recent call last):
  File "/TensorRT-LLM/benchmarks/python/benchmark.py", line 416, in main
    benchmarker.run(inputs, config)
  File "/TensorRT-LLM/benchmarks/python/gpt_benchmark.py", line 254, in run
    self.decoder.decode_batch(inputs[0],
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/runtime/generation.py", line 3240, in decode_batch
    return self.decode(input_ids,
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/runtime/generation.py", line 947, in wrapper
    ret = func(self, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/runtime/generation.py", line 3463, in decode
    return self.decode_regular(
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/runtime/generation.py", line 3073, in decode_regular
    should_stop, next_step_tensors, tasks, context_lengths, host_context_lengths, attention_mask, context_logits, generation_logits, encoder_input_lengths = self.handle_per_step(
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/runtime/generation.py", line 2732, in handle_per_step
    raise RuntimeError(f"Executing TRT engine failed step={step}!")
RuntimeError: Executing TRT engine failed step=0!

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/TensorRT-LLM/benchmarks/python/benchmark.py", line 515, in <module>
    main(args)
  File "/TensorRT-LLM/benchmarks/python/benchmark.py", line 441, in main
    e.with_traceback())
TypeError: BaseException.with_traceback() takes exactly one argument (0 given)
[06/13/2024-03:26:25] [TRT-LLM] [W] Logger level already set from environment. Discard new verbosity: error
[TensorRT-LLM] TensorRT-LLM version: 0.11.0.dev2024061100
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/usr/lib/python3.10/multiprocessing/spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
  File "/usr/lib/python3.10/multiprocessing/synchronize.py", line 110, in __setstate__
    self._semlock = _multiprocessing.SemLock._rebuild(*state)
FileNotFoundError: [Errno 2] No such file or directory
```
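The primary failure is the `setInputShape` error: TensorRT rejects an input shape because no optimization profile baked into the engine covers it, which is what happens when an engine is built for a smaller batch size or sequence length than the benchmark requests. The sketch below is not the TensorRT API; it is a minimal, self-contained model of the per-dimension `[min, max]` check behind "Runtime dimension does not satisfy any optimization profile", with `satisfies_profile`/`satisfies_any_profile` as hypothetical helper names:

```python
# Minimal sketch (NOT the TensorRT API) of the optimization-profile check.
# A profile is modeled as a (min_shape, max_shape) pair; a runtime shape is
# accepted only if every dimension lies within some profile's [min, max].

def satisfies_profile(runtime_shape, profile):
    """True if every runtime dimension is within the profile's bounds."""
    lo, hi = profile
    return all(l <= d <= h for d, (l, h) in zip(runtime_shape, zip(lo, hi)))

def satisfies_any_profile(runtime_shape, profiles):
    """TensorRT raises the error above when this would return False."""
    return any(satisfies_profile(runtime_shape, p) for p in profiles)

# Hypothetical example: an engine whose single profile allows (batch, seq_len)
# from (1, 1) up to (8, 1024) accepts batch 8 but rejects batch 16.
profiles = [((1, 1), (8, 1024))]
print(satisfies_any_profile((8, 1024), profiles))   # True
print(satisfies_any_profile((16, 1024), profiles))  # False: batch 16 > max 8
```

Under this reading, `--batch_size 16 --input_output_len "1024,512"` only fails if the engine was built with smaller `max_batch_size`/sequence limits than the requested shapes, so comparing the build-time limits against the benchmark arguments would be the first thing to check.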
Additional notes
If I run without `--input_output_len`, the benchmark passes.
Confirmed, this is a bug; we are investigating internally.
I have met the same error; it happens when I set `batch_size > 8`.
Hi @RobinJYM, could you please try the latest code base and check whether the issue still exists?