
NotImplementedError: Cannot copy out of meta tensor; no data

Open LIUKAI0815 opened this issue 1 year ago • 1 comment

```shell
python convert_checkpoint.py --model_dir /workspace/lk/model/Qwen/14B/ \
    --output_dir ./tllm_checkpoint_1gpu_fp16_wq \
    --dtype float16 --use_weight_only --weight_only_precision int8
```

```
[TensorRT-LLM] TensorRT-LLM version: 0.10.0.dev2024042300
0.10.0.dev2024042300
Loading checkpoint shards: 100%|██████████| 8/8 [00:02<00:00, 3.26it/s]
[04/30/2024-09:44:33] Some parameters are on the meta device because they were offloaded to the cpu.
Traceback (most recent call last):
  File "/workspace/lk/model/tensorRT/TensorRT-LLM/examples/qwen/convert_checkpoint.py", line 365, in <module>
    main()
  File "/workspace/lk/model/tensorRT/TensorRT-LLM/examples/qwen/convert_checkpoint.py", line 357, in main
    convert_and_save_hf(args)
  File "/workspace/lk/model/tensorRT/TensorRT-LLM/examples/qwen/convert_checkpoint.py", line 319, in convert_and_save_hf
    execute(args.workers, [convert_and_save_rank] * world_size, args)
  File "/workspace/lk/model/tensorRT/TensorRT-LLM/examples/qwen/convert_checkpoint.py", line 325, in execute
    f(args, rank)
  File "/workspace/lk/model/tensorRT/TensorRT-LLM/examples/qwen/convert_checkpoint.py", line 305, in convert_and_save_rank
    qwen = from_hugging_face(
  File "/opt/conda/envs/tensorRT/lib/python3.10/site-packages/tensorrt_llm/models/qwen/convert.py", line 1087, in from_hugging_face
    weights = load_weights_from_hf(config=config,
  File "/opt/conda/envs/tensorRT/lib/python3.10/site-packages/tensorrt_llm/models/qwen/convert.py", line 1193, in load_weights_from_hf
    weights = convert_hf_qwen(
  File "/opt/conda/envs/tensorRT/lib/python3.10/site-packages/tensorrt_llm/models/qwen/convert.py", line 747, in convert_hf_qwen
    get_tllm_linear_weight(qkv_w, tllm_prex + 'attention.qkv.',
  File "/opt/conda/envs/tensorRT/lib/python3.10/site-packages/tensorrt_llm/models/qwen/convert.py", line 487, in get_tllm_linear_weight
    v.cpu(), plugin_weight_only_quant_type)
NotImplementedError: Cannot copy out of meta tensor; no data!
```

LIUKAI0815 avatar Apr 30 '24 02:04 LIUKAI0815

related to https://github.com/NVIDIA/TensorRT-LLM/issues/1440

lkm2835 avatar Apr 30 '24 12:04 lkm2835

As lkm2835 noted, this happens because there is not enough CPU memory to hold the model, so some weights are offloaded to the meta device. Try a machine with more CPU memory.
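For context, the failure mode is easy to reproduce in isolation: a tensor on PyTorch's "meta" device carries only shape and dtype metadata with no backing storage, so materializing it (as `get_tllm_linear_weight` does with `v.cpu()`) raises exactly this error. A minimal sketch, independent of TensorRT-LLM:

```python
import torch

# A "meta" tensor stands in for a weight that accelerate offloaded
# because CPU RAM ran out: it has a shape and dtype but no data.
qkv_w = torch.empty(3, 4, device="meta")

try:
    qkv_w.cpu()  # the same copy that fails inside the converter
except NotImplementedError as err:
    print(f"reproduced: {err}")
```

So the traceback is a symptom, not the root cause: by the time `convert_checkpoint.py` touches the weights, they were never fully loaded into RAM in the first place.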

byshiue avatar May 10 '24 08:05 byshiue

Thank you. It's been solved

LIUKAI0815 avatar May 14 '24 03:05 LIUKAI0815

@LIUKAI0815 How did you solve this problem?

AntonThai2022 avatar Jun 15 '24 05:06 AntonThai2022