@ming-wei Thanks. I will try it
@ming-wei I have synced to the commit "[Fix mistral v0.1 build instructions (#1373)]", and now it fails during conversion with this error:

```
python ../llama/convert_checkpoint.py --model_dir /mnt/memory/Meta-Llama-3-70B-Instruct --output_dir /app/models/tmp/trt_models/Meta-Llama-3-70B-Instruct/w4a16/1-gpu-tp --dtype float16 --use_weight_only --weight_only_precision...
```
@byshiue I had tried 8B before and got the same error. I noticed that you are using the newer version 0.11.0.dev2024052100; I will try that version.
@byshiue I synced the code to [Update TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM/commit/5d8ca2faf74c494f220c8f71130340b513eea9a9) ([#1639](https://github.com/NVIDIA/TensorRT-LLM/pull/1639)) and still get the same error. It fails while checking that the loaded model contains quantized params such as transformer.layers.0.attention.qkv.per_channel_scale:

```
python ../llama/convert_checkpoint.py --model_dir...
```
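As a minimal sketch of how one might check whether the converted checkpoint actually contains those quantized tensors (assuming the output directory from the command above and the usual rank0.safetensors file name; both paths are placeholders, not from the original report):

```python
# Hypothetical debugging sketch: list quantization-related tensors in a
# converted TensorRT-LLM checkpoint to see whether weight-only params
# (e.g. *.per_channel_scale) were actually written. Paths are placeholders.
from safetensors import safe_open

ckpt = "/app/models/tmp/trt_models/Meta-Llama-3-70B-Instruct/w4a16/1-gpu-tp/rank0.safetensors"

with safe_open(ckpt, framework="pt") as f:
    quant_keys = [k for k in f.keys() if "per_channel_scale" in k]

print(f"{len(quant_keys)} per_channel_scale tensors found")
for k in quant_keys[:5]:
    print(k)
```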
I also tried in a clean Docker environment and got the same error.
It works now. Thank you very much.
Thanks @byshiue for the response. Will it be supported at some point in the future?
@byshiue is there an expected date for this support?
Hi @nv-guomingz, I still get a similar error:

```
set -ex
export MODEL_DIR=/models
export MODEL_NAME=Mixtral-8x7B-Instruct-v0.1
export QUANTIZE=int4_awq
export DTYPE=float16
export TORCH_CUDA_ARCH_LIST="8.0"

python3 ../quantization/quantize.py \
    --model_dir $MODEL_DIR/${MODEL_NAME} \
    --output_dir $MODEL_DIR/tmp/trt_models/${MODEL_NAME}/$QUANTIZE/1-gpu \
    ...
```
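For anyone comparing setups, a small sketch of how one could inspect what the quantized checkpoint recorded, assuming quantize.py finished writing the output directory used above (the path and field names are my assumptions, not confirmed by the original report):

```python
# Hypothetical check: read the config.json emitted alongside the quantized
# checkpoint and print the quantization section, to confirm which quant_algo
# (if any) was recorded. The output path mirrors the command above.
import json
from pathlib import Path

out_dir = Path("/models/tmp/trt_models/Mixtral-8x7B-Instruct-v0.1/int4_awq/1-gpu")
cfg = json.loads((out_dir / "config.json").read_text())

print(cfg.get("quantization", {}))              # quantization settings, if present
print(cfg.get("architecture"), cfg.get("dtype"))  # basic model metadata
```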
Hi @nv-guomingz, is there any update on this issue?