
How long does quantizing a 70B model take? Mine has been running for 2 days

Open xiechengmude opened this issue 11 months ago • 2 comments

Is it normal for quantizing a model to take this long?

xiechengmude avatar Mar 04 '24 05:03 xiechengmude

python main.py $MODEL_PATH $DATASET_PATH --nsamples=1024 \
  --num_codebooks=1 --nbits_per_codebook=16 --in_group_size=8 \
  --relative_mse_tolerance=0.01 --finetune_relative_mse_tolerance=0.001 \
  --finetune_batch_size=32 --local_batch_size=1 --offload_activations \
  --wandb --save $SAVE_PATH

xiechengmude avatar Mar 04 '24 05:03 xiechengmude

Hello! Thank you for your interest in the project. Yes, AQLM quantization indeed takes considerably longer to calibrate than simpler quantization methods such as GPTQ. This only affects quantization time, not inference time. Quantization time depends on your model size, your hardware (number and model of GPUs, etc.), and the quantization parameters. I added more details on quantization time to the README. Hope this helps. If you have any additional questions, please feel free to ask.

Vahe1994 avatar Mar 04 '24 10:03 Vahe1994

Could you share an example script for quantizing a 70B model on 8×A100?

xiechengmude avatar Mar 24 '24 08:03 xiechengmude

Hi! Hope this helps:

WANDB_PROJECT="wandb_project" WANDB_NAME="wandb_name" HF_HOME="/mnt/LLM" \
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 OMP_NUM_THREADS=16 MKL_NUM_THREADS=16 \
python main.py meta-llama/Llama-2-70b-hf "pajama" \
  --relative_mse_tolerance=0.01 --finetune_relative_mse_tolerance=0.001 \
  --nsamples=2048 --num_codebooks=1 --nbits_per_codebook=16 --in_group_size=8 \
  --finetune_batch_size=32 --local_batch_size=2 --wandb --save="path_to_save"
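As a side note on these flags: the effective weight bitrate they imply can be sketched with a small helper (illustrative only, not part of the repo; names mirror the CLI flags, and codebook/scale storage overhead is ignored):

```python
def bits_per_weight(num_codebooks: int, nbits_per_codebook: int, in_group_size: int) -> float:
    """Approximate encoded bits per weight for an AQLM config.

    Each group of `in_group_size` weights is represented by one code per
    codebook, and each code costs `nbits_per_codebook` bits.
    """
    return num_codebooks * nbits_per_codebook / in_group_size

# The recipe above: 1 codebook x 16 bits over groups of 8 weights
print(bits_per_weight(1, 16, 8))  # -> 2.0 bits per weight
```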

Vahe1994 avatar Mar 24 '24 15:03 Vahe1994

If you want to further improve perplexity, you can additionally run global fine-tuning after obtaining the quantized model. See https://github.com/Vahe1994/AQLM/pull/50 for the code and https://github.com/Vahe1994/AQLM/issues/49 for an example of how to run it.

Vahe1994 avatar Mar 24 '24 15:03 Vahe1994

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] avatar Apr 24 '24 01:04 github-actions[bot]

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot] avatar May 09 '24 01:05 github-actions[bot]