How model_seqlen affects quantization quality

Open VirtualRoyalty opened this issue 1 year ago • 2 comments

Hi! Thanks for such a useful tool! I have a question about model_seqlen:

As far as I can see, the default value in main.py is 4096. What if I use a smaller value, e.g. 1024, when quantizing the MoE Mixtral model? Will it affect the quality of the quantized model, or its quality on contexts longer than 1024 tokens? Will it significantly speed up the quantization process?

Thanks in advance!

    parser.add_argument(
        "--model_seqlen",
        type=int,
        default=4096,
        help="Model seqlen and calibration data context length.",
    )

VirtualRoyalty avatar Mar 10 '24 13:03 VirtualRoyalty

Hi! It is recommended to use the sequence length (seq_len) that the model you're quantizing was trained on (4096 for Llama-2, 8192 for Mistral/Mixtral). To speed up computation, reduce the number of calibration samples by decreasing --nsamples instead. However, even that doesn't have a large impact on the quantization time.
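
To make the trade-off concrete, here is a minimal sketch of the calibration token budget, assuming the calibration set consists of `nsamples` sequences of exactly `model_seqlen` tokens each. The helper name `calibration_tokens` and the sample counts are hypothetical and chosen only for illustration; they are not taken from main.py.

```python
def calibration_tokens(nsamples: int, model_seqlen: int) -> int:
    """Total tokens in a calibration set of fixed-length sequences (assumed layout)."""
    return nsamples * model_seqlen


# Hypothetical settings, chosen only for illustration.
full = calibration_tokens(nsamples=1024, model_seqlen=8192)
fewer_samples = calibration_tokens(nsamples=256, model_seqlen=8192)
shorter_context = calibration_tokens(nsamples=1024, model_seqlen=2048)

# Both reductions cut the token budget by the same factor of 4,
# but only `fewer_samples` keeps every sequence at the full 8192-token context.
assert fewer_samples == shorter_context == full // 4
```

Under this assumption, lowering --nsamples shrinks the amount of data processed without changing the context length the quantized model is calibrated on.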

BlackSamorez avatar Mar 10 '24 18:03 BlackSamorez

@BlackSamorez Thanks for the answer!

I am trying to quantize a fine-tuned version of Mixtral, and I had no samples that long (8192) in the training set.

Should I then decrease max_epochs and finetune_max_epochs instead (in order to speed up the process)?

VirtualRoyalty avatar Mar 10 '24 22:03 VirtualRoyalty

@VirtualRoyalty you may try and see how shorter sequences affect the quality. When I was tuning Mixtral, I used 7k instead of 8k to fit into memory, and this seemed to work fine. However, 1k is much shorter than 8k, so I cannot say a priori whether it matters much.
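
If the individual samples are shorter than the target length, one common workaround in calibration loaders is to concatenate tokenized documents into a single stream and cut fixed-length windows from it. The sketch below illustrates that idea in plain Python; it is an assumption about how such a loader might work, not a description of this repository's code, and `sample_windows` is a hypothetical helper.

```python
import random


def sample_windows(token_stream, nsamples, seqlen, seed=0):
    """Cut `nsamples` random contiguous windows of `seqlen` tokens from a stream."""
    if len(token_stream) < seqlen:
        raise ValueError("token stream is shorter than the requested seqlen")
    rng = random.Random(seed)
    windows = []
    for _ in range(nsamples):
        # randint is inclusive on both ends, so the last valid start is allowed.
        start = rng.randint(0, len(token_stream) - seqlen)
        windows.append(token_stream[start:start + seqlen])
    return windows


# Toy "documents" of 300 tokens each, standing in for short fine-tuning samples.
docs = [list(range(i * 1000, i * 1000 + 300)) for i in range(40)]
stream = [tok for doc in docs for tok in doc]  # 12000 tokens in total

windows = sample_windows(stream, nsamples=4, seqlen=8192)
```

Note that windows built this way span document boundaries, so whether they are a good stand-in for genuinely long inputs is something to check empirically, for example by comparing quality at a few sequence lengths.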

Godofnothing avatar Mar 14 '24 13:03 Godofnothing

@Godofnothing Thanks, good point!

VirtualRoyalty avatar Mar 14 '24 16:03 VirtualRoyalty

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] avatar Apr 14 '24 01:04 github-actions[bot]

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot] avatar Apr 29 '24 01:04 github-actions[bot]