
Questions about quantization

Open mxjmtxrm opened this issue 8 months ago • 0 comments

Hi, great work! I ran into some problems during 4-bit weight-only quantization (--lwc).

  1. Is it a problem if the norm becomes NaN during calibration?
  2. What are the best LWC hyper-parameters for LLaMA-2 at different model scales, e.g. lwc-lr and number of epochs?
  3. Does using more calibration data give better results?
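Regarding question 1: a NaN norm during calibration usually means the optimization has diverged, so it can help to abort early rather than waste the remaining epochs. A minimal guard might look like this (a hypothetical helper of my own, not part of OmniQuant):

```python
import math

def norm_is_healthy(step, norm):
    """Return False if the reported norm is NaN or Inf.

    A NaN/Inf norm typically means the LWC loss diverged; results from
    that run should not be trusted, so the caller can abort and retry
    with a smaller lwc-lr or fewer epochs.
    """
    if math.isnan(norm) or math.isinf(norm):
        print(f"step {step}: norm is {norm!r}; aborting this run")
        return False
    return True
```

Dropping such a check into the calibration loop makes silent divergence visible immediately instead of only showing up in the final perplexity.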

I quantized a LLaMA model with different LWC hyper-parameters and got very different results:

  1. nsamples=1000, batch_size=1, epochs=2: the ppl looks correct.
  2. nsamples=2000, batch_size=8, epochs=10: the ppl is extremely large (40000+). What could cause this?
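For context, perplexity is exp(mean negative log-likelihood per token), so a ppl of 40000+ maps back to a per-token NLL of about log(40000) ≈ 10.6 nats, versus roughly 1.8 for a healthy ppl near 6. That gap points to the calibration itself diverging rather than a small quality regression. A quick sketch of the conversion (my own helper, not from OmniQuant):

```python
import math

def nll_from_ppl(ppl):
    """Convert a perplexity back to mean negative log-likelihood (nats/token).

    Since ppl = exp(mean NLL), the inverse is simply the natural log.
    """
    return math.log(ppl)

# A diverged run (ppl 40000+) sits around 10.6 nats/token,
# while a healthy 4-bit LLaMA ppl near 6 is about 1.8 nats/token.
```

Comparing runs in NLL rather than raw ppl makes it easier to see how far off the diverged configuration really is.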

mxjmtxrm avatar Jun 10 '24 08:06 mxjmtxrm