LightCompress

Does SmoothQuant support W8A8 (activation per tensor static quant) for DeepSeek-R1?

Open taishan1994 opened this issue 9 months ago • 2 comments

When I use the following configuration file:

base:
    seed: &seed 42
model:
    type: DeepseekV3
    path: xxx
    tokenizer_mode: fast
    torch_dtype: torch.float8_e4m3fn
calib:
    name: pileval
    download: False
    path: xxx
    n_samples: 128
    bs: 1
    seq_len: 512
    preproc: txt_general_preproc
    seed: *seed
quant:
    method: SmoothQuant
    weight:
        bit: 8
        symmetric: True
        granularity: per_channel
    act:
        bit: 8
        symmetric: True
        granularity: per_tensor
        static: True
    special:
        alpha: 0.8
save:
    save_vllm: True
    save_path: xxx

An error occurred: BaseBlockwiseQuantization.update_input_feat() missing 1 required positional argument: 'is_gqa'

taishan1994, Mar 24 '25 08:03

This doesn't look like an issue with the particular model, but rather a bug in the code: the SmoothQuant code wasn't updated after update_input_feat() was changed to require an is_gqa argument. As far as I understand, V3 does not use GQA, so as a workaround I would advise passing is_gqa=False in the call to update_input_feat() in smoothquant.py. Hope it helps.
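
For anyone hitting the same error, here is a minimal sketch of the failure and of the workaround. The class and method names mirror the error message, but the argument list (`scale`, `input_feat`), the `transform_old`/`transform_patched` helpers, and the method bodies are assumptions made up for illustration, not the actual LightCompress code.

```python
# Hypothetical stand-ins: the real LightCompress signatures may differ.
class BaseBlockwiseQuantization:
    def update_input_feat(self, scale, input_feat, is_gqa):
        # Placeholder body: only the required `is_gqa` argument matters here.
        return {"scale": scale, "n_feat": len(input_feat), "is_gqa": is_gqa}


class SmoothQuant(BaseBlockwiseQuantization):
    def transform_old(self, scale, input_feat):
        # Call site written before `is_gqa` existed: raises TypeError.
        return self.update_input_feat(scale, input_feat)

    def transform_patched(self, scale, input_feat):
        # Workaround: pass is_gqa=False explicitly (assumes the model has no GQA).
        return self.update_input_feat(scale, input_feat, is_gqa=False)


sq = SmoothQuant()

old_error = None
try:
    sq.transform_old(2.0, [4.0, 8.0])
except TypeError as exc:
    old_error = str(exc)  # mentions the missing 'is_gqa' argument

patched = sq.transform_patched(2.0, [4.0, 8.0])
print(old_error)
print(patched)
```

Passing the flag by keyword keeps the call valid even if the parameter order changes later. If the model did use GQA, hard-coding `False` would presumably be wrong, so treat this as a local patch rather than a proper fix.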

[Note: I'm not an author in this repo, just an active user]

sasha-hailo, Mar 25 '25 11:03

ok, thanks

taishan1994, Mar 28 '25 08:03