
[EMNLP 2024 & AAAI 2026] A powerful toolkit for compressing large models including LLMs, VLMs, and video generative models.

47 LightCompress issues

It is `int_quant`/`float_quant` in these two methods, but `int-quant`/`float-quant` in the samples and other methods. Should these be aligned, or is the difference intentional?

Hi, there are some older and newer datatypes like mx6/mx9/mxfp4 and nvfp4_e2m1. Recently NVIDIA has supported quantization of the DeepSeek-R1 model: https://huggingface.co/nvidia/DeepSeek-R1-FP4 Can llmc support such new datatypes in the...

I'm trying to test this with RTN, but when running eval for the transformed model I get the following datatype error. This is my config file:

```yaml
base:
  seed: &seed 42
model:
  type: Llama...
```

Hello, LLMC team! I was glad to discover that you added support for the `HumanEval` benchmark. However, when trying to produce sanity results for unquantized models, they were significantly lower than...

I obtained an FP8 Llama3 using llmc's awq_fp8.yml. How can I pass it in as a dense model and continue with int8 quantization? I currently tried modifying the model path directly, but the PPL test results are unreasonable. ![Image](https://github.com/user-attachments/assets/05b6889a-5719-4618-ab70-48b549809e8d)
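A minimal sketch of the two-stage flow this question describes, based only on the config fields visible in the snippets on this page. The `save` section and all field names not shown in the snippets (`save_trans`, `save_path`, the `quant` block, the output directory) are assumptions for illustration, not verified against the repo:

```yaml
# Stage 1 (awq_fp8.yml): save the transformed model as a dense checkpoint.
# `save_trans` / `save_path` are assumed option names.
save:
  save_trans: True
  save_path: ./llama3_fp8_trans

# Stage 2: a separate int8 config whose model path points at the stage-1 output.
base:
  seed: &seed 42
model:
  type: Llama
  path: ./llama3_fp8_trans      # assumed: directory written by stage 1
  torch_dtype: auto
quant:                           # field names in this block are assumptions
  method: RTN
  weight:
    bit: 8
    symmetric: True
```

If the PPL after the second pass is still unreasonable, one thing to check is whether stage 1 actually exported a dense (de-quantized/transformed) checkpoint rather than packed FP8 weights, since only the former can be re-quantized as an ordinary model.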

Dear LLMC team, thank you for your continuous effort in maintaining this useful repo. I've been using it for a long time, and would like to draw your attention to...

What could be causing an OOM during the replace-model stage? The model is Qwen32B, running on 7 A6000 GPUs. nvidia-smi shows all 7 cards active, but the model seems to be placed only on gpu0. ![Image](https://github.com/user-attachments/assets/b59fad31-8890-43ea-869f-f809f7985c66) Config file:

```yaml
base:
  seed: &seed 42
model:
  type: Qwen2
  path: ./DeepSeek-R1-Distill-Qwen-32B
  tokenizer_mode: slow
  torch_dtype: auto
# calib:
#   name: pileval
#   download: True
#   path: ./LLMCompress/data...
```

Hi, thank you for your great work on this repository! I am currently exploring the use of AWQ for model quantization and was wondering if the repository supports evaluating models...

**When I perform the multi-GPU evaluation for DeepSeekv3, the following error message is displayed:** 2025-02-21 16:29:40.579 | INFO | llmc.eval.eval_base:__init__:21 - eval_cfg : {'eval_pos': ['pretrain', 'transformed', 'fake_quant'], 'name': 'wikitext2', 'download':...

Have you evaluated this algorithm, and is there any plan to support it?