
[EMNLP 2024 & AAAI 2026] A powerful toolkit for compressing large models including LLMs, VLMs, and video generative models.

47 LightCompress issues

The docs say a custom calibration dataset should be a txt file with one text per line:

```
calib:
  name: custom
  download: False
  load_from_txt: True
  path: # Custom dataset, ending with txt as suffix
  n_samples: 128
  bs: -1
  seq_len: 512
  preproc: random_truncate_txt
  seed: *seed
```

Running it throws an error: ...
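For reference, the documented format is plain text with one calibration sample per line. Below is a minimal sketch of writing such a file, plus a guess at what a `random_truncate_txt`-style preprocessor does; the `random_truncate` helper is an assumption for illustration, not llmc's actual implementation:

```python
import random

# Custom calibration file: plain txt, one calibration text per line.
samples = [
    "The quick brown fox jumps over the lazy dog.",
    "Quantization maps model weights onto a low-bit integer grid.",
]
with open("calib.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(samples))

def random_truncate(tokens, seq_len, rng):
    """Hypothetical random truncation: keep a random contiguous window
    of at most seq_len tokens from one calibration line."""
    if len(tokens) <= seq_len:
        return tokens
    start = rng.randrange(len(tokens) - seq_len + 1)
    return tokens[start:start + seq_len]

rng = random.Random(42)
short = random_truncate([1, 2, 3], 512, rng)           # shorter than seq_len: unchanged
window = random_truncate(list(range(1000)), 512, rng)  # random 512-token window
```

With `seq_len: 512` in the config above, each line longer than 512 tokens would be cut down to one such window per calibration sample.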

Very good work, but I have some questions. When I tried to run the code, I encountered the following error: [rank0]: Traceback (most recent call last): [rank0]: File...

Dear LLMC team, Thank you for adding support for KV cache quantization, I found it really useful! I'd like to share two observations that affect the correctness and efficiency of...

Dear LLMC team, I've been trying to run mixed-precision PTQ using RTN. I suspect there's a bug, as the **non-default settings in `mix_bits` are ignored**. My understanding of the...
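For context, the behaviour one would expect from a `mix_bits` override can be sketched as follows; the schema, pattern matching, and names here are illustrative assumptions, not llmc's actual config handling:

```python
import fnmatch

def resolve_bits(layer_names, default_bit, mix_bits):
    """Illustrative per-layer bit-width resolution: layers whose name
    matches a mix_bits pattern take the override; all others keep
    default_bit."""
    resolved = {}
    for name in layer_names:
        resolved[name] = default_bit
        for pattern, bit in mix_bits.items():
            if fnmatch.fnmatch(name, pattern):
                resolved[name] = bit  # non-default setting must win
    return resolved

layers = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.mlp.down_proj",
]
bits = resolve_bits(layers, 4, {"*down_proj": 8})
```

The bug report above amounts to saying that, in practice, every layer ends up with the default bit-width as if the override step never ran.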

bug

### Error log: [W416 15:37:25.363094293 socket.cpp:933] [c10d] The server socket on [::ffff:36.xxx.xxx.13]:40613 has timed out, will retry. [W416 15:39:40.531073896 socket.cpp:933] [c10d] The server socket on [::ffff:36.xxx.xxx.13]:40613 has timed out, will retry....

https://github.com/ModelTC/llmc/blob/main/llmc/compression/quantization/quant.py Line 629 reads `deficiency = self.group_size - tensor.shape[1] % self.group_size`. Shouldn't this use `tensor.shape[-1]`?
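For a 2-D weight matrix `shape[1]` and `shape[-1]` coincide, but for higher-rank tensors they do not, which is presumably the reporter's point: per-group quantization groups along the last (hidden) dimension. A quick sketch of the padding arithmetic in pure Python (not the llmc code itself):

```python
def pad_deficiency(shape, group_size):
    """Number of zero columns to append so the LAST dimension becomes a
    multiple of group_size (the outer `% group_size` makes an exact
    multiple pad 0 instead of a full group)."""
    return (group_size - shape[-1] % group_size) % group_size

# 2-D weight: indexing dim 1 or dim -1 gives the same answer.
d2 = pad_deficiency((4096, 70), 32)     # 70 -> pad 26 -> 96
# 3-D tensor (batch, seq, hidden): dim 1 would be seq_len, but the
# grouping runs over the hidden (last) dimension.
d3 = pad_deficiency((8, 512, 70), 32)   # also keyed off the last dim
```

So for any tensor with more than two dimensions, using `shape[1]` would compute the deficiency against the wrong axis.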

As far as I can tell, there are only four calibration datasets ('pileval', 'c4', 'wikitext2', and 'ptb'), and they only support text-modal LLMs. I would like to ask how to...

I'm encountering some issues saving/loading fake-quant models. I'm trying to save with the `save_fake` option and load the result again to check it. How can I load a fake-quant model in...
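As background, a fake-quant checkpoint stores weights as ordinary floats that have merely been rounded onto the quantization grid, so in principle it loads through the normal model-loading path. A minimal sketch of that round-to-grid step, assuming symmetric integer quantization (an illustration, not llmc's actual `save_fake` logic):

```python
def fake_quantize(w, bits, scale):
    """Round one float weight onto a symmetric signed int grid, then
    dequantize back to float; the result is a regular float weight that
    any standard checkpoint loader can read."""
    qmax = 2 ** (bits - 1) - 1
    q = max(-qmax - 1, min(qmax, round(w / scale)))  # clamp to int range
    return q * scale

weights = [0.37, -1.22, 0.051]
fq = [fake_quantize(w, 4, 0.1) for w in weights]  # floats snapped to the 0.1 grid
```

This is why fake-quant output measures accuracy but not speed: the stored tensors are still full-precision floats, just restricted to grid values.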

When I use the following configuration file:

```yaml
base:
  seed: &seed 42
model:
  type: DeepseekV3
  path: xxx
  tokenizer_mode: fast
  torch_dtype: torch.float8_e4m3fn
calib:
  name: pileval
  download: False
  path: xxx
  n_samples: 128
```
...

I'm encountering an Out-of-Memory (OOM) error when trying to run the second step (OmniQuant) of the combined AWQ + OmniQuant quantization method. This happens despite having allocated 2 A40 GPUs...