
[EMNLP 2024 & AAAI 2026] A powerful toolkit for compressing large models including LLMs, VLMs, and video generative models.

Results: 47 LightCompress issues

Hello, I am currently working with the `vlm_datasets` calibration data class as outlined in the documentation [here](https://llmc-zhcn.readthedocs.io/en/latest/advanced/VLM_quant%26img-txt_dataset.html). I noticed that there seem to be some aspects of the implementation that...

Hi, thank you for your good work! Since fine-grained quantization has a significant impact on the results, would it be possible to support an algorithm similar to the two-stage quantization of...

Hi! I was not able to find instructions on how to inspect the quantization parameters once a model has been quantized, in terms of weight and activation scales and zero...
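For reference, a minimal sketch of one way to inspect such parameters, assuming the quantized checkpoint stores scales and zero points as named tensors; the checkpoint path and the `scale`/`zero` key substrings are assumptions, since actual names depend on the export backend used:

```python
# Sketch: dump quantization parameters from a saved checkpoint. The
# checkpoint path and the "scale"/"zero" key substrings are assumptions;
# the actual tensor names depend on the export backend.
import torch

state_dict = torch.load("quantized_model/pytorch_model.bin", map_location="cpu")

for name, tensor in state_dict.items():
    if "scale" in name or "zero" in name:
        print(f"{name}: shape={tuple(tensor.shape)}, dtype={tensor.dtype}")
```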

### backend
I'm trying to quantize `InternVL2-2B` with the `eval dataset` config set to `mme`, but hit a `LocalTokenNotFoundError`. It seems to try to download the dataset, even though the dataset is local....
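`LocalTokenNotFoundError` is raised by `huggingface_hub` when a Hub request needs a token, so one common workaround when the data is already on disk is to force offline mode so `datasets` never reaches the Hub. A sketch under that assumption (the dataset path is hypothetical):

```python
# Sketch: force offline mode so the locally stored MME data is used
# instead of triggering a Hub download (which raises
# LocalTokenNotFoundError without a token). The dataset path is hypothetical.
import os

os.environ["HF_HUB_OFFLINE"] = "1"       # block all huggingface_hub network calls
os.environ["HF_DATASETS_OFFLINE"] = "1"  # make `datasets` resolve locally only

from datasets import load_dataset

mme = load_dataset("/path/to/local/MME")  # load from a local directory
```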

试了一下在qwen2.5模型上用visionzip方法校准自定义数据集, 脚本运行后报错AttributeError: 'Qwen2_5_VLModel' object has no attribute 'layers' 看了一下代码,项目model里的qwen2.5VL没有def find_blocks,是继承qwen2VL的,而qwen2VL的find_blocks在接入language 时,继承的是qwen2的方法。 qwen2的find_blocks调用的是self.model.model.layers ,而 https://github.com/huggingface/transformers/blob/main/src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py 的代码中的模型结构不一样,self.model.model 是 class Qwen2_5_VLModel 的attribute,再往下还有一个 class Qwen2_5_VLTextModel ,这个 class 才有 layer。不知道是不是这个问题? 试图修改代码,在qwen2.5VL的代码中定义一个新的self.blocks = self.model.model.language_model.layers,但继续运行脚本还是会碰到其他embed_tokens等等一系列的AttributeError问题,是否是普遍情况呢?
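A sketch of the nesting difference described above, using the `transformers` classes directly (the model path is hypothetical):

```python
# Sketch of the nesting difference described in the issue, using the
# transformers classes directly. The model path is hypothetical.
from transformers import Qwen2_5_VLForConditionalGeneration

model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "/path/to/Qwen2.5-VL-7B-Instruct", torch_dtype="auto"
)

# Qwen2-VL-era code assumed the decoder lived at model.model.layers.
# In current Qwen2.5-VL, model.model is a Qwen2_5_VLModel whose
# language_model attribute (a Qwen2_5_VLTextModel) owns the layers:
blocks = model.model.language_model.layers
embed_tokens = model.model.language_model.embed_tokens
print(type(model.model).__name__, len(blocks))
```

If that layout holds, every helper that hard-codes `self.model.model.layers` (`find_blocks`, the `embed_tokens` lookup, and the rest) would need the same re-rooting, which would explain the cascade of AttributeErrors after patching only `self.blocks`.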

YAML config:
```yaml
base:
  seed: &seed 0
model:
  # type: Qwen2VL
  type: Qwen2_5VL
  path: /mnt/data/junjun.zhao/models/Qwen2.5-VL-7B-Instruct
  # path: /mnt/data/junjun.zhao/models/Qwen2.5-VL-3B-Instruct
  # tokenizer_mode: fast
  torch_dtype: torch.float32
calib:
  # name: wikitext2
  # download: True
  ...
```

```yaml
weight:
  bit: 8
  symmetric: True
  granularity: per_channel
  group_size: -1
  calib_algo: mse
act:
  bit: 8
  symmetric: True
  granularity: per_token
  calib_algo: minmax
special:
  actorder: True
  static_groups: False
  percdamp: 0.01
  blocksize: 128
  ...
```
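For reference, a generic fake-quant sketch of what this config requests — symmetric INT8 weights at `per_channel` granularity and symmetric INT8 activations at `per_token` granularity. This is standard quantization math, not LightCompress's internal code, and it ignores the GPTQ-specific `special` knobs (`actorder`, `percdamp`, `blocksize`):

```python
# Generic fake-quant sketch: symmetric INT8, per-channel for weights,
# per-token for activations. Not LightCompress's internal implementation.
import torch

def quantize_symmetric(x: torch.Tensor, dim: int, n_bits: int = 8):
    qmax = 2 ** (n_bits - 1) - 1                      # 127 for INT8
    scale = x.abs().amax(dim=dim, keepdim=True) / qmax
    scale = scale.clamp(min=1e-8)                     # avoid divide-by-zero
    q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax)
    return q * scale                                  # dequantize ("fake quant")

weight = torch.randn(4096, 4096)   # [out_channels, in_channels]
acts = torch.randn(16, 128, 4096)  # [batch, tokens, hidden]

w_q = quantize_symmetric(weight, dim=1)   # per_channel: one scale per output row
a_q = quantize_symmetric(acts, dim=-1)    # per_token: one scale per token vector
print((weight - w_q).abs().max(), (acts - a_q).abs().max())
```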

For a 1k calibration dataset with batchsize=16, the run takes >13 h. But llm-compressor AWQ calibration takes...

## Problem description
tools/quant_analysis.py has not been kept in sync with the latest code; imports such as BaseTokenizer can no longer be resolved.
## Code or command executed
```
python /llmc/tools/quant_analysis.py \
    --dataset_name wikitext2 \
    --data_path /data/wikitext2 \
    --model_type Qwen2VL \
    --model_path /qwenVL/qwen2.5-VL-3B-Instruct \
    --t_model_path /qwenVL/qwen2.5-VL-3B-Instruct-AWQ \
    --wbit 4 --abit 16 \
    --wgra...
```