Quantization gets stuck
Reminder
- [X] I have read the README and searched the existing issues.
Reproduction
CUDA_VISIBLE_DEVICES=4 python src/export_model.py \
--model_name_or_path saves/export/Oral_calculation/1-grade/Qwen1.5-4B-Chat/SFT_2024-03-08 \
--template qwen \
--finetuning_type lora \
--export_dir saves/export/Oral_calculation/1-grade/Qwen1.5-4B-Chat/Qwen1.5-4B-Chat-4bit \
--export_size 8 \
--export_legacy_format False \
--export_quantization_bit 4 \
--export_quantization_dataset data/Oral_calculation_1_grade_quantifited.json
It always gets stuck here:
(llama_factory) chenghao@auc:~/workspace/LLaMA-Factory$ bash bash/quantization.bash
[2024-03-09 14:52:56,189] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[INFO|tokenization_utils_base.py:2044] 2024-03-09 14:52:58,258 >> loading file vocab.json
[INFO|tokenization_utils_base.py:2044] 2024-03-09 14:52:58,259 >> loading file merges.txt
[INFO|tokenization_utils_base.py:2044] 2024-03-09 14:52:58,259 >> loading file added_tokens.json
[INFO|tokenization_utils_base.py:2044] 2024-03-09 14:52:58,259 >> loading file special_tokens_map.json
[INFO|tokenization_utils_base.py:2044] 2024-03-09 14:52:58,259 >> loading file tokenizer_config.json
[INFO|tokenization_utils_base.py:2044] 2024-03-09 14:52:58,259 >> loading file tokenizer.json
[WARNING|logging.py:314] 2024-03-09 14:52:58,465 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
[INFO|configuration_utils.py:726] 2024-03-09 14:52:58,467 >> loading configuration file saves/export/Oral_calculation/1-grade/Qwen1.5-7B-Chat/SFT_2024-03-08/config.json
[INFO|configuration_utils.py:791] 2024-03-09 14:52:58,468 >> Model config Qwen2Config {
  "_name_or_path": "saves/export/Oral_calculation/1-grade/Qwen1.5-7B-Chat/SFT_2024-03-08",
  "architectures": [
    "Qwen2ForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 11008,
  "max_position_embeddings": 32768,
  "max_window_layers": 28,
  "model_type": "qwen2",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 32,
  "rms_norm_eps": 1e-06,
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.38.2",
  "use_cache": true,
  "use_sliding_window": false,
  "vocab_size": 151936
}
Generating train split: 28903 examples [00:00, 36505.64 examples/s]
Expected behavior
No response
System Info
No response
Others
No response
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1303860 chenghao 20 0 13.7g 879292 318120 R 99.7 0.2 31:08.67 python
1304172 chenghao 20 0 13.1g 836564 316708 R 99.7 0.2 27:35.24 python
I can see it is still running; I'd like to know roughly how long it should take.
Latest finding: with --export_quantization_dataset data/c4_demo.json it works, but with my own dataset, e.g. --export_quantization_dataset data/xxxx.json, it does not.
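A minimal sketch for narrowing this down: print the first record of the file that works (data/c4_demo.json) next to the first record of the custom file, then mirror whatever schema the working one uses. The script assumes both files parse as JSON arrays and checks nothing beyond that:

```python
import json

def first_record(path):
    """Load a calibration file and report its container type, size, and first record."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)  # if this raises, the file may be JSON lines instead
    return type(data).__name__, len(data), data[0]

# Compare the known-good file against the custom one field by field.
print(first_record("data/c4_demo.json"))
print(first_record("data/Oral_calculation_1_grade_quantifited.json"))
```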
It just gets stuck here. Is that normal?
> It just gets stuck here. Is that normal?

I'm stuck here too... Did you solve it?
> Latest finding: with --export_quantization_dataset data/c4_demo.json it works, but with my own dataset, e.g. --export_quantization_dataset data/xxxx.json, it does not.

Is there anything particular about the dataset format here?
> It just gets stuck here. Is that normal?

Not normal; it should run through.
> Latest finding: with --export_quantization_dataset data/c4_demo.json it works, but with my own dataset, e.g. --export_quantization_dataset data/xxxx.json, it does not. Is there anything particular about the dataset format here?

For now it seems fine to just use c4_demo; I don't know what the format requirements are.
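One hypothesis, not verified against the LLaMA-Factory source here: GPTQ calibration draws fixed-length token windows, so an exporter that keeps re-sampling until it finds a record at least export_quantization_maxlen tokens long would spin forever on a dataset made entirely of short records, matching the ~100% CPU with no progress shown above, while c4_demo's long web documents never trigger it. A quick check, where the "text" field and the 1024-token window are both assumptions:

```python
import json
from transformers import AutoTokenizer

MODEL = "saves/export/Oral_calculation/1-grade/Qwen1.5-4B-Chat/SFT_2024-03-08"
MAXLEN = 1024  # assumed calibration window; compare with export_quantization_maxlen

tokenizer = AutoTokenizer.from_pretrained(MODEL)
with open("data/Oral_calculation_1_grade_quantifited.json", encoding="utf-8") as f:
    records = json.load(f)

# Count records long enough to yield one full calibration window.
lengths = [len(tokenizer(r["text"])["input_ids"]) for r in records]
long_enough = sum(n >= MAXLEN for n in lengths)
print(f"{long_enough}/{len(lengths)} records reach {MAXLEN} tokens (max = {max(lengths)})")
# 0 here would mean a sampler that insists on full-length windows can never succeed.
```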
> > Latest finding: with --export_quantization_dataset data/c4_demo.json it works, but with my own dataset, e.g. --export_quantization_dataset data/xxxx.json, it does not. Is there anything particular about the dataset format here?
>
> For now it seems fine to just use c4_demo; I don't know what the format requirements are.

Have you tried the official AutoGPTQ script for quantization? I quantized Qwen1.5 with it and it did not get stuck, but the loss was nan.
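For reference, a minimal sketch of the standalone AutoGPTQ flow mentioned above, following the auto_gptq README usage; the paths and calibration texts are placeholders, and this is the generic library API rather than LLaMA-Factory's exporter:

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

MODEL = "saves/export/Oral_calculation/1-grade/Qwen1.5-4B-Chat/SFT_2024-03-08"

tokenizer = AutoTokenizer.from_pretrained(MODEL)
quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False)

# Calibration examples: a handful of tokenized plain-text passages.
texts = ["placeholder calibration passage one ...",
         "placeholder calibration passage two ..."]
examples = [tokenizer(t, return_tensors="pt") for t in texts]

model = AutoGPTQForCausalLM.from_pretrained(MODEL, quantize_config)
model.quantize(examples)  # a nan loss at this step can point to bad calibration data
model.save_quantized("Qwen1.5-4B-Chat-gptq-4bit")
```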
> > It just gets stuck here. Is that normal?
>
> I'm stuck here too... Did you solve it?

@Fred199683 @PangziZhang523 @Chenghao-Jia How can this be solved?
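Not a confirmed fix, but one workaround consistent with the c4_demo observation above: pack the short samples into long passages before exporting, so every calibration record is comfortably longer than the token window. A sketch that assumes the records carry a "text" field:

```python
import json

CHUNK_CHARS = 8000  # rough target so each packed record easily exceeds the token window

with open("data/Oral_calculation_1_grade_quantifited.json", encoding="utf-8") as f:
    records = json.load(f)

# Concatenate short records into long c4_demo-style passages.
packed, buf = [], ""
for r in records:
    buf += r["text"] + "\n"
    if len(buf) >= CHUNK_CHARS:
        packed.append({"text": buf})
        buf = ""
if buf:
    packed.append({"text": buf})

with open("data/Oral_calculation_1_grade_packed.json", "w", encoding="utf-8") as f:
    json.dump(packed, f, ensure_ascii=False)
print(f"packed {len(records)} short records into {len(packed)} long ones")
```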