Quantization gets stuck
Reminder
- [X] I have read the README and searched the existing issues.
Reproduction
CUDA_VISIBLE_DEVICES=4 python src/export_model.py \
--model_name_or_path saves/export/Oral_calculation/1-grade/Qwen1.5-4B-Chat/SFT_2024-03-08 \
--template qwen \
--finetuning_type lora \
--export_dir saves/export/Oral_calculation/1-grade/Qwen1.5-4B-Chat/Qwen1.5-4B-Chat-4bit \
--export_size 8 \
--export_legacy_format False \
--export_quantization_bit 4 \
--export_quantization_dataset data/Oral_calculation_1_grade_quantifited.json
It always gets stuck here:
(llama_factory) chenghao@auc:~/workspace/LLaMA-Factory$ bash bash/quantization.bash
[2024-03-09 14:52:56,189] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[INFO|tokenization_utils_base.py:2044] 2024-03-09 14:52:58,258 >> loading file vocab.json
[INFO|tokenization_utils_base.py:2044] 2024-03-09 14:52:58,259 >> loading file merges.txt
[INFO|tokenization_utils_base.py:2044] 2024-03-09 14:52:58,259 >> loading file added_tokens.json
[INFO|tokenization_utils_base.py:2044] 2024-03-09 14:52:58,259 >> loading file special_tokens_map.json
[INFO|tokenization_utils_base.py:2044] 2024-03-09 14:52:58,259 >> loading file tokenizer_config.json
[INFO|tokenization_utils_base.py:2044] 2024-03-09 14:52:58,259 >> loading file tokenizer.json
[WARNING|logging.py:314] 2024-03-09 14:52:58,465 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
[INFO|configuration_utils.py:726] 2024-03-09 14:52:58,467 >> loading configuration file saves/export/Oral_calculation/1-grade/Qwen1.5-7B-Chat/SFT_2024-03-08/config.json
[INFO|configuration_utils.py:791] 2024-03-09 14:52:58,468 >> Model config Qwen2Config {
  "_name_or_path": "saves/export/Oral_calculation/1-grade/Qwen1.5-7B-Chat/SFT_2024-03-08",
  "architectures": [
    "Qwen2ForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 11008,
  "max_position_embeddings": 32768,
  "max_window_layers": 28,
  "model_type": "qwen2",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 32,
  "rms_norm_eps": 1e-06,
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.38.2",
  "use_cache": true,
  "use_sliding_window": false,
  "vocab_size": 151936
}
Generating train split: 28903 examples [00:00, 36505.64 examples/s]
Expected behavior
No response
System Info
No response
Others
No response
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1303860 chenghao 20 0 13.7g 879292 318120 R 99.7 0.2 31:08.67 python
1304172 chenghao 20 0 13.1g 836564 316708 R 99.7 0.2 27:35.24 python
I can see it is still running; I'd like to know roughly how long it should take.
Latest finding: with --export_quantization_dataset data/c4_demo.json it works, but with my own dataset, e.g. --export_quantization_dataset data/xxxx.json, it does not.
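A minimal sketch for narrowing this down: print the first record of the file that works (data/c4_demo.json) next to the first record of the custom file, then mirror whatever schema the working one uses. The script assumes both files parse as JSON arrays and checks nothing beyond that:

```python
import json

def first_record(path):
    """Load a calibration file and report its container type, size, and first record."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)  # if this raises, the file may be JSON lines instead
    return type(data).__name__, len(data), data[0]

# Compare the known-good file against the custom one field by field.
print(first_record("data/c4_demo.json"))
print(first_record("data/Oral_calculation_1_grade_quantifited.json"))
```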
It just gets stuck here. Is that normal?
> It just gets stuck here. Is that normal?

I'm stuck here too... Did you solve it?
> Latest finding: with --export_quantization_dataset data/c4_demo.json it works, but with my own dataset, e.g. --export_quantization_dataset data/xxxx.json, it does not.

Is there anything particular about the dataset format here?
> It just gets stuck here. Is that normal?

Not normal; it should run through.
> Latest finding: with --export_quantization_dataset data/c4_demo.json it works, but with my own dataset, e.g. --export_quantization_dataset data/xxxx.json, it does not. Is there anything particular about the dataset format here?

For now it seems fine to just use c4_demo; I don't know what the format requirements are.
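One hypothesis, not verified against the LLaMA-Factory source here: GPTQ calibration draws fixed-length token windows, so an exporter that keeps re-sampling until it finds a record at least export_quantization_maxlen tokens long would spin forever on a dataset made entirely of short records, matching the ~100% CPU with no progress shown above, while c4_demo's long web documents never trigger it. A quick check, where the "text" field and the 1024-token window are both assumptions:

```python
import json
from transformers import AutoTokenizer

MODEL = "saves/export/Oral_calculation/1-grade/Qwen1.5-4B-Chat/SFT_2024-03-08"
MAXLEN = 1024  # assumed calibration window; compare with export_quantization_maxlen

tokenizer = AutoTokenizer.from_pretrained(MODEL)
with open("data/Oral_calculation_1_grade_quantifited.json", encoding="utf-8") as f:
    records = json.load(f)

# Count records long enough to yield one full calibration window.
lengths = [len(tokenizer(r["text"])["input_ids"]) for r in records]
long_enough = sum(n >= MAXLEN for n in lengths)
print(f"{long_enough}/{len(lengths)} records reach {MAXLEN} tokens (max = {max(lengths)})")
# 0 here would mean a sampler that insists on full-length windows can never succeed.
```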
> > Latest finding: with --export_quantization_dataset data/c4_demo.json it works, but with my own dataset, e.g. --export_quantization_dataset data/xxxx.json, it does not. Is there anything particular about the dataset format here?
>
> For now it seems fine to just use c4_demo; I don't know what the format requirements are.

Have you tried the official AutoGPTQ script for quantization? I quantized Qwen1.5 with it and it did not get stuck, but the loss was nan.
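For reference, a minimal sketch of the standalone AutoGPTQ flow mentioned above, following the auto_gptq README usage; the paths and calibration texts are placeholders, and this is the generic library API rather than LLaMA-Factory's exporter:

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

MODEL = "saves/export/Oral_calculation/1-grade/Qwen1.5-4B-Chat/SFT_2024-03-08"

tokenizer = AutoTokenizer.from_pretrained(MODEL)
quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False)

# Calibration examples: a handful of tokenized plain-text passages.
texts = ["placeholder calibration passage one ...",
         "placeholder calibration passage two ..."]
examples = [tokenizer(t, return_tensors="pt") for t in texts]

model = AutoGPTQForCausalLM.from_pretrained(MODEL, quantize_config)
model.quantize(examples)  # a nan loss at this step can point to bad calibration data
model.save_quantized("Qwen1.5-4B-Chat-gptq-4bit")
```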
> > It just gets stuck here. Is that normal?
>
> I'm stuck here too... Did you solve it?

@Fred199683 @PangziZhang523 @Chenghao-Jia How can this be solved?
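Not a confirmed fix, but one workaround consistent with the c4_demo observation above: pack the short samples into long passages before exporting, so every calibration record is comfortably longer than the token window. A sketch that assumes the records carry a "text" field:

```python
import json

CHUNK_CHARS = 8000  # rough target so each packed record easily exceeds the token window

with open("data/Oral_calculation_1_grade_quantifited.json", encoding="utf-8") as f:
    records = json.load(f)

# Concatenate short records into long c4_demo-style passages.
packed, buf = [], ""
for r in records:
    buf += r["text"] + "\n"
    if len(buf) >= CHUNK_CHARS:
        packed.append({"text": buf})
        buf = ""
if buf:
    packed.append({"text": buf})

with open("data/Oral_calculation_1_grade_packed.json", "w", encoding="utf-8") as f:
    json.dump(packed, f, ensure_ascii=False)
print(f"packed {len(records)} short records into {len(packed)} long ones")
```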