
12/24/2023 09:04:10 - INFO - llmtuner.tuner.core.loader - trainable params: 0 || all params: 7069016064 || trainable%: 0.0000 Killed

1Jenifer opened this issue 6 months ago • 0 comments (status: Open)

Reminder

  • [X] I have read the README and searched the existing issues.

Reproduction

***** train metrics *****
  epoch                    =        3.0
  train_loss               =     1.6394
  train_runtime            = 3:33:32.94
  train_samples_per_second =      6.842
  train_steps_per_second   =      0.428
[INFO|trainer.py:2889] 2023-12-24 04:44:36,261 >> Saving model checkpoint to boolmz_translation_model
[INFO|tokenization_utils_base.py:2432] 2023-12-24 04:44:36,455 >> tokenizer config file saved in boolmz_translation_model/tokenizer_config.json
[INFO|tokenization_utils_base.py:2441] 2023-12-24 04:44:36,455 >> Special tokens file saved in boolmz_translation_model/special_tokens_map.json
Figure saved: boolmz_translation_model/training_loss.png
12/24/2023 04:44:36 - WARNING - llmtuner.extras.ploting - No metric eval_loss to plot.

python src/export_model.py \
    --model_name_or_path bigscience/bloomz-7b1 \
    --template default \
    --finetuning_type lora \
    --checkpoint_dir boolmz_translation_model \
    --export_dir bloomz_wmt

12/24/2023 09:02:24 - INFO - llmtuner.tuner.core.adapter - Fine-tuning method: LoRA
12/24/2023 09:04:10 - INFO - llmtuner.tuner.core.adapter - Merged 1 model checkpoint(s).
12/24/2023 09:04:10 - INFO - llmtuner.tuner.core.adapter - Loaded fine-tuned model from checkpoint(s): boolmz_translation_model
12/24/2023 09:04:10 - INFO - llmtuner.tuner.core.loader - trainable params: 0 || all params: 7069016064 || trainable%: 0.0000
Killed

Expected behavior

The instruction-tuning step completed normally, but when exporting (saving) the merged model, the process died right after `12/24/2023 09:04:10 - INFO - llmtuner.tuner.core.loader - trainable params: 0 || all params: 7069016064 || trainable%: 0.0000` with `Killed`. Changing the data type to bf16 did not help.
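A bare `Killed` with no Python traceback usually means the Linux OOM killer terminated the process, which points at insufficient host RAM while the full merged model is materialized in memory. As a rough back-of-envelope check (the parameter count 7,069,016,064 comes from the log above; the byte sizes per dtype are standard, and overhead from temporaries and the Python process itself comes on top):

```python
# Lower-bound estimate of host RAM needed just to hold all model weights
# in memory during a LoRA merge/export, by weight dtype.
PARAMS = 7_069_016_064  # parameter count reported in the loader log above

def merge_ram_gib(params: int, bytes_per_param: int) -> float:
    """GiB required to hold `params` weights at `bytes_per_param` each."""
    return params * bytes_per_param / 2**30

fp32 = merge_ram_gib(PARAMS, 4)  # float32: 4 bytes per weight
bf16 = merge_ram_gib(PARAMS, 2)  # bfloat16: 2 bytes per weight
print(f"fp32: {fp32:.1f} GiB, bf16: {bf16:.1f} GiB")
# fp32: 26.3 GiB, bf16: 13.2 GiB
```

So even in bf16 the export needs well over 13 GiB of free RAM for the weights alone; comparing these numbers against the machine's available memory (e.g. `free -g`) is a quick way to confirm or rule out the OOM-killer hypothesis.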

System Info

  • transformers version: 4.36.2
  • Platform: Linux-5.4.0-136-generic-x86_64-with-glibc2.17
  • Python version: 3.8.10
  • Huggingface_hub version: 0.20.1
  • Safetensors version: 0.4.1
  • Accelerate version: 0.25.0
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.0.0+cu118 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

Others

No response

1Jenifer · Dec 24 '23 06:12