
Results: 10 issues by hoshi-hiyouga

Dear @younesbelkada @pacman100, I got a RuntimeError when training a model with accelerate with gradient checkpointing enabled:

```sh
accelerate launch main.py --args
```

```python
model.enable_input_require_grads()
model.gradient_checkpointing_enable()
```

>...
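The usual cause of this RuntimeError ("element 0 of tensors does not require grad") is that with frozen embeddings, the checkpointed segment receives inputs that do not require grad, so nothing flows backward through it. A minimal sketch of why `enable_input_require_grads()` helps, using plain `torch.utils.checkpoint` on a hypothetical toy model (not the actual training code from the issue):

```python
import torch
from torch.utils.checkpoint import checkpoint

emb = torch.nn.Embedding(10, 4)
emb.weight.requires_grad_(False)      # frozen embeddings, as in PEFT/LoRA setups
layer = torch.nn.Linear(4, 4)         # trainable layer, wrapped in checkpointing

ids = torch.tensor([[1, 2, 3]])
h = emb(ids)                          # requires_grad=False: checkpoint would break backward

# Mirror what enable_input_require_grads() does: force the hidden
# states entering the checkpointed segment to carry gradients.
h.requires_grad_(True)

out = checkpoint(layer, h, use_reentrant=True)
out.sum().backward()                  # without requires_grad_(True) above, this raises
print(layer.weight.grad is not None)  # True
```

In transformers, `model.enable_input_require_grads()` registers a forward hook on the input embeddings that does this `requires_grad_(True)` automatically, which is why calling it before `gradient_checkpointing_enable()` fixes the crash.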

SFT and RLHF pipelines supporting instruction datasets such as Alpaca: https://github.com/hiyouga/LLaMA-Efficient-Tuning LoRA fine-tuning runs on a single 3090 GPU, and the QLoRA method is also supported (12 GB VRAM minimum). LoRA weights of the fine-tuned model: https://huggingface.co/hiyouga/baichuan-7b-sft Run the following command to perform instruction-tuning on the Alpaca dataset:

```bash
CUDA_VISIBLE_DEVICES=0 python src/train_sft.py \
    --model_name_or_path path_to_baichuan-7B_folder_or_huggingface_id \
    --do_train \
    --dataset alpaca_gpt4_zh \
    --finetuning_type lora \
    --lora_rank 8 \
    --lora_target W_pack...
```
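The low-VRAM figure follows from the fact that a rank-r LoRA adapter only trains r·(d_in + d_out) parameters per target matrix instead of d_in·d_out. A back-of-the-envelope sketch (the layer count and hidden size below are assumed values for a 7B-class model, not numbers from the issue; `W_pack` is Baichuan's fused QKV projection of shape hidden × 3·hidden):

```python
# Assumed dims for a 7B-class model: 32 layers, hidden size 4096.
hidden, layers, rank = 4096, 32, 8

full = layers * hidden * 3 * hidden           # trainable params if tuning W_pack fully
lora = layers * rank * (hidden + 3 * hidden)  # A: (r, d_in) plus B: (d_out, r)

print(f"full W_pack params: {full:,}")        # 1,610,612,736
print(f"LoRA rank-8 params: {lora:,}")        # 4,194,304
print(f"reduction: {full // lora}x")          # 384x
```

With only ~4M trainable parameters, optimizer state is negligible and the budget is dominated by the (optionally 4-bit quantized) frozen base weights, which is what makes the single-3090 setup feasible.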

Hello, thank you very much for such excellent work. We have conducted some experiments using Llama-Factory, and the results indicate that Galore can significantly reduce memory usage during full parameter...

Hi, we have performed a small experiment on fine-tuning the Llama-2-70B-AQLM-2Bit model using the PEFT QLoRA method. We utilized the Alpaca and Glaive datasets for instruction tuning, and the fine-tuned...

https://huggingface.co/docs/trl/main/en/dpo_trainer#downsides-to-merging-qlora-before-dpo-approach-2 https://github.com/jondurbin/qlora/blob/main/qmerge.py
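The first link discusses the downsides of merging a QLoRA adapter into the base model before DPO. The merge itself is just folding W' = W + (alpha/r)·B·A into the base weight. A tiny pure-Python sketch of that update on toy 2×2 matrices (not the linked qmerge.py code, which additionally has to dequantize the 4-bit base weights first):

```python
def matmul(a, b):
    # naive dense matmul, sufficient for the sketch
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def merge_lora(w, lora_a, lora_b, alpha, r):
    # W' = W + (alpha / r) * B @ A  -- the standard LoRA merge
    scale = alpha / r
    delta = matmul(lora_b, lora_a)
    return [[w[i][j] + scale * delta[i][j]
             for j in range(len(w[0]))] for i in range(len(w))]

# rank-1 toy example: W is 2x2, B is 2x1, A is 1x2
w      = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[1.0], [2.0]]
lora_a = [[0.5, 0.5]]
merged = merge_lora(w, lora_a, lora_b, alpha=2, r=1)
print(merged)  # [[2.0, 1.0], [2.0, 3.0]]
```

The QLoRA caveat from the TRL docs is that this fold happens against the *dequantized* base, so re-quantizing the merged weight introduces rounding error that a merge against a full-precision base would not.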

pending

```
loss 3.28 = 3.28 + 0.0  avg prob of [Rishi Sunak] 0.0498
loss  nan =  nan + nan  avg prob of [Rishi Sunak]  nan
loss  nan =  nan +...
```
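Once one term goes NaN, the sum is NaN at every subsequent step, so the useful diagnostic is the *first* step where it appears. A small stdlib sketch of a guard that catches this (the loss values below are illustrative, not taken from the run above):

```python
import math

def first_nan_step(losses):
    """Return the index of the first non-finite loss, or None if all are finite."""
    for step, loss in enumerate(losses):
        if not math.isfinite(loss):
            return step
    return None

# Illustrative trace: finite losses, then a NaN that propagates.
trace = [3.28, 3.10, float("nan"), float("nan")]
step = first_nan_step(trace)
print(f"training diverged at step {step}")  # training diverged at step 2
```

Checking for `math.isfinite` rather than `math.isnan` also catches inf losses, which typically precede NaN when the cause is fp16 overflow.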

solved

Get Yuuka-chan closer to me Before: ![Q20231130174536](https://github.com/snowmeow2/Blue-arXiv-Theme/assets/16256802/cf30cedf-1d33-40ca-9faf-5ac6193482c3) After: ![20231130110423](https://github.com/snowmeow2/Blue-arXiv-Theme/assets/16256802/0b4af633-2785-43c7-a556-b4ec23b3334b)

Numerous improvements have been made to the fine-tuning tools, rendering the contents of the two files obsolete. Consequently, I have revised these files to ensure their content remains fresh and...