Harryis Wang

Results 5 issues of Harryis Wang

Hi! I encounter an issue that when doing the Step3(SFT). The function "get_accelerate_model" in qlora_model.py sets the adapter_name="lora_default". This results in an error that the trainable parameters are set to...

Hi! I encounter a bug when doing the step3 (Principle Engraving). I used the self_align_merged.json which is created with "self_align_32shards_*.jsonl" and "vicuna_dummy_data.json" to finetune the base model. However, I find...

Hello,Thanks for your awesome work and code! However, I encountered some confusion while trying to understand how you generated TGRT Self Instruction. You mentioned in the article that you first...

### Required prerequisites - [X] I have read the documentation . - [X] I have searched the [Issue Tracker](https://github.com/PKU-Alignment/safe-rlhf/issues) and [Discussions](https://github.com/PKU-Alignment/safe-rlhf/discussions) that this hasn't already been reported. (+1 or comment...

question

Hello! Thanks for your awesome work! I meet an issue when I run dpo with qlora. I notice there is a setting: ``` if model_args.use_peft is True: ref_model = None...