Bug when loading a model for GRPO training without PEFT
Issue #1632 appears to no longer be fixed. If I run the Qwen2.5_(3B)-GRPO.ipynb notebook and comment out
```python
# model = FastLanguageModel.get_peft_model(
#     model,
#     r = lora_rank, # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128
#     target_modules = [
#         "q_proj", "k_proj", "v_proj", "o_proj",
#         "gate_proj", "up_proj", "down_proj",
#     ], # Remove QKVO if out of memory
#     lora_alpha = lora_rank,
#     use_gradient_checkpointing = "unsloth", # Enable long context finetuning
#     random_state = 3407,
# )
```
then training fails with a `LLMEngine should not be pickled!` error:
```
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-7-f0c0f43b49a3> in <cell line: 0>()
----> 1 trainer = GRPOTrainer(
      2     model = model,
      3     processing_class = tokenizer,
      4     reward_funcs = [
      5         xmlcount_reward_func,

13 frames

/usr/local/lib/python3.11/dist-packages/vllm/engine/llm_engine.py in __reduce__(self)
    500     # This is to ensure that the LLMEngine is not referenced in
    501     # the closure used to initialize Ray worker actors
--> 502     raise RuntimeError("LLMEngine should not be pickled!")
    503
    504     def __del__(self):

RuntimeError: LLMEngine should not be pickled!
```
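For context on why the traceback ends where it does: vLLM deliberately overrides `__reduce__` on `LLMEngine` so that any attempt to serialize the engine (for example, when a trainer deep-copies the model or ships it to worker processes) fails immediately. A minimal sketch of that guard pattern, using a hypothetical `UnpicklableEngine` stand-in rather than vLLM's actual class:

```python
import pickle

class UnpicklableEngine:
    """Stand-in demonstrating vLLM's anti-pickling guard."""

    def __reduce__(self):
        # pickle (and copy.deepcopy) call __reduce__ via __reduce_ex__,
        # so overriding it blocks every serialization path at once.
        raise RuntimeError("LLMEngine should not be pickled!")

engine = UnpicklableEngine()
try:
    pickle.dumps(engine)
except RuntimeError as e:
    print(e)  # -> LLMEngine should not be pickled!
```

So the error means something in the GRPO training path is trying to pickle or deep-copy an object that still holds a reference to the live vLLM engine.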
I am unsure what the fix is; if you point me in the right direction, I am happy to investigate.
@Erland366 Could you check whether vLLM still works if no LoRA adapters are added? I think you also had a PR on moving `load_lora` outside of `get_peft_model`.
Sorry for the very late reply. I was finally able to get back into Unsloth stuff.
I don't think you can train a non-LoRA model using Unsloth in general. When I tested inference, yes, it works.
@Erland366 Is there any plan to add support for training? If not, I think you should make it clearer that Unsloth only works in combination with LoRA adapters.
> Sorry for the very late reply. I was finally able to get back into Unsloth stuff.
> I don't think you can train a non-LoRA model using Unsloth in general. When I tested inference, yes, it works.

Is that indeed so? Only LoRA models with Unsloth?
Any updates on this? Were you able to train a non-LoRA model with Unsloth?
@qwerty3564 It seems non-LoRA is not supported.