
Doc update: the "use LoRA with PEFT and DeepSpeed" documentation is lagging behind

Open pengjunfeng11 opened this issue 8 months ago • 2 comments

Feature request

Here is the link to the doc: https://huggingface.co/docs/peft/accelerate/deepspeed#use-peft-qlora-and-deepspeed-with-zero3-for-finetuning-large-models-on-multiple-gpus

Recently I have been preparing to use PEFT and DeepSpeed for LoRA fine-tuning, and the above is the official guide, but it does not suit the current version of the accelerate framework. This doc still uses an early version of accelerate, as shown below:

[screenshot of the outdated accelerate configuration]

In the latest version of accelerate, we can create a ds_config.json following the DeepSpeed official guide, and specify the path to that ds_config.json file to configure DeepSpeed in accelerate.
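For reference, a minimal `ds_config.json` along the lines of the DeepSpeed ZeRO-3 examples might look like this (the field names follow the DeepSpeed config schema; the specific values shown are illustrative assumptions, not recommendations, and the `"auto"` placeholders rely on the transformers/accelerate integration filling them in):

```json
{
  "zero_optimization": {
    "stage": 3,
    "overlap_comm": true,
    "contiguous_gradients": true,
    "stage3_gather_16bit_weights_on_model_save": true
  },
  "bf16": { "enabled": "auto" },
  "gradient_accumulation_steps": "auto",
  "train_micro_batch_size_per_gpu": "auto"
}
```

The generated accelerate config (e.g. from running `accelerate config`) can then point at this file via `deepspeed_config_file`, roughly like the following sketch (other keys omitted):

```yaml
distributed_type: DEEPSPEED
deepspeed_config:
  deepspeed_config_file: ds_config.json
  zero3_init_flag: true
```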

We should write a new document to help people who want to use LoRA with PEFT and DeepSpeed.

Motivation

Help the PEFT community.

Your contribution

Yes, I'd like to help, for example by writing a new doc and preparing some code examples. If there is anything I can help with, please let me know.

pengjunfeng11 avatar Mar 29 '25 08:03 pengjunfeng11

@pengjunfeng11 Thanks for raising this issue. If you are interested, please feel free to create a PR to update the docs. If you believe the examples also need to be updated, you can add those updates too.

BenjaminBossan avatar Mar 31 '25 09:03 BenjaminBossan

I will try this, glad to help!

pengjunfeng11 avatar Mar 31 '25 09:03 pengjunfeng11

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

github-actions[bot] avatar Apr 28 '25 15:04 github-actions[bot]

@pengjunfeng11 Are you still interested in working on this?

BenjaminBossan avatar Apr 28 '25 15:04 BenjaminBossan

> @pengjunfeng11 Are you still interested in working on this?

Sorry, I forgot about this for a while. I will start working on this document soon.

pengjunfeng11 avatar Apr 28 '25 15:04 pengjunfeng11

> @pengjunfeng11 Thanks for raising this issue. If you are interested, please feel free to create a PR to update the docs. If you believe the examples also need to be updated, you can add those updates too.

I have a question: there is a lot of content about accelerate in deepspeed.md, but in fact the accelerate framework can be entirely unnecessary. I would personally like to separate these two parts. What do you think? If I follow my idea, I will delete a lot of the accelerate content in this document, because I think readers of this document just want to use DeepSpeed to quickly start PEFT training.

pengjunfeng11 avatar May 04 '25 05:05 pengjunfeng11

> @pengjunfeng11 Thanks for raising this issue. If you are interested, please feel free to create a PR to update the docs. If you believe the examples also need to be updated, you can add those updates too.

> I have a question: there is a lot of content about accelerate in deepspeed.md, but in fact the accelerate framework can be entirely unnecessary. I would personally like to separate these two parts. What do you think? If I follow my idea, I will delete a lot of the accelerate content in this document, because I think readers of this document just want to use DeepSpeed to quickly start PEFT training.

When I first started trying to use DeepSpeed together with transformers for LoRA training, the content in the tutorial confused me a lot. The official DeepSpeed docs link to this document, but our document then refers to accelerate, like peeling layers of an onion, which is not very friendly to novices.

pengjunfeng11 avatar May 04 '25 06:05 pengjunfeng11

Thanks for investigating this further. I wouldn't delete the accelerate content. Although it is true that DeepSpeed can be used without accelerate, in practice many people do use it with accelerate. This can be more convenient for users and also makes it easier to adopt other techniques, like FSDP. However, I can see the argument for more clearly highlighting that accelerate is not necessary for DeepSpeed, and for adding a separate section about using PEFT with DeepSpeed and without accelerate.
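For the accelerate-free path, the sketch could be as simple as launching a training script with DeepSpeed's own launcher and passing the same config file; the script name and GPU count below are placeholders, not taken from the docs:

```shell
# DeepSpeed ships its own launcher, so accelerate is not required here.
# train.py is a hypothetical training script that builds a PEFT LoRA model.
deepspeed --num_gpus=2 train.py --deepspeed ds_config.json
```

Inside such a script, the same `ds_config.json` can be handed to the transformers Trainer via the `deepspeed` field of `TrainingArguments`, which initializes DeepSpeed directly without going through the accelerate CLI.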

BenjaminBossan avatar May 05 '25 10:05 BenjaminBossan

Yeah, that's what I mean. I will write a new section just for DeepSpeed and PEFT, and keep the accelerate part.

Or do you think there are any areas that could be further optimized, or additional content that needs to be included in the guide? Please let me know.

pengjunfeng11 avatar May 05 '25 10:05 pengjunfeng11

> I will write a new section just for DeepSpeed and PEFT, and keep the accelerate part.

Thanks.

> Or do you think there are any areas that could be further optimized, or additional content that needs to be included in the guide? Please let me know.

I'm sure there are other parts that could be improved, but I would recommend focusing on one task per PR; the rest can be dealt with later.

BenjaminBossan avatar May 05 '25 10:05 BenjaminBossan

> > I will write a new section just for DeepSpeed and PEFT, and keep the accelerate part.
>
> Thanks.
>
> > Or do you think there are any areas that could be further optimized, or additional content that needs to be included in the guide? Please let me know.
>
> I'm sure there are other parts that could be improved, but I would recommend focusing on one task per PR; the rest can be dealt with later.

Got it.

pengjunfeng11 avatar May 05 '25 10:05 pengjunfeng11

I have finished modifying the content in the ## Configuration section. The remaining work is to go through and update each of the examples one by one. Before I proceed, I'd like you to check whether there's anything else that still needs to be revised. I hope we can have a discussion.

Here is the commit link in my fork:

https://github.com/pengjunfeng11/peft/commit/2cc16793bc22a3536d6d8eddc3dd07192aeeb561

pengjunfeng11 avatar May 08 '25 17:05 pengjunfeng11

@pengjunfeng11 please create a pull request with your changes so that we can review them :)

githubnemo avatar May 26 '25 15:05 githubnemo

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

github-actions[bot] avatar Jun 20 '25 15:06 github-actions[bot]