
Doesn't it need a backpropagation process?

Open qyccc opened this issue 3 years ago • 4 comments

I don't find any backpropagation process in the code, and I'm curious about how the learned embedding gets optimized. Am I misunderstanding the paper, or is the code incomplete? Thanks for answering my question.

qyccc avatar Sep 02 '21 02:09 qyccc

Yes, I noticed this problem too. When updating the model, the learned embedding remains unchanged! I also wonder how to update the soft embedding...

ghost avatar Sep 06 '21 12:09 ghost

I'd also like to know this. As-is, it looks like the code just feeds the original embedding (or another model's embedding) back into the model, which doesn't sound right.

ekoenitz avatar Nov 03 '21 03:11 ekoenitz

PyTorch handles all of the backpropagation for you; you just need to specify which parameters you want to update.

from torch import optim

# swap the model's input embeddings for the soft-prompt wrapper
model.set_input_embeddings(s_wte)
# after updating the embedding, pass only the learned embedding to the optimizer
optimizer = optim.Adam([model.transformer.wte.learned_embedding])
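
For illustration, a minimal training step with that optimizer might look like the sketch below (input_ids and labels are placeholder names, and the details of preparing inputs for the soft prompt are omitted); autograd backpropagates the loss as usual, while optimizer.step() touches only the parameter that was passed in:

# minimal sketch of one optimization step (input_ids and labels are placeholders)
outputs = model(input_ids=input_ids, labels=labels)
loss = outputs.loss

optimizer.zero_grad()
loss.backward()   # PyTorch autograd backpropagates through the model into the soft prompt
optimizer.step()  # updates only the learned embedding passed to the optimizer above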

Also, I'm not passing a reference to the original embedding; I'm initializing the learned embedding from the original embedding and cloning the weights (hopefully for a better initialization). The paper does it somewhat differently, but I think it's the same idea.
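
As a rough sketch of that initialization idea (a simplified stand-in for the repo's SoftEmbedding class, with n_tokens chosen arbitrarily here), the learned prompt is a separate parameter created by cloning a slice of the original embedding weights rather than holding a reference to them:

import torch
import torch.nn as nn

class SoftEmbedding(nn.Module):
    # simplified sketch: wrap the original wte and prepend a learned prompt
    def __init__(self, wte, n_tokens=20):
        super().__init__()
        self.wte = wte
        self.n_tokens = n_tokens
        # clone (not reference) the first n_tokens rows of the original embedding
        # so the soft prompt starts from a reasonable initialization
        self.learned_embedding = nn.Parameter(wte.weight[:n_tokens].clone().detach())

    def forward(self, tokens):
        # the first n_tokens positions are placeholders; replace them with the soft prompt
        input_embedding = self.wte(tokens[:, self.n_tokens:])
        learned = self.learned_embedding.repeat(tokens.size(0), 1, 1)
        return torch.cat([learned, input_embedding], dim=1)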

kipgparker avatar Nov 03 '21 14:11 kipgparker

I think it is better to freeze the base model's parameters to reduce gradient computation. Use something like https://discuss.huggingface.co/t/how-to-freeze-layers-using-trainer/4702/3
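
For example, a minimal sketch of that (assuming the model was set up as in the snippet above) is to disable gradients for everything except the learned soft prompt:

# freeze every parameter of the base model
for param in model.parameters():
    param.requires_grad = False
# re-enable gradients only for the learned soft prompt
model.transformer.wte.learned_embedding.requires_grad = True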

hguan6 avatar Nov 11 '21 03:11 hguan6