OpenDelta
How to implement prefix tuning with BartForConditionalGeneration?
Thank you for the awesome work. I am currently trying to run prefix-tuning experiments with BART. The original code provided by the paper's authors is a total mess, and then I found your work here. However, I cannot find enough documentation on usage. For example, I don't know how to run an experiment with the PrefixModel you provide, and after reading the source code I still haven't figured out how it works.
Could you please give me more information about that?
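Based on the README examples for the other delta models, I would expect the usage to look something like the sketch below, but I am not sure the argument names (e.g. prefix_token_num) are right for the current version, or how to wire it into a training loop:

```python
from transformers import BartForConditionalGeneration, BartTokenizer
from opendelta import PrefixModel

# Load the backbone model
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

# Attach prefix-tuning delta modules to the backbone.
# NOTE: prefix_token_num is my guess at the argument name; please
# correct me if the current API differs.
delta_model = PrefixModel(backbone_model=model, prefix_token_num=16)

# Freeze everything except the delta (prefix) parameters
delta_model.freeze_module(exclude=["deltas"], set_state_dict=True)
delta_model.log()  # inspect which modules were modified and what is trainable

# Then train as usual; only the prefix parameters should receive gradients
tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
inputs = tokenizer("an example source sentence", return_tensors="pt")
labels = tokenizer("an example target sentence", return_tensors="pt").input_ids
loss = model(**inputs, labels=labels).loss
loss.backward()
```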
I am rather confused too. Consider this: the attention mask is first fed to BartDecoder and then to BartDecoderLayer, so the attention sequence length is always a problem. I think the BART implementation in Huggingface needs to be edited. Based on the blog
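To make the sequence-length issue concrete: prefix tuning prepends learned key/value states inside each attention layer, so the attention mask has to be extended by the prefix length before it reaches BartDecoderLayer, otherwise the shapes no longer match. A rough sketch of the shape bookkeeping (variable names are mine):

```python
import torch

batch_size, seq_len, prefix_len = 2, 10, 16

# Mask as BartDecoder receives it: one entry per real input token
attention_mask = torch.ones(batch_size, seq_len)

# Inside each attention layer, prefix tuning adds prefix_len extra
# key/value positions, so the mask must be padded to match; otherwise
# attention sees seq_len queries against prefix_len + seq_len keys
prefix_mask = torch.ones(batch_size, prefix_len)
extended_mask = torch.cat([prefix_mask, attention_mask], dim=-1)

print(extended_mask.shape)  # torch.Size([2, 26])
```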