unify-parameter-efficient-tuning
Implementation of paper "Towards a Unified View of Parameter-Efficient Transfer Learning" (ICLR 2022)
Thank you for sharing the code. After training produces a checkpoint, how can I make the model load those saved parameters?
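For checkpoint loading in a PyTorch codebase like this one, the usual pattern is to save the model's state dict after training and restore it with `load_state_dict`. A minimal sketch (the `nn.Linear` stand-in and the checkpoint path are illustrative, not from the repo):

```python
import os
import tempfile

import torch
import torch.nn as nn

# Tiny stand-in model; the repo's actual model class would be used instead.
model = nn.Linear(4, 2)

# After training, save only the learned parameters (the state dict).
ckpt_path = os.path.join(tempfile.mkdtemp(), "checkpoint.pt")
torch.save(model.state_dict(), ckpt_path)

# Later, to evaluate or resume: rebuild the model, then load the state dict.
restored = nn.Linear(4, 2)
restored.load_state_dict(torch.load(ckpt_path))
restored.eval()  # switch to inference mode
```

For parameter-efficient methods one would typically save and load only the small set of tuned parameters, filtering the state dict accordingly.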
Hi, thanks for publishing the paper and sharing the source code. I found that `attn_output` is not used after its definition. When training RoBERTa for parameter-efficient learning, the paper...
Hi, thanks for sharing the source code. Could you please share the training log file (`log.txt`) with me? I ran into some training problems, and the loss decreased...
Thanks for your great work! I have read your paper, but I am a bit confused about two things. (1) The instantiation of Multi-head PA. How can we instantiate Multi-head...
transformers.adapter only has prefix_tuning and adapter; where is the LoRA implementation?
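For reference, the core of LoRA is a frozen weight plus a low-rank trainable update. A minimal NumPy sketch under standard LoRA assumptions (names and shapes here are illustrative, not this repo's API):

```python
import numpy as np

# LoRA: y = x @ W.T + scale * (x @ A.T) @ B.T
# W is the frozen pretrained weight; A (r x d_in) and B (d_out x r)
# are the small trainable matrices, with rank r << min(d_in, d_out).
rng = np.random.default_rng(0)
d_in, d_out, r = 8, 6, 2

W = rng.normal(size=(d_out, d_in))       # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01    # trainable down-projection
B = np.zeros((d_out, r))                 # trainable up-projection, init to zero
scale = 2.0                              # commonly alpha / r

x = rng.normal(size=(3, d_in))
y = x @ W.T + scale * (x @ A.T) @ B.T
```

Because `B` starts at zero, the LoRA branch contributes nothing at initialization, so the adapted model begins exactly at the pretrained behavior.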
Hi, you mentioned that you used mosesdecoder to compute the BLEU score; could you explain that? Maybe this is the source of the difference between my results and yours. Thanks.
I am not familiar with the theoretical derivation, but I am interested in the formula's range of applicability. Thank you.