YI-LIN SUNG
Hi, in this code, the few-shot classes of the test data are the same as those of the validation data if the random seed is specified. Therefore, the separation is not maintained...
Thanks for pointing out the issue. I remember I didn't have this issue when I tried DDP. I will check on this soon.
I just found out that DDP works well with full fine-tuning but performs worse with parameter-efficient transfer learning methods. I will investigate this issue further soon.
Thanks for reporting this. I didn't try T5.1.1 before. I will look into this when I am not that swamped.... Probably some time at the end of November.
Hi @hosseinbv, thank you for pointing out the issue. I didn't try multi-GPU training in my experiments, so there might be some problems. Feel free to send a PR if...
I think reducing the batch size should work, but the learning rate might need to be reduced accordingly. The performance drop in @prote376's experiments may still come from the multi-GPU problem...
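As a rough illustration of reducing the learning rate along with the batch size: the comment does not say which rule to use, so the sketch below assumes the common linear scaling heuristic (learning rate proportional to batch size). The function name `scale_learning_rate` and the example values are hypothetical and are not taken from the repository.

```python
def scale_learning_rate(base_lr: float, base_batch_size: int, new_batch_size: int) -> float:
    """Scale the learning rate in proportion to the batch size.

    Assumption: linear scaling. Other rules (e.g. square-root scaling)
    are also used in practice, so treat the result as a starting point.
    """
    if base_batch_size <= 0 or new_batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    return base_lr * new_batch_size / base_batch_size


# Hypothetical values: halving the batch size halves the learning rate.
new_lr = scale_learning_rate(base_lr=3e-4, base_batch_size=32, new_batch_size=16)
print(new_lr)
```

The scaled value is only a heuristic, so a short sweep around it is usually still worthwhile.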
I would love to, and I do have a simpler implementation right now. But currently, I don't have a concrete timeline for when I can contribute to the peft lib,...
In order to let the model learn to generate all possible answers based on the questions, we choose to randomly select an answer from the answer list for simplicity. This...
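The random answer selection described above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the function name `pick_target_answer` and the example answers are hypothetical.

```python
import random
from typing import Optional, Sequence


def pick_target_answer(answers: Sequence[str], rng: Optional[random.Random] = None) -> str:
    """Pick one answer uniformly at random to use as the training target.

    Over many epochs, each answer in the list is eventually seen as a target.
    """
    if not answers:
        raise ValueError("answer list is empty")
    rng = rng or random.Random()
    return rng.choice(answers)


# Hypothetical example: a question with several acceptable answers.
answers = ["yes", "yeah", "yes it is"]
rng = random.Random(0)  # seeded only to make the example reproducible
target = pick_target_answer(answers, rng)
print(target)
```

Sampling a fresh target each time the example is visited is what lets the model see the whole answer list; picking once during preprocessing would fix a single target per question.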
Hi, it seems like there is an extra $ in your command. Can you try `bash scripts/baseline.sh "1" "cola"` and see if it works? Thank you.
Have you tried the fix mentioned in my previous comment? _It seems like there is an extra $ in your command. Can you try `bash scripts/baseline.sh "1" "cola"` and...