SuJeong
Thank you so much for your reply! As you mentioned, it seems like using `--lora` might have been the issue. Also, I didn’t increase the learning rate, which probably made...
Thanks a lot! So, for the embedding task, is it correct to count the number of in-batch negatives as 255, i.e. 32 (`--per_device_train_batch_size`) * 8 GPUs - 1?
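
To make the arithmetic in the question explicit, here is a minimal sketch. It assumes one positive per query, no extra hard negatives, and that embeddings are gathered across all GPUs before the contrastive loss is computed. Whether the training script actually gathers across devices is an assumption here, not something confirmed in this thread, and the helper `in_batch_negatives` is a hypothetical name used only for illustration.

```python
def in_batch_negatives(per_device_batch_size: int, num_gpus: int,
                       cross_device: bool = True) -> int:
    """Count in-batch negatives per query, assuming one positive per query."""
    # Assumption: with cross-device gathering, every other example in the
    # global batch serves as a negative. Without it, only the examples on
    # the same device do.
    if cross_device:
        effective_batch = per_device_batch_size * num_gpus
    else:
        effective_batch = per_device_batch_size
    # Subtract 1 for the query's own positive.
    return effective_batch - 1


print(in_batch_negatives(32, 8))                      # 255
print(in_batch_negatives(32, 8, cross_device=False))  # 31
```

If negatives are not shared across devices, the count would be 31 rather than 255, so the answer depends on how the loss is computed in the training code.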