Pedro Gabriel Gengo Lourenço
After a long debugging session, I found the issue. Since we are not passing the arguments by name in https://github.com/huggingface/setfit/blame/cbc01ec402e86ca04e5e40e9bce7f618f3c2946c/src/setfit/exporters/onnx.py#L50, the transformers library assumes that the third positional argument is `position_ids` and not `token_type_ids` (https://github.com/huggingface/transformers/blob/b09912c8f452ac485933ac0f86937aa01de3c398/src/transformers/models/mpnet/modeling_mpnet.py#L515-L525)....
Here is the Colab notebook I used to debug: https://colab.research.google.com/drive/19xE4WdxqGLLZOSanycYfUzbcAxFgpzuR?usp=sharing
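For reference, a minimal sketch of the keyword-argument point (the checkpoint and variable names here are illustrative, not the exporter's actual code):

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Illustrative MPNet-based checkpoint; any MPNet body has the same forward signature.
name = "sentence-transformers/paraphrase-mpnet-base-v2"
model = AutoModel.from_pretrained(name)
tokenizer = AutoTokenizer.from_pretrained(name)
inputs = tokenizer("setfit onnx export", return_tensors="pt")

# Risky: MPNet's forward signature is (input_ids, attention_mask, position_ids, ...),
# so a third positional tensor intended as token_type_ids gets bound to position_ids.
# model(inputs["input_ids"], inputs["attention_mask"], token_type_ids_tensor)

# Safer: pass the inputs by keyword so nothing is silently rebound.
with torch.no_grad():
    outputs = model(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
    )
print(outputs.last_hidden_state.shape)
```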
@nitrosocke Do you mind sharing the process you used to get these results? I'm trying to use only LoRA to fine-tune SD, but my results were not that good.
@nitrosocke Did you try with any unknown subjects? My problem is when I try to fine-tune on images of myself (as you may know, I'm not a celeb haha)...
@amrrs I made this Colab notebook if you want to perform all the steps on Colab: https://colab.research.google.com/drive/1iSFDpRBKEWr2HLlz243rbym3J2X95kcy?usp=sharing @cloneofsimo If you like, you can update the README with it :)
Sorry, I didn't test with batch_size=5; I was using 1 in my experiments. What you can do is use batch_size=1 and set the gradient accumulation steps to 5, so you update the...
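Roughly what I mean, as a minimal PyTorch sketch (the model, optimizer, and data are toy stand-ins for the actual training setup):

```python
import torch
from torch import nn

# Toy stand-ins; in practice these are the SD/LoRA model, its optimizer, and a DataLoader.
model = nn.Linear(4, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
data = [(torch.randn(1, 4), torch.randn(1, 1)) for _ in range(10)]  # batch_size = 1

accumulation_steps = 5  # effective batch size = 1 * 5

optimizer.zero_grad()
for step, (x, y) in enumerate(data):
    loss = nn.functional.mse_loss(model(x), y)
    # Scale so the accumulated gradient matches averaging over 5 samples.
    (loss / accumulation_steps).backward()
    if (step + 1) % accumulation_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
```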
@amerkay @Daniel-Kelvich I just updated the Colab with Gradient Accumulation Steps! Enjoy :)
Sure thing! I can do it by EOD
@cloneofsimo I just updated the colab, and as a workaround for gradient accumulation I'm scaling the learning rate to `lr / batch_size`. I know it is not the best approach but it is better...
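Concretely, the workaround is just this kind of scaling (the values are illustrative):

```python
base_lr = 1e-4              # learning rate tuned for batch_size = 1
batch_size = 5
lr = base_lr / batch_size   # crude stand-in for true gradient accumulation
```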