Yu-won Lee

Results 230 comments of Yu-won Lee

@baiyongrui I was using Copilot at the time, and it seems the autocomplete set it to False. I've fixed it right away, but it's unfortunate this issue has existed for...

It seems like you are using CPU offloading, which slows training down considerably. You could adjust the settings in the DeepSpeed config file to put some layers back...
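As a rough illustration of where offloading lives in a DeepSpeed config, here is a minimal sketch assuming a ZeRO stage 3 setup. The `offload_optimizer` and `offload_param` sections and their `device` field are standard DeepSpeed config keys, but the concrete values and the `without_offload` helper are hypothetical and not taken from the repository's config files.

```python
import copy
import json

# Hypothetical ZeRO stage 3 config with both offload targets on the CPU.
# The values are illustrative, not the repository's actual settings.
offload_config = {
    "zero_optimization": {
        "stage": 3,
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
        "offload_param": {"device": "cpu", "pin_memory": True},
    }
}


def without_offload(config, keys=("offload_optimizer", "offload_param")):
    """Return a copy of `config` with the selected offload targets disabled."""
    new_config = copy.deepcopy(config)
    zero = new_config["zero_optimization"]
    for key in keys:
        if key in zero:
            # "none" keeps this state on the GPU instead of offloading it.
            zero[key] = {"device": "none"}
    return new_config


# Keep parameters on the GPU but leave the optimizer state offloaded.
gpu_params_config = without_offload(offload_config, keys=("offload_param",))
print(json.dumps(gpu_params_config, indent=2))
```

Disabling only one of the two targets is one way to trade VRAM for speed; which combination fits depends on the GPU memory available.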

The settings written in the config files are not optimal for everyone. The less VRAM you use, the longer training takes. You should balance between 2...

If the original model runs at the same speed, the slowdown is probably caused by something else, such as offloading, long context due to the image...

Is the original transformer forward faster than the monkey-patched one? If so, you could just use the original one. It's just for applying the liger-kernel and training mixed-modality...

Yes, but I don't think that's the cause of the significant slowdown. I'll find a way to integrate `torch.compile` in my code. Thanks for letting me know!

I don't know exactly what you are doing, so I have no idea what is causing the error. BTW, I've implemented torch.compile, but it doesn't work with flash-attention2. So I...

You mean adding projectors in between the vision encoder and the LLM? Then I think that could cause a huge error, because it doesn't project the image_embeddings properly in the first place. The...

Hmm, maybe monkey patching the liger-kernel modules could be the problem. You could try removing those.
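One way to test that theory is to make the patch reversible, so the same run can be tried with and without it. The sketch below is plain Python with no liger-kernel calls; `TinyModel`, `patched_forward`, and `temporarily_patched` are hypothetical stand-ins, not names from the repository.

```python
import contextlib


class TinyModel:
    """Hypothetical stand-in for a model class whose forward gets patched."""

    def forward(self, x):
        return x + 1


def patched_forward(self, x):
    # Stand-in for a replacement forward (e.g. one using a fused kernel).
    return x + 1


@contextlib.contextmanager
def temporarily_patched(cls, name, replacement):
    """Swap `cls.name` for `replacement`, restoring the original on exit."""
    original = getattr(cls, name)
    setattr(cls, name, replacement)
    try:
        yield original
    finally:
        setattr(cls, name, original)


original_forward = TinyModel.forward

with temporarily_patched(TinyModel, "forward", patched_forward):
    inside_is_patched = TinyModel.forward is patched_forward

restored = TinyModel.forward is original_forward
print(inside_is_patched, restored)
```

If the problem disappears once the original method is restored, the patch is the likely cause.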

Sorry for the late response. If you aren't using LoRA it won't be the original one. I'll look into it.