fsdp_qlora
How do we run inference with the 70B model? Or do we need to implement it ourselves, the same way the training is implemented?
Have you implemented this yet? Could you share how you did it?