Kaushik Srivatsan

3 comments by Kaushik Srivatsan

Hey @SkalskiP, the main reason I dropped the `device` argument completely is that I felt having two arguments for handling devices, `device` and `device_map`, might confuse the user...

@SkalskiP Thank you for following up on the issue so quickly. I think the solution you proposed will broadly work, along with the one by @Matvezy, to resolve this issue....

@SkalskiP Yes, I will be happy to. On an unrelated note, I was wondering whether it would be good if the load model could take in flash attention as...