Aflah
Thank you for taking the time! That would be really helpful!
Thanks a lot for sharing this! I need to install from source and then try this, right? I'll do this by tomorrow and let you know!
Sorry for the delay, @merrymercy. Thanks a lot! This works really well. I'm leaving the issue open, though, as there seems to be another ongoing discussion, but my original issues have been...
@merrymercy Yep, it's a very significant speedup over vLLM for my use case :) Thanks for this library. My only pain point is that some of the models I'm using are...
Thanks! I'll take a look at this.
Encountered the same: the tolerance level in the tutorial did not work for me either, and I had to increase it by approx. 5x to have...
> I'm not using APEX too. But the code cannot run successfully. I encountered the same error message: `_layer_norm_fwd_fused() got an unexpected keyword argument 'num_ctas'.` > > **How did you...
> > > I'm not using APEX too. But the code cannot run successfully. I encountered the same error message: `_layer_norm_fwd_fused() got an unexpected keyword argument 'num_ctas'.` > > >...
> > > > > I'm not using APEX too. But the code cannot run successfully. I encountered the same error message: `_layer_norm_fwd_fused() got an unexpected keyword argument 'num_ctas'.` >...
@onlyone2019 Thanks for sharing this! Maybe it has something to do with the GPU, since it seems to be outperforming on an A30, but the gains are much smaller on a T4.