Logan Adams
They are built from the same git tag; the Windows whl is just built for Windows and contains the ops pre-built. More information on the Windows whl and Windows support...
Hi @ajindal1 - I am trying to repro this but hitting an issue. First I wanted to confirm that in a venv, if you follow the following steps that you...
Interesting, especially that you have this issue even if you don't have deepspeed installed.
> @loadams Just fail on this assert when using lamb with bf16. May I ask if this will keep going? Hi @Liangliang-Ma - apologies, I lost track of this PR....
> > @loadams Just fail on this assert when using lamb with bf16. May I ask if this will keep going? > > Hi @Liangliang-Ma - apologies, I lost track...
Failing HPU tests are a transformers issue that should be fixed in transformers soon.
> Hi, may I know this upgrading triton will also adjust triton kernels? #4857 > > Thanks! Hi @YizhouZ - the main goal here was just to add what is...
> @nelyahu Do we have any update? ping @nelyahu
@nelyahu - closing this as stale for now. Happy to come back to it, just re-open or tag us.
> @inkcherry thanks, I have no further questions. Hi @tjruwase @loadams this PR is to enable sequence parallel for model with number of heads not power of two, which is...