Mihir Patel
Closing as out of date
@dakinggg @irene-dea can you please look? I agree we should have an option for this. I'm not sure if it's necessary to pass to Composer vs. check if it's an...
> @mvpatel2000 I think trainer arg is right for this...code looks fine at a glance, would want to test a bit more before merging. You will own testing?
@aadyotb Just a heads up, Daniel is out this week, and given the subtlety here, I would prefer he finish the review of this PR vs. bringing someone else to...
> Almost looks good to me. just to clarify, do you want to add unit test for HSDP + TP checkpointing?

Yep, working on adding requested unit tests.

> Since...
> > Is this just pulling out existing code into a helper fn? > > @eracah it would be great to get slightly more description so I know what parts...
@wolliq would you mind sharing your YAML? Are you saying you directly pass run name to mlflow logger but it is always overridden?
CC: @eracah @dakinggg
@Skylion007 do you think this is a torch error or something we can do differently?
@Ghelfi do you know if this works for you elsewhere, e.g. if you compile outside Composer? Will help us narrow down if it's a Composer issue or PyTorch issue, as...