Tom Fogal
Appreciate the ping! Personal life's a bit crazy right now, and that's the only time I'd have for this. What kind of deadline do we have before it gets dropped?
@vedaanta I think the [test failures](https://dev.azure.com/Lightning-AI/lightning/_build/results?buildId=202984&view=logs&jobId=5b0799f7-725e-5b16-9b83-c0a5a25d03f0&j=5b0799f7-725e-5b16-9b83-c0a5a25d03f0&t=97651ec4-0b0f-5455-bbb5-3c30427a0a7e) are real / caused by this patch:
```
    jfn = thunder.jit(module)
>   result = jfn(*args, **kwargs)
thunder/tests/test_jit_general.py:608:
_ _ _ _ _ _ _...
```
The failures are distributed notebook flakiness; unrelated, so marking ready for review. I didn't understand the #342 comment about not making this a symbol. Without the `@torchsymbol` decorator, `clone` calls...
The test failures are for distributed things: https://dev.azure.com/Lightning-AI/lightning/_build/results?buildId=202766&view=logs&j=b97dbf6d-98bd-5b68-7c01-878b39c3da28&t=3c72ede2-92c1-5cd2-2bac-ad2411af2aea&l=306 which seem unrelated. Let's try merging main into this...
The failures are just the distributed tests failing, issue #432. This is ready for re-review / merging.
poke @mruberry @t-vi for review/merging
triage review:
- Nik's new executor has allow list / deny list that tags whether an op is cuda graph-able
- @nikitaved what's the current status of the executor?
-...
cc @eqy re: fragmentation lunch discussion
Thanks for identifying latest status, Jingyue!
> Good news: these number mismatches no longer show up after I resync. 🎉
> Bad news: distributed tests [start to fail](https://dev.azure.com/Lightning-AI/lightning/_build/results?buildId=205296&view=logs&j=b97dbf6d-98bd-5b68-7c01-878b39c3da28&t=3c72ede2-92c1-5cd2-2bac-ad2411af2aea). 😢
>...
Hey Kaeun, sorry for the slow reply on this one! Must have slipped through. We'd *love* to have you tackle `TensorBase.div`---thank you so much! Let's just make sure we leave...