Thomas Viehmann

Results: 227 comments by Thomas Viehmann

> Note that, before passing the trace to the nvFuser executor, prims.copy_to_out_(t0, out=q) is put after the sdpaex operator thanks to functionalization. Ugh. Could it be that `inplace_copy_` is particular...

closing for now @crcrpar please reopen as you see fit.

The checker should also check that all proxies it finds are in the .names set.
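A minimal sketch of what such a check could look like. The comment does not show the checker itself, so `Proxy`, `Trace`, `iter_proxies`, and `unregistered_proxy_names` below are hypothetical stand-ins, not the project's actual API; the only detail taken from the comment is that a trace carries a `.names` set against which every proxy found should be validated.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Proxy:
    # Hypothetical stand-in for a traced value; only its name matters here.
    name: str


@dataclass
class Trace:
    # Hypothetical stand-in for a trace: a set of registered names plus
    # arbitrarily nested arguments that may contain proxies.
    names: set = field(default_factory=set)
    args: list = field(default_factory=list)


def iter_proxies(obj):
    """Yield every Proxy found in a nested list/tuple/dict structure."""
    if isinstance(obj, Proxy):
        yield obj
    elif isinstance(obj, dict):
        for value in obj.values():
            yield from iter_proxies(value)
    elif isinstance(obj, (list, tuple)):
        for item in obj:
            yield from iter_proxies(item)


def unregistered_proxy_names(trace):
    """Return the sorted names of proxies that are missing from trace.names."""
    found = {proxy.name for proxy in iter_proxies(trace.args)}
    return sorted(found - trace.names)


def check_proxy_names(trace):
    """Raise if any proxy found in the trace is not in the .names set."""
    missing = unregistered_proxy_names(trace)
    if missing:
        raise ValueError(f"proxies not registered in trace.names: {missing}")
```

In this sketch, a trace whose arguments contain a proxy named `t2` while `names` only holds `t0` and `t1` would make `check_proxy_names` raise and report `t2`.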

Hi @rittik9, thank you for working on this! Two quick comments, and let me know how much you want to go into details or explore yourself: - this...

I like the proposal in general, a couple of details: - for the "do nothing" I wonder if empty lists or tuples would be better, - completely agree with manually...

@AugustDev Thank you, did you want to file this here or with https://github.com/Lightning-AI/pytorch-lightning/issues?

Let's not. Optimizing without measurable impact is not a habit we want to get into.

TBH, this is a very clear "don't do this, changing the fn is completely unsupported!". That said, we can talk about distributed-after-jit. The obstacles are: - Currently the ddp transformation...

I think the new fsdp/ddp actually do this.

Thank you for pinging, @mtasic85. We're looking into more fp8 support, but we likely want to deliver this through [Thunder](https://github.com/lightning-ai/lightning-thunder/), which will compile models to use optimizations. We do...