Masaki Kozuki
## What does this PR do? As per the title, this PR adds `F.scaled_mm` to `thunder.torch` and covers it with a torchex implementation. Ref: https://docs.pytorch.org/docs/main/generated/torch.nn.functional.scaled_mm.html
## What does this PR do? Taking over #2182. This lets us dump `TraceCtx` by setting the `TORCH_TRACE` env var when using `thunder.dynamo.compiler.thunderfx`. What is stored? - prologue, computation,...
## What does this PR do? As per title. Related #2218 cc @borda
## What does this PR do? As per the title, this adds a CUTLASS Python DSL executor. In this PR, the kernels defined in https://github.com/Dao-AILab/quack, except matmul, are registered. Also, backward is...
## What does this PR do? As per the title, this PR adds `torch.nn.functional.scaled_grouped_mm` support. Ref: https://docs.pytorch.org/docs/main/generated/torch.nn.functional.scaled_grouped_mm.html
of the equivalent semantics as SGLang's custom `nn.Module`s