Masaki Kozuki

Results 42 issues of Masaki Kozuki

Adds a new `apply_rate_limting` argument to `thunder.distributed.fsdp` so that we can try rate limiting AllGather, especially when ZERO2 is used. The major changes this PR brings are...

distributed

## What does this PR do? Fixes #184. As per the title, this change adds logic to tell whether the input `TraceCtx` represents a DDP backward pass. cc...

distributed