John (Boyuan) Yao
As our future automatic parallelization might need to offload the checkpoint input to save memory, I 1. Replace the original torch checkpoint function with colossal.utils.checkpoint, which has the inference for...
# What’s New? In this PR, I implement the `MetaInfo` for `torch.nn.ReLU` and modify the memory test
# What’s New? In this PR, I fix the problem mentioned in the previous PR by @super-dainiu in #1880
# What’s New? In this PR, I implement the metainfo generator for pooling operations, including `AdaptiveAvgPool` and `MaxPool`. I also found one interesting point while aligning the estimated memory cost...
# What’s New? In this PR, I did some work to support `torch.nn.functional.linear` in our metainfo generation; the memory estimation results are aligned with `torch.nn.Linear` (without bias). And though the...
# What's New? In this PR, I refactor the forward memory calculation of previously patched operations. After this PR, in the SPMD solver, we will use the conservative way to estimate...
# What's New? In this PR, I add binary elementwise metainfo for auto parallel. I also fix some annotations. NOTE: Currently this metainfo is only a rough estimation for binary...
# What's New? In this PR: 1. I patch `F.conv` metainfo for auto parallel and slightly modify the data retrieval process in metainfo generation. 2. I fix some small bugs in...
## 📌 Checklist before creating the PR - [x] I have created an issue for this PR for traceability - [x] The title follows the standard format: `[doc/gemini/tensor/...]: A concise...