John (Boyuan) Yao

Results 31 issues of John (Boyuan) Yao

As our future automatic parallelization might need to offload the checkpoint input to save memory, I 1. Replace the original torch checkpoint function with `colossal.utils.checkpoint`, which has the inference for...

Run Build and Test

# What’s New? In this PR, I implement the `MetaInfo` for `torch.nn.ReLU` and modify the memory test

Run Build and Test

# What’s New? In this PR, I fix the problem mentioned in the previous PR by @super-dainiu in #1880

# What’s New? In this PR, I implement the metainfo generator for pooling operations, including `AdaptiveAvgPool` and `MaxPool`. I also found one interesting point while aligning the estimated memory cost...

# What’s New? In this PR, I did some work to support `torch.nn.functional.linear` in our metainfo generation; the memory estimation results are aligned with `torch.nn.Linear` (without bias). And though the...

Run Build and Test

# What's New? In this PR, I refactor the forward memory calculation of previously patched operations. After this PR, in the SPMD solver, we will use a conservative way to estimate...

Run Build and Test

# What's New? In this PR, I add binary elementwise metainfo for auto parallel. I also fix some annotations. NOTE: Currently this metainfo is only a rough estimation for binary...

# What's New? In this PR: 1. I patch `F.conv` metainfo for auto parallel and slightly modify the data retrieval process in metainfo generation. 2. I fix some small bugs in...

## 📌 Checklist before creating the PR - [x] I have created an issue for this PR for traceability - [x] The title follows the standard format: `[doc/gemini/tensor/...]: A concise...

Run Build and Test
auto-parallel

## 📌 Checklist before creating the PR - [x] I have created an issue for this PR for traceability - [x] The title follows the standard format: `[doc/gemini/tensor/...]: A concise...

Run Build and Test
auto-parallel