Hongxin Liu

70 comments by Hongxin Liu

Since the example only demonstrates the training process, we use a randomly initialized model for simplicity. You can easily use a pretrained model with `GPTActor(pretrained='gpt2')`.

Hi, these environment variables need to be set carefully. We generally recommend launching with [torchrun](https://pytorch.org/docs/stable/elastic/run.html?highlight=torchrun).
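As a rough illustration of why a launcher helps: torchrun exports `RANK`, `LOCAL_RANK`, `WORLD_SIZE`, `MASTER_ADDR`, and `MASTER_PORT` to every worker process, so the script only has to read them. The sketch below is a minimal, standard-library-only example; the helper name `read_torchrun_env` is hypothetical and not part of any library.

```python
import os


def read_torchrun_env(env=None):
    """Collect the distributed settings that torchrun exports to each worker.

    `read_torchrun_env` is a hypothetical helper used for illustration only.
    """
    env = os.environ if env is None else env
    required = ("RANK", "LOCAL_RANK", "WORLD_SIZE", "MASTER_ADDR", "MASTER_PORT")
    # Fail early with a clear message instead of a KeyError deep in the setup code.
    missing = [name for name in required if name not in env]
    if missing:
        raise RuntimeError(
            f"missing variables: {', '.join(missing)}; launch the script with torchrun"
        )
    return {
        "rank": int(env["RANK"]),              # global index of this process
        "local_rank": int(env["LOCAL_RANK"]),  # index of this process on its node
        "world_size": int(env["WORLD_SIZE"]),  # total number of processes
        "master_addr": env["MASTER_ADDR"],     # rendezvous host
        "master_port": int(env["MASTER_PORT"]),
    }
```

Launched as, for example, `torchrun --nproc_per_node=2 train.py` (where `train.py` is a placeholder script name), each worker would see its own `RANK` and `LOCAL_RANK` without any manual exports.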

Hi, the old version of ZeRO will be deprecated in the future. According to our benchmark results, the new version of ZeRO performs better than the old one. Could you tell...

> I think it might be hard to satisfy the `torch.allclose` condition. Checking the norm of the weights might be an alternative. We will fix the random seed before each OP. All...
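The two ideas in this exchange, seeding before each operation and comparing weight norms instead of requiring element-wise closeness, can be sketched without any tensor library. The example below is only a stand-in under stated assumptions: plain Python lists play the role of weight tensors, and `init_weights`, `l2_norm`, and `norms_match` are hypothetical helpers. With PyTorch, the seeding step would typically be `torch.manual_seed` and the strict check `torch.allclose`.

```python
import math
import random


def init_weights(n, seed):
    """Draw n weights from a generator seeded right before the operation."""
    rng = random.Random(seed)  # fixed seed makes the draw reproducible
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def l2_norm(values):
    """Euclidean norm of a flat list of weights."""
    return math.sqrt(sum(v * v for v in values))


def norms_match(a, b, rel_tol=1e-5):
    """Looser check: compare only the norms, not every element."""
    return math.isclose(l2_norm(a), l2_norm(b), rel_tol=rel_tol)
```

Note that a norm comparison is strictly weaker than an element-wise one: two different weight sets can share the same norm, so it is a sanity check rather than a proof of equality.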

The torch version in CI is 1.11, which is incompatible with meta tensors. I ran the test on my local machine: ![image](https://user-images.githubusercontent.com/23111350/225803427-b8e05923-1b6b-4681-8b65-27f6d9b20b48.png)

Thanks for your suggestion. We will try to implement this feature this month.

Could you add unit tests to check this feature?

> > Could you add unit tests to check this feature? > > I've tested in my local experiments ![image](https://private-user-images.githubusercontent.com/1772912/301859323-600a49e1-4011-4ca5-aaa5-48a1e293b34e.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MDcyMDYzMjQsIm5iZiI6MTcwNzIwNjAyNCwicGF0aCI6Ii8xNzcyOTEyLzMwMTg1OTMyMy02MDBhNDllMS00MDExLTRjYTUtYWFhNS00OGExZTI5M2IzNGUucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI0MDIwNiUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNDAyMDZUMDc1MzQ0WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9NDliNWFkNTdjMTcxODUwMTVkYzY1NjJiYzdiMDBiMDRjZmFiZjgzMDA4YzkwZTU1NDE4Mjc1ZWNmYzkxMzE4MyZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QmYWN0b3JfaWQ9MCZrZXlfaWQ9MCZyZXBvX2lkPTAifQ.ox05Y2u3iBzzCtel-loFlm1-WDRz-ui6tzv4T0th7VA) > > But how to test it in unittest: colossalai/tests...