robvanvolt comments

Results: 33 comments of robvanvolt

"adamw" optimizer + weight decay = poor generations

The default weight_decay is 0.0 anyway, isn't it?

Pretrained models

@lucidrains thank you for your quick response! Will it then be possible to generate images with pretrained models without a CUDA GPU, e.g. using only an integrated Intel GPU? And "The...

I created a repository (https://github.com/robvanvolt/DALLE-models) whose sole purpose is to host and collect pretrained models. Here everyone can make their models available, regardless of whether they were trained on a...

Share my installation of DeepSpeed

Unfortunately, the installation does not work with the latest Nvidia GPUs (30XX), and triton==1.0.0.dev20210329 was permanently deleted (https://github.com/ptillet/triton/issues/99)... Edit: commit a598fba0 (HEAD), "[DOCS] Various improvements and typo fixes", seems to work...

Reproducing DALL-E using DeepSpeed

Awesome! This got some traction fast! :D I'm currently trying to get DeepSpeed with sparse attention running on an RTX 3090 (if it succeeds, it should then work on the A100 as well)....

Kl Loss correction

Maybe we could write to the OpenAI team and ask for a straight answer? Perhaps they would disclose the information in a private two-person email conversation?

CI / Tests for DALLE-pytorch

This is really cool! Also the colab (see the WDS implementation, I tested it on your colab https://github.com/lucidrains/DALLE-pytorch/pull/280#issuecomment-860207682). But I think it might get a little flickery if we test...

CI / Tests for DALLE-pytorch

> > > > This is really cool! Also the colab (see the WDS implementation, I tested it on your colab [#280 (comment)](https://github.com/lucidrains/DALLE-pytorch/pull/280#issuecomment-860207682)). > > But I think it might...

CI / Tests for DALLE-pytorch

Initially, before CI is implemented in a more elaborate way, someone could just copy and paste the output of test.py into the pull request to show the...

CI / Tests for DALLE-pytorch

> Unless someone wants to rent a GPU for CI (and I'm not sure that's a good use of resources), I think the reasonable thing to do is using GitHub...