Clay Mullis
@lucidrains https://github.com/lucidrains/DALLE-pytorch/pull/193 If you can work from this, then go for it - if you have a better implementation in mind, let me know.
Here's a sample tokenizer to work with - perhaps include it if you think it's a good idea: `wget https://www.dropbox.com/s/uie7is0dyuxqmk0/hg_bpe_cc12m.json` Not a permanent host, fyi. @lucidrains
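For what it's worth, a minimal sketch of loading that file, assuming the JSON is a HuggingFace `tokenizers` serialization (the filename and example text are just placeholders):

```python
from tokenizers import Tokenizer

# Hedged sketch: load the downloaded BPE file and round-trip some text.
# Assumes the JSON is in HuggingFace `tokenizers` format.
tokenizer = Tokenizer.from_file('hg_bpe_cc12m.json')
ids = tokenizer.encode('a photo of a dog').ids  # token ids for the caption
text = tokenizer.decode(ids)                    # back to a string
```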
Looking at `train_dalle.py` provides some insights from @janEbert's prior grokking of DeepSpeed. First mistake I'm making here is loading the checkpoint like this:

```python
dalle.load_state_dict(weights)
```

which is apparently...
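For contrast, a hedged sketch of restoring through DeepSpeed itself, assuming the checkpoint was written with `engine.save_checkpoint` (the directory here is a placeholder, and `dalle`/`args` are assumed to exist as in `train_dalle.py`):

```python
import deepspeed

# Hedged sketch: restore via the DeepSpeed engine rather than
# torch's load_state_dict.
engine, optimizer, _, _ = deepspeed.initialize(
    args=args,
    model=dalle,
    model_parameters=dalle.parameters(),
)
# load_checkpoint restores the module, optimizer, and scheduler state
# that engine.save_checkpoint previously wrote to this directory
load_path, client_state = engine.load_checkpoint('./checkpoints', tag=None)
```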
Okay - I did things the way they're meant to be done (I believe) @rom1504 @janEbert @mehdidc

```python
if args.fp16:
    engine = deepspeed.init_inference(dalle, dtype=torch.half)
else:
    engine = deepspeed.init_inference(dalle)
# training for...
```
As always, apologies to Jan, who I'm sure has already explained this issue ;) I'll admit to some amount of laziness with regard to doing due diligence on all...
Thanks @richcmwang! I'll work on this later unless you wanna make the PR. @rom1504 The DeepSpeed docs do indeed claim faster inference with the inference engine. Not sure how though.
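From what I can tell, the claimed speedup comes from DeepSpeed swapping compatible layers for fused CUDA kernels at init time. A hedged sketch of those knobs (the values are assumptions, not anything this repo ships with):

```python
import torch
import deepspeed

# Hedged sketch: `dalle` is the trained model, defined elsewhere.
engine = deepspeed.init_inference(
    dalle,
    mp_size=1,                       # no model parallelism
    dtype=torch.half,                # fp16 weights/activations
    replace_with_kernel_inject=True, # swap in DeepSpeed's fused kernels
)
```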
This does of course mean you won't be able to upload DeepSpeed checkpoints to W&B - which I guess is a bug in its own right. I personally would want...
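If a single-file artifact is all W&B needs, one possible workaround is consolidating the sharded checkpoint first; the helper below is from DeepSpeed's `zero_to_fp32` utilities and is an assumption here (it only applies if the run actually used ZeRO):

```python
import torch
import wandb
from deepspeed.utils.zero_to_fp32 import get_fp32_state_dict_from_zero_checkpoint

# Hedged sketch: collapse the sharded ZeRO checkpoint directory into one
# fp32 state dict, save it as a single file, and upload that instead of
# the raw checkpoint directory. Paths are placeholders.
state_dict = get_fp32_state_dict_from_zero_checkpoint('./checkpoints')
torch.save(state_dict, 'dalle_fp32.pt')
wandb.save('dalle_fp32.pt')
```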
The VQGAN simply won't work in 16-bit precision, unfortunately. Converting only the torch modules of dalle which aren't the VQGAN, and then forcing autocasting to fp32 for the VQGAN, mitigates this...
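A minimal sketch of that mitigation, assuming the VQGAN hangs off the model as `dalle.vae` (the wrapper name and attribute are illustrative):

```python
import torch

# Hedged sketch: keep the VQGAN in fp32 while the rest of the model
# runs in half precision.
class FP32VQGAN(torch.nn.Module):
    def __init__(self, vae):
        super().__init__()
        self.vae = vae.float()  # cast VQGAN weights back to full precision

    def forward(self, *args, **kwargs):
        # disable autocast so every op inside the VQGAN stays in fp32
        with torch.cuda.amp.autocast(enabled=False):
            args = [a.float() if torch.is_tensor(a) else a for a in args]
            return self.vae(*args, **kwargs)

dalle.vae = FP32VQGAN(dalle.vae)  # attribute name `vae` is an assumption
```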
> Wow amazing! Is that really enough to make it work? I've been missing that feature a lot while using deepspeed

Please test it! But I think so, yes.