Katherine Crowson
Apparently if I tell ESGD-M to do a Hessian-vector product *every step*, instead of every ten steps as I had been doing for compute efficiency, I don't OOM anymore. Normally the graphs made with `create_graph=True` are...
I would also like some clarity on the best KL weight for training from scratch (and whether it should be warmed up over time).
> @borisdayma I personally don't think so. In the image reconstruction example from `usage.ipynb`, the discretization method of DALL-E is the `argmax` function... Here's one thought: if we keep every...
> Hi @TomoshibiAkira, this is a really valuable discussion! May I ask whether you validated the performance of f=8 without Gumbel? Actually, I just want to see the effect...
You should be using ImageNet statistics for any input, because that's what VGG-16 was trained on. You should only use different statistics if you trained or fine-tuned VGG-16 on a...
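As a rough illustration of what "using ImageNet statistics" means in practice, here is a minimal pure-Python sketch. It assumes the torchvision convention (RGB inputs scaled to [0, 1], per-channel mean and standard deviation); Caffe-converted VGG-16 weights expect different preprocessing, so check which weights you actually have. The helper names `normalize_pixel` and `normalize_image` are made up for this example.

```python
# Per-channel ImageNet statistics in RGB order, for pixel values in [0, 1].
# Assumption: these are the values documented for torchvision's pretrained models.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Normalize one (R, G, B) pixel whose values lie in [0, 1]."""
    return tuple((c - m) / s for c, m, s in zip(rgb, mean, std))


def normalize_image(image):
    """Normalize an image given as a nested list [H][W] of (R, G, B) tuples."""
    return [[normalize_pixel(px) for px in row] for row in image]
```

The same statistics are applied to every input (content image, style image, and the image being optimized), regardless of that image's own mean and standard deviation.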
> I accidentally wiped my Google Drive. Also a bit busy lately so going to take a while to regenerate these

I still have the pretrained model downloaded, if you...
It looks like L-BFGS took a bad step and was unable to recover. Unfortunately my L-BFGS implementation does not include a line search to guard against and reject bad steps....
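For readers wondering what a line search that rejects bad steps looks like, here is a generic backtracking (Armijo) sketch in pure Python. It is not the implementation discussed above and not part of any library; the function name `backtracking_step` and its parameters are hypothetical, chosen only for illustration.

```python
def backtracking_step(f, x, grad, direction, step=1.0, shrink=0.5,
                      c1=1e-4, max_tries=20):
    """Try x + step * direction, shrinking the step until the Armijo
    sufficient-decrease condition holds. Returns (new_x, accepted_step).
    If no step is acceptable, the step is rejected and (x, 0.0) is returned.
    """
    fx = f(x)
    # Directional derivative of f at x along the proposed direction.
    slope = sum(g * d for g, d in zip(grad, direction))
    for _ in range(max_tries):
        candidate = [xi + step * di for xi, di in zip(x, direction)]
        if f(candidate) <= fx + c1 * step * slope:
            return candidate, step
        step *= shrink  # step was too long (or the direction is bad): shrink
    return x, 0.0  # reject the step entirely and keep the current point


def quadratic(x):
    """Simple test objective: sum of squares."""
    return sum(xi * xi for xi in x)
```

With `quadratic`, starting at `[1.0]` with gradient `[2.0]` and an overly long direction `[-10.0]`, the search shrinks the step until the loss actually decreases. With an ascent direction such as `[10.0]` it rejects the step and returns the original point.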
Hi. I tried this and I'm not able to upload content/style images. I haven't used Colab before and don't really know how to start troubleshooting the issue.
Apparently the solution to the problem I encountered is to use Chrome instead of Safari.
The auxiliary image allows you to specify an image which the rendering process is "drawn back to" during iteration. Technically it imposes an L2 penalty on the difference between the...