Thomas Johnson

8 comments by Thomas Johnson

Also getting this frequently on 0.1.8, with Ubuntu 22.04 on Intel.

Happens basically every time I fire up a ton of jobs at once, but I don't have a small test case that reproduces it.

I haven't tried with the standard Python driver. I filed the bug a while ago, so unfortunately I'm not using the exact setup I had back then. (I'm not complaining...

Thanks. So, just to be clear: for the 790M model you do linear scheduling up to 5 * 2.5e-4 (the 2.5e-4 is from GPT-3 Large 760M) over the first 10%...
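For reference, the arithmetic in that question can be sketched as below. This is only an illustration of a linear ramp to 5 * 2.5e-4 = 1.25e-3. The comment is cut off, so reading "the first 10%" as 10% of total training steps is an assumption, as is holding the rate constant afterwards. The function name `warmup_lr` and its parameters are hypothetical and not taken from any library.

```python
def warmup_lr(step, total_steps, base_lr=2.5e-4, multiplier=5.0, warmup_frac=0.10):
    """Linear warmup to multiplier * base_lr (hypothetical helper).

    Assumptions, not confirmed by the original comment:
    - "the first 10%" means the first 10% of total training steps;
    - after warmup the rate is simply held at its peak (the real
      schedule may decay instead).
    """
    peak_lr = multiplier * base_lr  # 5 * 2.5e-4 = 1.25e-3
    warmup_steps = max(1, int(total_steps * warmup_frac))
    if step < warmup_steps:
        # Ramp linearly so the last warmup step reaches the peak exactly.
        return peak_lr * (step + 1) / warmup_steps
    return peak_lr


if __name__ == "__main__":
    total = 1000
    for s in (0, 49, 99, 500):
        print(s, warmup_lr(s, total))
```

With 1000 total steps, the ramp covers steps 0 through 99 and the rate stays at the peak from then on in this sketch.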

Thanks, I'll check out Kaggle.

It seems the Kaggle notebook environment is also somewhat broken. I have no idea if this is an XLA issue, a JAX issue, or something else, but here's the error:...

Yes, it's the TPU VM environment that Kaggle calls "TPU VM v3-8". Here's an example notebook: https://www.kaggle.com/code/tjohnson/notebookbf52281afd

Thanks so much for looking into this, @will-cromar! I'll switch the TensorFlow versions as recommended. I noticed that the docker-python PR is failing CI, although I don't have permissions...