BigGAN-TPU-TensorFlow

After running 300K steps on TPU, we didn't achieve satisfactory results.

Open AaronAnima opened this issue 5 years ago • 6 comments

We ran 'launch_train_tpu_sagan.sh' on a TPU v3 pod for 300k steps, but training seems to collapse from around 10k steps onward, and the generation quality is poor. Have you managed to match, or at least get close to, BigGAN's performance? Here are some results (sample images attached): after 100k, 180k, and 300k steps.

AaronAnima • Oct 16 '19 03:10

I have the same problem; it seems this code is not usable.

zsdonghao • Oct 16 '19 03:10

I didn't use the 'comet' module, and I kept 'launch_train_tpu_sagan.sh' almost the same (I only made a few small changes for my local environment; the important settings such as batch size and learning rate are unchanged).

AaronAnima • Oct 16 '19 03:10

Hi all,

The code was never finished, which is why you're not seeing good results. Feel free to submit PRs to improve it :)

davidhughhenrymack • Oct 16 '19 13:10

We can also report failure: after swapping the initialization to tf.orthogonal_initializer(1.0) to make it more BigGAN-like, a 6h run on a TPU pod (~12k/min, or >4.3M total) produced only blobs like the one attached:

test96-96-64-301
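
For readers unfamiliar with the initializer swap mentioned above, here is a minimal sketch of the property an orthogonal initializer with gain 1.0 is meant to provide: a weight matrix W whose columns are orthonormal, so that W^T W is approximately the identity. This is not the repository's code and it does not call TensorFlow. The helpers `orthogonal_matrix` and `gram` are hypothetical names made up for this illustration, and the use of Gram-Schmidt is an assumption chosen for simplicity, not necessarily how TensorFlow constructs the matrix.

```python
import math
import random


def orthogonal_matrix(rows, cols, gain=1.0, seed=0):
    """Hypothetical helper: rows x cols matrix with orthonormal columns, scaled by gain.

    Illustration only; assumes rows >= cols.
    """
    if rows < cols:
        raise ValueError("this sketch assumes rows >= cols")
    rng = random.Random(seed)
    columns = []
    for _ in range(cols):
        # Start from a random Gaussian vector.
        v = [rng.gauss(0.0, 1.0) for _ in range(rows)]
        # Modified Gram-Schmidt: subtract the projection onto each earlier column.
        for q in columns:
            dot = sum(a * b for a, b in zip(v, q))
            v = [a - dot * b for a, b in zip(v, q)]
        # Normalize to unit length.
        norm = math.sqrt(sum(a * a for a in v))
        columns.append([a / norm for a in v])
    # Convert the column list into a row-major matrix and apply the gain.
    return [[gain * columns[j][i] for j in range(cols)] for i in range(rows)]


def gram(matrix):
    """Hypothetical helper: compute W^T W for a row-major matrix W."""
    rows, cols = len(matrix), len(matrix[0])
    return [
        [sum(matrix[k][i] * matrix[k][j] for k in range(rows)) for j in range(cols)]
        for i in range(cols)
    ]


if __name__ == "__main__":
    W = orthogonal_matrix(rows=8, cols=4, gain=1.0)
    for row in gram(W):
        # Each printed row should be close to a row of the 4x4 identity matrix.
        print([round(x, 6) for x in row])
```

With a gain other than 1.0 the diagonal of W^T W becomes gain squared instead of 1. Such an initializer only sets the starting weights, which is consistent with the report above that swapping it in did not by itself fix training.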

gwern • Feb 06 '20 22:02

this project just does not work

zsdonghao • Feb 07 '20 15:02

Well, we know that now. Unfortunately, the README doesn't mention that at all (a minor omission, to be sure).

gwern • Feb 07 '20 15:02