BEGAN-tensorflow

ValueError on train with celebA

Open MichaelOVertolli opened this issue 7 years ago • 7 comments

I'm getting an error during training. I tried it both with a version of celebA I already had on hand and with the one output by download.py and I got the same error. I'm running the system with --use_gpu=False (in case that matters).

Thanks for your help!

Here's the output and stack trace:

    [*] MODEL dir: logs/celebA_0418_104602
    [*] PARAM path: logs/celebA_0418_104602/params.json
    0%| | 0/500000 [00:00<?, ?it/s]
    [0/500000] Loss_D: 0.538686 Loss_G: 0.048095 measure: 0.7599, k_t: 0.0002
    [*] Samples saved: logs/celebA_0418_104602/0_G.png

    Traceback (most recent call last):
      File "main.py", line 43, in <module>
        main(config)
      File "main.py", line 35, in main
        trainer.train()
      File "/home/mvertolli/BEGAN/trainer.py", line 158, in train
        self.autoencode(x_fixed, self.model_dir, idx=step, x_fake=x_fake)
      File "/home/mvertolli/BEGAN/trainer.py", line 263, in autoencode
        x = self.sess.run(self.AE_x, {self.x: img})
      File "/home/mvertolli/virtualenvs/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 778, in run
        run_metadata_ptr)
      File "/home/mvertolli/virtualenvs/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 961, in _run
        % (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
    ValueError: Cannot feed value of shape (16, 3, 64, 64) for Tensor u'ToFloat:0', which has shape '(16, 64, 64, 3)'
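For anyone reading the trace: the two shapes in the ValueError hold the same data in different axis orders. The fed array is channels-first (N, C, H, W) and the tensor expects channels-last (N, H, W, C). Below is a minimal NumPy sketch of that mismatch and of a transpose that resolves it. The helper name `nchw_to_nhwc` is hypothetical and is not part of the repository; where such a conversion would belong in trainer.py is not shown here.

```python
import numpy as np


def nchw_to_nhwc(batch):
    """Reorder a channels-first batch (N, C, H, W) to channels-last (N, H, W, C).

    Hypothetical helper for illustration; not taken from the repository.
    """
    return np.transpose(batch, (0, 2, 3, 1))


# Same shapes as in the error message above.
img = np.zeros((16, 3, 64, 64), dtype=np.float32)
expected_shape = (16, 64, 64, 3)

mismatch = img.shape != expected_shape   # the condition that triggers the error
fixed = nchw_to_nhwc(img)
print(mismatch, fixed.shape)
```

The transpose only reorders axes, so pixel values are unchanged: element `[n, c, h, w]` of the input ends up at `[n, h, w, c]` of the output.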

MichaelOVertolli, Apr 18 '17 15:04

Hi,

In config.py, the data_format is chosen based on whether you use a GPU or not:

    if config.use_gpu:
        data_format = 'NCHW'
    else:
        data_format = 'NHWC'

I do not know the reason behind this. To my knowledge, using a GPU or not should not change anything regarding the data_format. So to make it work, you could just change the non-GPU branch to 'NCHW'.

But it would be interesting to know whether this is a bug or whether there is a reason I do not see.
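An alternative to editing the config is to make the fed batch follow whichever layout the config selected. The sketch below is only an illustration under stated assumptions: `choose_data_format` mirrors the config.py snippet quoted above, while `to_data_format` is a hypothetical helper (not in the repository) that guesses the current layout by assuming images have 1 or 3 channels and spatial sizes other than 1 or 3.

```python
import numpy as np


def choose_data_format(use_gpu):
    # Mirrors the config.py logic quoted above.
    return 'NCHW' if use_gpu else 'NHWC'


def to_data_format(batch, data_format):
    """Return `batch` laid out as `data_format` ('NCHW' or 'NHWC').

    Hypothetical helper. The current layout is guessed from which axis
    has size 1 or 3 (assumed to be the channel axis).
    """
    if batch.ndim != 4:
        raise ValueError("expected a 4-D batch, got %d-D" % batch.ndim)
    channels_first = batch.shape[1] in (1, 3)
    channels_last = batch.shape[3] in (1, 3)
    if channels_first == channels_last:
        raise ValueError("cannot infer layout from shape %s" % (batch.shape,))
    if data_format == 'NHWC':
        return batch if channels_last else batch.transpose(0, 2, 3, 1)
    if data_format == 'NCHW':
        return batch if channels_first else batch.transpose(0, 3, 1, 2)
    raise ValueError("unknown data_format: %r" % (data_format,))


# CPU case from this issue: the batch arrives channels-first.
img = np.zeros((16, 3, 64, 64), dtype=np.float32)
cpu_batch = to_data_format(img, choose_data_format(use_gpu=False))
print(cpu_batch.shape)  # (16, 64, 64, 3)
```

The shape-based guess is a convenience for the sketch; code that knows the layout of its inputs would pass it explicitly rather than infer it.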

janericlenssen, Apr 19 '17 09:04

@mrjel It's for performance. 'NCHW' is the cuDNN default, which can make GPU computation faster. Details can be found at https://www.tensorflow.org/performance/performance_guide#use_nchw_image_data_format

@MichaelOVertolli I didn't test with --use_gpu=False, which is why the error occurs. I can fix that problem, but I'm pretty sure you won't be able to achieve what you want without a GPU. You need a GPU to train BEGAN in a reasonable time frame (hours, or 1~2 days); otherwise you'll need to wait weeks, unless you're training on a small dataset like MNIST. CelebA can't be trained in a reasonable time without a GPU.

carpedm20, Apr 19 '17 10:04

I was actually just curious to see how slow it would be. However, I'll just set it up on my GPU. Thanks for all your help!

MichaelOVertolli, Apr 19 '17 15:04

@carpedm20 I suggest you fix this problem, because my MacBook doesn't support GPU either.

zaykl, Apr 26 '17 01:04

@carpedm20 Any update on getting this to work on CPU? I am trying to run it on a MacBook and I don't have a GPU.

srb0203, Jun 14 '17 20:06

@srb0203 I suggest you buy a PC with a nice GPU. That's what I did.

zaykl, Jun 15 '17 06:06

@carpedm20 thanks

zhangqianhui, Jun 29 '17 08:06