Hanging after make_template

Open joshim5 opened this issue 6 years ago • 4 comments

I am trying to train a vanilla CIFAR-10 Pixel-SNAIL model using the command given in the README.

In TensorFlow 1.9, the code hangs at these lines:

model = tf.make_template('model', getattr(pxpp_models, args.model + "_spec"))
with tf.device('/gpu:0'):
    gen_par = model(x_init, h_init, init=True,
                    dropout_p=args.dropout_p, **model_opt)

Is this a known issue that can be resolved? Which version of TensorFlow has the code been tested with?

joshim5 avatar Sep 20 '18 18:09 joshim5

I tried TensorFlow 1.2.1 and it works fine. It's a TensorFlow issue; see https://github.com/CuriousAI/mean-teacher/issues/1

wlin12 avatar Dec 27 '18 18:12 wlin12

I cannot get it to run with version 1.2.1. I received the error:

ModuleNotFoundError: No module named 'tensorflow.contrib'

I guess it's from the import:

from tensorflow.contrib.framework.python.ops import add_arg_scope

which is used as a decorator on a number of functions.

I'm not sure how to overcome this.
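One way to narrow this down is to check which TensorFlow version the interpreter actually imports and whether it still ships `tensorflow.contrib`. The sketch below is only a diagnostic, not a fix; the helper name `check_contrib` is made up for illustration and is not part of this repository.

```python
import importlib


def check_contrib():
    """Return (tf_version, contrib_available).

    tf_version is None when TensorFlow cannot be imported at all.
    `check_contrib` is a hypothetical helper, not part of pixelsnail.
    """
    try:
        tf = importlib.import_module("tensorflow")
    except ImportError:
        return None, False
    try:
        # Raises ModuleNotFoundError (a subclass of ImportError)
        # when the installed build has no contrib package.
        importlib.import_module("tensorflow.contrib")
    except ImportError:
        return tf.__version__, False
    return tf.__version__, True


if __name__ == "__main__":
    version, has_contrib = check_contrib()
    print("TensorFlow version:", version)
    print("tensorflow.contrib available:", has_contrib)
```

If the printed version is not the one you meant to install, the environment is probably picking up a different TensorFlow than expected.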

wrrogers avatar Apr 22 '20 01:04 wrrogers

I've also run into this problem and tried almost everything suggested.

I have Ubuntu 18, which doesn't support CUDA 8, which TensorFlow 1.2.1 requires, so switching to an older version of TensorFlow doesn't work for me.

Does anyone have a way to fix the issue in the code itself? Otherwise I think this code will just become unusable in the future.

TWJubb avatar May 07 '20 15:05 TWJubb

I tried substituting the dense and conv2d functions in nn.py with those from the PixelCNN++ code:

https://github.com/openai/pixel-cnn/blob/master/pixel_cnn_pp/nn.py#L160

This seems to have worked, but I have no idea why, as the two sets of functions are very similar.

I am using TensorFlow 1.15.2
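Since the two sets of functions are so similar, a line-by-line diff may help show which change matters. Below is a minimal sketch using Python's standard `difflib`; the helper name `diff_sources`, the file labels, and the two placeholder strings are assumptions for illustration, so paste in the real function bodies from each `nn.py` to use it.

```python
import difflib


def diff_sources(old_src, new_src,
                 old_name="pixelsnail nn.py",
                 new_name="pixel_cnn_pp/nn.py"):
    """Return the unified-diff lines between two source strings.

    `diff_sources` and the default labels are hypothetical, for illustration.
    """
    return list(difflib.unified_diff(
        old_src.splitlines(),
        new_src.splitlines(),
        fromfile=old_name,
        tofile=new_name,
        lineterm="",
    ))


if __name__ == "__main__":
    # Placeholder text only: replace with the actual `dense` or `conv2d`
    # source copied from each repository.
    old = "line shared by both\nline only in the first version\n"
    new = "line shared by both\nline only in the second version\n"
    for line in diff_sources(old, new):
        print(line)
```

Lines prefixed with `-` exist only in the first version and lines prefixed with `+` only in the second, which should narrow down where the two implementations differ.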

TWJubb avatar May 07 '20 16:05 TWJubb