Image Analogies not Importing theano_backend from Keras Correctly

Open matthewbahr opened this issue 8 years ago • 20 comments

Here's the error output from running a basic make_image_analogy.py call with image-analogies:

Using gpu device 0: GeForce GTX 970 (CNMeM is enabled with initial size: 75.0% of memory, cuDNN 4007)
Traceback (most recent call last):
  File "make_image_analogy.py", line 17, in <module>
    args = image_analogy.argparser.parse_args()
  File "build\bdist.win-amd64\egg\image_analogy\argparser.py", line 101, in parse_args
AttributeError: 'module' object has no attribute '_on_gpu'

I can run a full 12-epoch Keras 1.0.5 test on the Theano backend without problems. Adding "--a-scale-mode match" gets me past the strange module issue, but then it crashes on the first pass with this attribute error:

Convolution2D has no attribute 'get_output'

Not really sure what is going on.

matthewbahr avatar Jul 26 '16 23:07 matthewbahr

I think I've seen this before...if I had to guess, theano isn't fully compiled properly. Are you running this inside a virtualenv?

sdierauf avatar Aug 03 '16 15:08 sdierauf

No, I've done the theano compilation on my own rather than through virtualenv.

When I use Keras to drive the theano backend it works just fine I think.

Which error did you see when you ran into this before? The first or the second? I'm not convinced that they have the same root cause.

matthewbahr avatar Aug 09 '16 15:08 matthewbahr

I fixed this by just removing the bit of code since it was for cpu mode anyway.
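A less drastic alternative to deleting the check is to guard it. The sketch below assumes the failing line calls a private `_on_gpu` helper on the Keras backend module, as the AttributeError suggests; `backend_on_gpu` and the two stand-in objects are hypothetical names used only for illustration, not code from image-analogies.

```python
from types import SimpleNamespace


def backend_on_gpu(backend, default=True):
    """Call backend._on_gpu() if that helper exists, else return `default`.

    `backend` is whatever module or object the caller passes in; the
    `_on_gpu` name comes from the traceback, everything else is assumed.
    """
    probe = getattr(backend, "_on_gpu", None)
    if callable(probe):
        return bool(probe())
    return default


# Stand-ins for the backend module (hypothetical, not real Keras objects).
old_backend = SimpleNamespace(_on_gpu=lambda: False)  # helper present
new_backend = SimpleNamespace()                       # helper missing

print(backend_on_gpu(old_backend))
print(backend_on_gpu(new_backend))
```

Defaulting to `True` when the helper is missing mirrors the "it was for CPU mode anyway" reasoning: the CPU-only branch is simply skipped.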

qazxswedcxzaqws avatar Sep 10 '16 03:09 qazxswedcxzaqws

The only reason I'm going to all this effort to do this on my Windows machine instead of my Mac is that I want CUDA.

matthewbahr avatar Sep 10 '16 03:09 matthewbahr

Open image_analogy\argparser.py in Notepad++, go to line 101, and make it look like this: [image: capture]

qazxswedcxzaqws avatar Sep 10 '16 03:09 qazxswedcxzaqws

There should be one or two more errors after this. Tell me what they are when you get them; I had the same problems, so I made a couple of rushed temporary fixes and it was up and running.

qazxswedcxzaqws avatar Sep 10 '16 04:09 qazxswedcxzaqws

I'll try that, thanks!

matthewbahr avatar Sep 10 '16 04:09 matthewbahr

qazxswedcxzaqws I'm getting a new error like you expected.

Here is the full output from the call:

A:\Dev\Deepdream>python image-analogies/build/scripts-2.7/make_image_analogy.py images/greatwave.jpg images/greatwaveprime.jpg images/me.jpg images/out/me
Using Theano backend.
DEBUG: nvcc STDOUT mod.cu
   Creating library C:/Users/Crowbahr/AppData/Local/Theano/compiledir_Windows-10-10.0.14393-Intel64_Family_6_Model_60_Stepping_3_GenuineIntel-2.7.12-64/tmprydv0j/265abc51f7c376c224983485238ff1a5.lib and object C:/Users/Crowbahr/AppData/Local/Theano/compiledir_Windows-10-10.0.14393-Intel64_Family_6_Model_60_Stepping_3_GenuineIntel-2.7.12-64/tmprydv0j/265abc51f7c376c224983485238ff1a5.exp

Using gpu device 0: GeForce GTX 970 (CNMeM is enabled with initial size: 75.0% of memory, cuDNN 4007)
Theano cuda without cuDNN detected. Forcing a-scale-mode to "match"
Using PatchMatch model
Scale factor 0.25 "A" shape (1L, 3L, 864L, 1296L) "B" shape (1L, 3L, 864L, 1296L)
Building loss...
Precomputing static features...
Traceback (most recent call last):
  File "image-analogies/build/scripts-2.7/make_image_analogy.py", line 27, in <module>
    image_analogy.main.main(args, model_class)
  File "build\bdist.win-amd64\egg\image_analogy\main.py", line 69, in main
  File "build\bdist.win-amd64\egg\image_analogy\models\nnf.py", line 17, in build
  File "build\bdist.win-amd64\egg\image_analogy\models\nnf.py", line 55, in build_loss
  File "build\bdist.win-amd64\egg\image_analogy\models\base.py", line 53, in precompute_static_features
  File "build\bdist.win-amd64\egg\image_analogy\models\base.py", line 61, in get_features
  File "build\bdist.win-amd64\egg\image_analogy\models\base.py", line 72, in get_layer_output
AttributeError: 'Convolution2D' object has no attribute 'get_output'

matthewbahr avatar Sep 10 '16 18:09 matthewbahr

Open image_analogy\models\base.py in Notepad++, go to line 72, and change it to this: [image: capture3]

qazxswedcxzaqws avatar Sep 11 '16 01:09 qazxswedcxzaqws

Alllllll righty it's working so far!

matthewbahr avatar Sep 11 '16 01:09 matthewbahr

Well, I'm getting output now, but it doesn't seem to be using much of my GPU. My CPU is maxing out and it's hitting 8 GB of RAM, but my GPU is idling and never reaches more than 1% load according to GPU-Z.

This means the iterations are taking forever.

It does appear to be using all the memory though, hitting 3686MB

matthewbahr avatar Sep 11 '16 04:09 matthewbahr

That means the program is running in CPU mode, but from what you have posted it looks like it should be running in GPU mode. Mind posting what your output looks like when you run it now?

qazxswedcxzaqws avatar Sep 11 '16 05:09 qazxswedcxzaqws

Well, I'm seeing the warning message that I wrote in while making the first change, the one saying "Theano cuda without cuDNN detected."

DEBUG: nvcc STDOUT mod.cu
   Creating library C:/Users/Crowbahr/AppData/Local/Theano/compiledir_Windows-10-10.0.14393-Intel64_Family_6_Model_60_Stepping_3_GenuineIntel-2.7.12-64/tmpy5e1xl/265abc51f7c376c224983485238ff1a5.lib and object C:/Users/Crowbahr/AppData/Local/Theano/compiledir_Windows-10-10.0.14393-Intel64_Family_6_Model_60_Stepping_3_GenuineIntel-2.7.12-64/tmpy5e1xl/265abc51f7c376c224983485238ff1a5.exp

Using gpu device 0: GeForce GTX 970 (CNMeM is enabled with initial size: 75.0% of memory, cuDNN 4007)
Theano cuda without cuDNN detected. Forcing a-scale-mode to "match"
Using PatchMatch model

But part of that output says that CNMeM and cuDNN are there... It looks like it's using the GPU memory but not the cores for processing.
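To compare what the Theano banner reports against the later warning, the banner line can be pulled apart with a small parser. This is only a sketch: the regular expression is written against the single banner line quoted in this thread, and `parse_banner` is a hypothetical helper, so other Theano versions may print a different format.

```python
import re

# Pattern written against the banner line quoted above (an assumption
# about the format, not a documented Theano interface).
BANNER = re.compile(
    r"Using gpu device (?P<index>\d+): (?P<name>.+?) "
    r"\(CNMeM is (?P<cnmem>enabled|disabled).*?cuDNN (?P<cudnn>[^)]+)\)"
)


def parse_banner(line):
    """Return the banner fields as a dict, or None if the line does not match."""
    match = BANNER.search(line)
    return match.groupdict() if match else None


line = ("Using gpu device 0: GeForce GTX 970 (CNMeM is enabled with "
        "initial size: 75.0% of memory, cuDNN 4007)")
info = parse_banner(line)
print(info["name"], info["cnmem"], info["cudnn"])
```

If the banner shows a cuDNN version while the script still prints "Theano cuda without cuDNN detected", the two messages are coming from different checks, which fits the fact that the warning was hand-written during the first change.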

matthewbahr avatar Sep 11 '16 05:09 matthewbahr

Just double-check that the changes I made earlier are 100% identical and that cuDNN is installed correctly, since I'm running an identical version without any "Theano cuda without cuDNN detected. Forcing a-scale-mode to "match"" messages.

qazxswedcxzaqws avatar Sep 11 '16 05:09 qazxswedcxzaqws

Might be something with cuDNN. The code is identical.

You're seeing cuDNN 4007 too?

matthewbahr avatar Sep 11 '16 05:09 matthewbahr

Actually, I just found that some options I was using made the "Theano cuda without cuDNN" message disappear; with usage similar to yours I still get the message. My GPU is still being used properly, though, so it's not really an issue for me. I'm running CUDA 8.0 and cuDNN 5005 since I have a Pascal GPU, but that shouldn't matter: your CUDA version is older and probably more compatible than mine.

qazxswedcxzaqws avatar Sep 11 '16 05:09 qazxswedcxzaqws

kk, looks like cuDNN is only available as the 5xxx series now from NVIDIA, so I'm gonna have to work on this. Might as well leave the CPU version running overnight in the meantime.

matthewbahr avatar Sep 11 '16 05:09 matthewbahr

After getting and replacing cuDNN (it's at 5103 now), it's showing the same low GPU load. Occasionally I'll see a 40-ish spike, but mostly it's not running.

Memory usage is still high.

matthewbahr avatar Sep 11 '16 05:09 matthewbahr

Strange. All I can do now is recommend the "Setting Up Theano for GPU (on Windows)" section of this guide, https://github.com/titu1994/Neural-Style-Transfer/blob/master/Guide.md, in case you are missing any dependencies. As there don't seem to be any more errors, I'm not really sure what's going wrong; the only thing I can chalk it up to is that this program is quite flaky and outdated compared to some newer Theano-based alternatives.

qazxswedcxzaqws avatar Sep 11 '16 05:09 qazxswedcxzaqws

Hey, the initial issue looks like you were using keras >= 1.0 with this project, which was originally only compatible with keras 0.3. I've upgraded this project to use keras >= 1.0, so that should fix the get_output and _on_gpu errors.
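For anyone stuck on an older checkout, the get_output error comes down to an API change: Keras 0.3 layers exposed a `get_output()` method, while Keras 1.0 layers expose an `output` attribute instead. A minimal sketch of a shim that tolerates both follows; `get_layer_output` and the two dummy layer classes are hypothetical stand-ins, not the project's actual code.

```python
def get_layer_output(layer):
    """Return a layer's output tensor under either Keras API style."""
    getter = getattr(layer, "get_output", None)
    if callable(getter):
        # Keras 0.3 style: method call
        return getter()
    # Keras 1.0 style: plain attribute
    return layer.output


class OldStyleLayer:
    """Dummy layer mimicking the Keras 0.3 interface (illustration only)."""

    def get_output(self, train=False):
        return "old-style tensor"


class NewStyleLayer:
    """Dummy layer mimicking the Keras 1.0 interface (illustration only)."""

    output = "new-style tensor"


print(get_layer_output(OldStyleLayer()))
print(get_layer_output(NewStyleLayer()))
```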

The GPU usage issue can be a combination of things. The output you posted above

Using PatchMatch model

means the patch matching is done with a different algorithm on the CPU. Use the option --model=brute to run the brute-force patch matcher on the GPU.
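As a rough illustration, the earlier command from this thread with the option appended could be assembled like this. The paths are copied from the command posted above; the flag placement at the end of the argument list is an assumption, and the script is not actually launched here.

```python
import shlex

# Paths copied from the command posted earlier in this thread.
script = "image-analogies/build/scripts-2.7/make_image_analogy.py"
images = ["images/greatwave.jpg", "images/greatwaveprime.jpg", "images/me.jpg"]
out_prefix = "images/out/me"

# Append the brute-force model option (placement assumed, not confirmed).
cmd = ["python", script] + images + [out_prefix, "--model=brute"]

# Print a copy-pasteable command line.
print(" ".join(shlex.quote(part) for part in cmd))

# To run it from Python instead of a shell: subprocess.check_call(cmd)
```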

There was another issue with some combination of keras/theano/whatever where the brute-force GPU patch-matching convolutions are "optimized" by theano to use the CPU instead. I've added a fix to explicitly use the cuDNN operations.

awentzonline avatar Dec 04 '16 02:12 awentzonline