midlevel-reps

Colorization features require grayscale inputs

Open ahwillia opened this issue 3 years ago • 1 comment

Thanks for this very clear and useful package.

When I call visualpriors.representation_transform(image_data, "colorization") with an RGB color image, I get a dimension-mismatch error.

I know I can get a grayscale image along the lines of 0.3 * red + 0.6 * green + 0.1 * blue, but I wanted to check what you used during training so it could be matched as closely as possible. Perhaps representation_transform could take care of this preprocessing step for the user, to take out the guesswork?

ahwillia Jul 28 '20 01:07
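For reference, the weighted-sum conversion mentioned above can be sketched as follows. This is only a minimal sketch: the ITU-R BT.601 coefficients (0.299, 0.587, 0.114) are an assumption, since the thread does not confirm what was used during training, and `rgb_to_gray` is a hypothetical helper name, not part of visualpriors.

```python
import numpy as np

# ITU-R BT.601 luma coefficients. ASSUMPTION: the thread does not say
# which grayscale conversion was used when the encoder was trained.
LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114], dtype=np.float32)


def rgb_to_gray(image):
    """Collapse a (3, H, W) or (N, 3, H, W) RGB array to one channel.

    Hypothetical helper for illustration; not part of visualpriors.
    """
    image = np.asarray(image, dtype=np.float32)
    if image.ndim < 3 or image.shape[-3] != 3:
        raise ValueError(f"expected 3 channels, got shape {image.shape}")
    # Weight each channel, then sum over the channel axis while keeping
    # it as a size-1 axis so the result is (..., 1, H, W).
    weighted = image * LUMA_WEIGHTS[:, None, None]
    return weighted.sum(axis=-3, keepdims=True)


# A batch with one 256x256 RGB image becomes a one-channel batch.
batch = np.zeros((1, 3, 256, 256), dtype=np.float32)
print(rgb_to_gray(batch).shape)  # (1, 1, 256, 256)
```

The result keeps the channel axis, so it has the one-channel layout that the (64,1,7,7) first-layer weights reported in this thread would expect. Whether the values match the training preprocessing is still an open question here.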

I actually get an error when defining the model. The pre-trained weights of the first convolutional layer have shape (64,1,7,7), while the architecture definition expects (64,3,7,7).

Should one adjust the architecture definition of the TaskonomyEncoder so that the input conv layer has only one channel, or repeat the first conv layer's weights in checkpoints['state_dict'] three times along the channel dimension?

sacadena Sep 06 '20 20:09
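To make the second option above concrete, here is a rough sketch of expanding a (64,1,7,7) weight to (64,3,7,7). It uses NumPy arrays as a stand-in for the checkpoint tensor, and `expand_first_conv` is a hypothetical name; the actual key of the first conv layer in the state dict is not given in this thread, so it is not shown. Because convolution is linear, scaling the repeated copies by per-channel coefficients is equivalent to converting RGB to grayscale first. The BT.601 coefficients are again an assumption.

```python
import numpy as np

# ASSUMPTION: BT.601 luma coefficients; the training-time conversion
# is not confirmed anywhere in this thread.
LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114], dtype=np.float32)


def expand_first_conv(weight_1ch, channel_mix=LUMA_WEIGHTS):
    """Turn (out, 1, kH, kW) conv weights into (out, 3, kH, kW).

    Hypothetical helper. Each RGB channel gets a copy of the original
    kernel scaled by its mixing coefficient, so the 3-channel layer on
    an RGB input matches the 1-channel layer on the mixed gray input.
    """
    weight_1ch = np.asarray(weight_1ch, dtype=np.float32)
    if weight_1ch.ndim != 4 or weight_1ch.shape[1] != 1:
        raise ValueError(f"expected (out, 1, kH, kW), got {weight_1ch.shape}")
    mix = np.asarray(channel_mix, dtype=np.float32)
    # Broadcast (out, 1, kH, kW) against (1, 3, 1, 1).
    return weight_1ch * mix[None, :, None, None]


# Check the equivalence on one random 7x7 RGB patch.
rng = np.random.default_rng(0)
w1 = rng.standard_normal((64, 1, 7, 7)).astype(np.float32)
patch = rng.standard_normal((3, 7, 7)).astype(np.float32)

w3 = expand_first_conv(w1)
gray_patch = (patch * LUMA_WEIGHTS[:, None, None]).sum(axis=0)

resp_rgb = (w3 * patch[None]).sum(axis=(1, 2, 3))         # 3-channel layer
resp_gray = (w1[:, 0] * gray_patch[None]).sum(axis=(1, 2))  # 1-channel layer
print(np.allclose(resp_rgb, resp_gray, atol=1e-4))
```

Note that repeating the weights unscaled (a mix of 1, 1, 1) would make the layer respond to the sum of the three channels, which is three times too large for a gray image copied into all channels; a mix of 1/3 each corresponds to a plain channel average. Changing the architecture to a one-channel input and converting the image beforehand avoids touching the checkpoint at all.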