midlevel-reps
Colorization features require grayscale inputs
Thanks for this very clear and useful package.
When calling `visualpriors.representation_transform(image_data, "colorization")` I get a dimension-mismatch error if I pass in an RGB color image. I know I can compute a grayscale image along the lines of 0.3 * red + 0.6 * green + 0.1 * blue, but I wanted to check what you used during training so it can be matched as closely as possible. Perhaps `representation_transform` could take care of this preprocessing step for the user, to take the guesswork out?
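For reference, this is the kind of workaround I have in mind; it is only a sketch, and the ITU-R BT.601 coefficients (0.299/0.587/0.114) are my assumption, not necessarily what was used during training:

```python
import torch
import visualpriors

# Workaround sketch (not the package's own preprocessing): collapse the RGB
# input to a single luma channel before calling representation_transform.
# The 0.299/0.587/0.114 coefficients are the ITU-R BT.601 convention, which
# may differ from whatever was used when the colorization network was trained.
def rgb_to_grayscale(image_data):
    """image_data: (N, 3, H, W) tensor; returns (N, 1, H, W)."""
    r, g, b = image_data[:, 0:1], image_data[:, 1:2], image_data[:, 2:3]
    return 0.299 * r + 0.587 * g + 0.114 * b

image_data = torch.rand(1, 3, 256, 256) * 2 - 1  # dummy input scaled to [-1, 1]
gray = rgb_to_grayscale(image_data)
features = visualpriors.representation_transform(gray, "colorization")
```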
I actually get an error when defining the model: there is a mismatch between the shape of the pre-trained weights of the first convolutional layer, (64, 1, 7, 7), and that of the architecture definition, (64, 3, 7, 7).

Should one adjust the architecture definition of the `TaskonomyEncoder` to have only one input channel in the first conv layer, or repeat the first conv layer's weights from `checkpoints['state_dict']` three times along the channel dimension?
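To make the second option concrete, here is a rough sketch of what I mean. The checkpoint filename and the `conv1.weight` key are guesses at the state-dict layout rather than confirmed names from the repo, and the divide-by-3 only reproduces the single-channel behaviour when the three input channels are identical:

```python
import torch

# Sketch of option 2: tile the single-channel pretrained weights (64, 1, 7, 7)
# so they fit a 3-channel first conv (64, 3, 7, 7).
# The checkpoint path and the "conv1.weight" key are assumptions about how the
# state dict is laid out, not confirmed names from the repo.
checkpoint = torch.load("colorization_encoder.pth", map_location="cpu")
state_dict = checkpoint["state_dict"]

w = state_dict["conv1.weight"]  # (64, 1, 7, 7) in the colorization checkpoint
if w.shape[1] == 1:
    # Repeat along the input-channel dimension and divide by 3 so that an input
    # with identical R, G, B channels yields the same activations as the
    # original single-channel convolution.
    state_dict["conv1.weight"] = w.repeat(1, 3, 1, 1) / 3.0

# encoder = TaskonomyEncoder()        # unmodified 3-channel architecture
# encoder.load_state_dict(state_dict)
```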