image-similarity-deep-ranking
Attempting to run in grayscale mode
Hi @akarshzingade, I've been trying to convert the model to run only in grayscale mode, but I keep running into an error stating:

```
ValueError: Error when checking input: expected input_2 to have shape (224, 224, 1) but got array with shape (224, 224, 3)
```
Here's what I've done so far:
- [x] Changed all modes in ImageDataGeneratorCustom to grayscale
- [x] Verified that the shape of X in ImageDataGeneratorCustom is (224, 224, 1)
- [x] Modified the first and second inputs of deep_rank_model to shape (224, 224, 1)
The input images are in color, but I assume PIL converts them to grayscale through img.convert('L').
Have I missed something? Thanks!
Or am I being dumb, only now realizing that VGG was built and compiled for 3 channels?
I guess I'll just resort to literally converting the images to grayscale and then feeding them back in as 3-channel RGB; that way the inputs stay uniform. Can't wait to see the outcome.
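Something along these lines with PIL; just a sketch (the directory names are placeholders, not the repo's layout), converting each file to grayscale and saving it back as 3-channel RGB so the stock VGG input shape is untouched:

```python
import os
from PIL import Image

def to_grayscale_rgb(src_path, dst_path):
    # Drop the colour information, then duplicate the single channel back to
    # RGB so the image still matches VGG's expected (224, 224, 3) input.
    img = Image.open(src_path).convert('L').convert('RGB')
    img.save(dst_path)

os.makedirs('data_gray', exist_ok=True)
for name in os.listdir('data'):
    to_grayscale_rgb(os.path.join('data', name),
                     os.path.join('data_gray', name))
```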
Leaving this thread open for comments and thoughts. Kindly close if desired.
Tweaked the iterator to override load_img so that images are loaded as grayscale, then verified the change by printing the shape of the loaded data. Reverted the rest of the changes back to the original. Currently training at the moment.
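For reference, the override is roughly along these lines; a sketch only, using PIL directly rather than the repo's exact load_img signature, and repeating the grayscale channel back to 3 so nothing else in the pipeline has to change:

```python
import numpy as np
from PIL import Image

def load_img_grayscale(path, target_size=(224, 224)):
    # Load the image, throw away colour, then tile the single channel back
    # to 3 channels so the original (224, 224, 3) input shape still works.
    img = Image.open(path).convert('L').resize(target_size)
    arr = np.asarray(img, dtype='float32')[..., np.newaxis]
    return np.repeat(arr, 3, axis=-1)
```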
Training completed. Although it works, I'm not getting the results I want. I noticed that color was still a heavy factor in the predictions. Currently looking for ways to get the model to disregard the colors in the image.
So, the model is relying on the color too much?
It seems so. I'm going really slowly at the moment; the sampling process easily exhausts my resources, and I only have 16GB of RAM. I'm still researching the feasibility of VGG for texture recognition. If all else fails, I'll look into changing the network to something better suited to grayscale. I'll also have to deal with my resource constraints. I've made a few mods to the code: instead of processing all the files in memory before writing them out, I write to the file directly so it doesn't consume as much RAM. That was still ineffective, though; indexing takes just as many resources, and right after indexing it simply freezes.
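For anyone curious, the write-directly-to-file change is basically this pattern; a rough sketch with made-up names (not the actual sampler code), the point being that triplets are streamed to disk one at a time instead of being collected in a list first:

```python
import csv

def dump_triplets(triplet_generator, out_path='triplets.txt'):
    # triplet_generator yields one (query, positive, negative) filename tuple
    # at a time, so only a single triplet lives in memory at any moment.
    with open(out_path, 'w', newline='') as f:
        writer = csv.writer(f)
        for query, positive, negative in triplet_generator:
            writer.writerow([query, positive, negative])
```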
I also want to ask: what's the ideal number of images per class? I have more than 10 classes, each with a mix of 40~50 images. Not quite enough for classification, I guess, given the number of classes.
I think Inception is less sensitive to colour. You could try that.
I haven't found any article/paper that shows the colour sensitivity of VGG. The closest I have found is this: https://arxiv.org/pdf/1710.00756.pdf. They say: "Low-level features (e.g., relu1_1 layer in VGG19) are sensitive to color appearance and thus fail to match objects with semantic similarity but different colors, e.g., matching result of blue-to-dark sky image pair at the finest level (L = 1)"
50 per class is fine I think. It's the number of triplets per query image that matters. I would say 50 triplets per query image and positive image pair.
Interesting! I'll give that a shot after I exhaust my options with VGG. Thankfully, Keras makes things easier with the bundled application networks. Thanks, Akarsh!
When you say "50 triplets per query image" does that mean increasing the num_pos_images and num_neg_images in tripletSampler.py?
such that the parameters look like the following:

```
python tripletSampler.py --input_directory data --output_directory output --num_pos_images 10 --num_neg_images 40
```
using "--num_pos_images 10 --num_neg_images 40" will create 40*10 triplets per query. What I meant to say was 50 negative images per query image and positive image pair. But, this will create a lot of triplets based on your dataset. So, you would have to choose according to the resources available for you.
That's noted. Thanks Akarsh! I'll be updating soon.
:)
Any interesting updates, cxfire16? :)
@akarshzingade @cxfire16 From my experiments, the results of this model are sensitive to colour. I want to do object-shape similarity regardless of colour differences. Your request is very similar to mine! Have you solved this problem?
@akarshzingade @cxfire16 @longzeyilang I have implemented the model on the street2shop dataset. Yes, from my experience too, the model is very sensitive to colour and less selective of shapes.
Hey guys! I believe I've exhausted the stock configuration combined with various data-preprocessing steps like image thresholding, grayscaling, and so on. It seems we're all on the same path of wanting to weigh attributes other than just colour. My next step is to change the network: instead of VGG, I'd perhaps use Inception, but I'll have to do further research on the colour sensitivities of other networks. Nevertheless, we'll never know unless we try. Thanks! I'll keep you posted and will update soon.
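For concreteness, here's roughly the kind of change I mean; a sketch only, assuming the convnet branch is rebuilt from keras.applications (the embedding size and layer choices are placeholders, not the repo's exact code):

```python
from keras.applications.inception_v3 import InceptionV3
from keras.layers import Dense, GlobalAveragePooling2D, Lambda
from keras.models import Model
import keras.backend as K

def convnet_model_inception(input_shape=(224, 224, 3), embedding_dim=4096):
    # Swap the VGG16 branch for InceptionV3 pre-trained on ImageNet.
    base = InceptionV3(weights='imagenet', include_top=False,
                       input_shape=input_shape)
    x = GlobalAveragePooling2D()(base.output)
    x = Dense(embedding_dim, activation='relu')(x)
    # L2-normalise the embedding so distances are comparable across images.
    x = Lambda(lambda t: K.l2_normalize(t, axis=1))(x)
    return Model(inputs=base.input, outputs=x)
```

One thing to watch: keras.applications.inception_v3.preprocess_input scales pixels differently from VGG's preprocessing, so the data generator would need the matching one.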
@cxfire16 @IAmAbdusKhan There are two things that @longzeyilang pointed out in another issue. I have missed taking the max of the loss and 0 for each triplet. Also, this code doesn't include the squared L2 regularisation (I think it's squared L2, but I need to confirm) mentioned in the paper. You could try that.
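Roughly what adding that max looks like, just as a sketch: this assumes the loss receives the embeddings stacked row-wise as [query, positive, negative] in groups of three (which may not match how the loss is actually wired up in this repo), and uses squared L2 distances with a placeholder margin of 1.0:

```python
import keras.backend as K

def triplet_hinge_loss(y_true, y_pred, margin=1.0):
    # y_pred is assumed to hold the embeddings in consecutive rows:
    # row 0: query, row 1: positive, row 2: negative, then repeating.
    query = y_pred[0::3]
    positive = y_pred[1::3]
    negative = y_pred[2::3]

    # Squared L2 distances for the query-positive and query-negative pairs.
    d_pos = K.sum(K.square(query - positive), axis=1)
    d_neg = K.sum(K.square(query - negative), axis=1)

    # Hinge: clamp each triplet's loss at zero instead of letting it go negative.
    return K.mean(K.maximum(0.0, margin + d_pos - d_neg))
```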
Is InceptionV3 more resource-hungry than VGG16? I immediately get OOMs when attempting to train. I ended up reducing my batch size to just 1, because reducing it by half would yield the old error we encountered before: `IndexError: index n is out of bounds for axis y with size z`
Nevertheless, it worked and I was able to train, although it took quite a bit longer than with a larger batch size. Interestingly, the results are still the same: it is still color-sensitive.
@cxfire Yes, Inception is a much deeper network compared to VGG16, so using it should be more resource-intensive. Btw, shouldn't the batch size always be a multiple of 3, as per the implementation, for proper learning?
@IAmAbdusKhan Yes, it's always a multiple of 3, because of the code being:

```python
batch_size = 1
# ...and then later:
batch_size *= 3
```
I've parked this problem for now, as I want to try storing the means for faster image search. I'm leaving this thread open for discussion for as long as @akarshzingade allows. I might open new issues about my path toward a lightweight search function. I'll also be forking the repo soon for my implementation of the file-first method of loading images on lower-spec computers. Might as well open PRs if I find something worth suggesting. Thanks, everyone!