Running on non-MNIST greyscale images
Hi ardiya!
Thanks for putting together this repository. Could you tell me what would need to be done to run this on non-MNIST greyscale images?
Thanks so much in advance! Lilly
Hi @lillythomas, to train on your own dataset you'll need to change dataset.py, specifically https://github.com/ardiya/siamesenetwork-tensorflow/blob/master/dataset.py#L31-L48. The next_batch function needs to return a list of image pairs and a list of similar/dissimilar labels. You may also want to adjust the proportion of similar vs. dissimilar cases. In my case, for every batch I generate a similar pair for every digit (1-1, 2-2, 3-3, ...), then generate every combination of dissimilar digits (1-2, 1-3, 1-4, ..., 2-3, 2-4, 2-5, ...).
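Roughly, the idea looks like this (a simplified sketch rather than the exact contents of dataset.py; here self.num_idx maps each digit class to its image indices and self.to_img loads a single image):

```python
# Simplified sketch of the pair-generation idea described above.
from itertools import combinations
from numpy.random import choice

def next_batch(self):
    left, right, sim = [], [], []
    # similar pairs: 1-1, 2-2, 3-3, ..., one per digit class
    for i in range(10):
        a, b = choice(self.num_idx[i], 2, replace=False)
        left.append(self.to_img(a))
        right.append(self.to_img(b))
        sim.append([1])
    # dissimilar pairs: 1-2, 1-3, ..., every combination of two different digits
    for i, j in combinations(range(10), 2):
        left.append(self.to_img(choice(self.num_idx[i])))
        right.append(self.to_img(choice(self.num_idx[j])))
        sim.append([0])
    return left, right, sim
```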
To change this from black-and-white to RGB images, you'll need to change the dimensions from (28, 28, 1) to (WIDTH, HEIGHT, 3) in:
- https://github.com/ardiya/siamesenetwork-tensorflow/blob/master/dataset.py#L20
- https://github.com/ardiya/siamesenetwork-tensorflow/blob/master/train.py#L15
- https://github.com/ardiya/siamesenetwork-tensorflow/blob/master/train.py#L19-L20
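For example, the placeholder shapes in train.py would change roughly like this (a sketch only; plug in your actual width and height, and check the exact variable names in train.py):

```python
# Sketch of the shape change for RGB input; adjust WIDTH/HEIGHT to your images.
import tensorflow as tf

WIDTH, HEIGHT, CHANNELS = 224, 224, 3  # your image size, instead of 28, 28, 1

left = tf.placeholder(tf.float32, [None, WIDTH, HEIGHT, CHANNELS], name='left')
right = tf.placeholder(tf.float32, [None, WIDTH, HEIGHT, CHANNELS], name='right')
```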
Hi @ardiya, thank you for responding. I believe I have made the changes you suggested, but I am a bit confused about what this line (https://github.com/ardiya/siamesenetwork-tensorflow/blob/f81f2bee37a150391e2f895391390f59c4f4f5f2/dataset.py#L37) refers to.
My data has shape (256, 256, 3) and is split between two classes. As such, that function in my code is currently:
def next_batch(self, batch_size):
    left = []
    right = []
    sim = []
    # genuine
    for i in range(2):
        n = 1
        l = []
        ch = choice(self.num_idx[i], n*2, replace=False)
        l = ch.tolist()
        left.append(self.to_img(l.pop()))
        right.append(self.to_img(l.pop()))
        sim.append([1])
    # impostor
    for i, j in combinations(range(2), 2):
        left.append(self.to_img(choice(self.num_idx[i])))
        right.append(self.to_img(choice(self.num_idx[j])))
        sim.append([0])
    print("left, right, sim: ", left, right, sim)
Does this look about right to you? With some other changes made, I am able to execute the training script, but it returns this error:
InvalidArgumentError (see above for traceback): Nan in summary histogram for: conv3/weights_1 [[Node: conv3/weights_1 = HistogramSummary[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](conv3/weights_1/tag, conv3/weights/read/_19)]]
I reduced the learning rate to 0.0001 and the momentum to 0.01.
Ah yeah, sorry @lillythomas, at that time I wanted to make the ratio between positive and negative examples 1:1. That involved repeating the positive examples 45 times (which is C(n, k) with n=10 and k=2) and repeating the negative samples 10 times. But the training works even without that balancing. As I'm a bad programmer, I forgot to clean up that messy code.
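To make that arithmetic explicit: with 10 digit classes there are 10 genuine pairs and C(10, 2) = 45 impostor combinations, so repeating the genuine pairs 45 times and the impostor pairs 10 times gives 450 of each:

```python
# Illustration of the 1:1 balancing arithmetic, not code from the repo.
from itertools import combinations

num_classes = 10
genuine_pairs = num_classes                                      # 1-1, 2-2, ...: 10 pairs
impostor_pairs = len(list(combinations(range(num_classes), 2)))  # C(10, 2) = 45 pairs

# repeating the genuine pairs 45x and the impostor pairs 10x gives 450 of each
assert genuine_pairs * impostor_pairs == impostor_pairs * genuine_pairs == 450
```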
For your case with only 2 classes, it should be simpler. I would approach it this way:
def next_batch(self, batch_size):
    left = []
    right = []
    sim = []
    for _ in range(batch_size):
        # genuine pairs 0-0 & 1-1
        for i in range(2):
            ch = choice(self.num_idx[i], 2, replace=False)
            left.append(self.to_img(ch[0]))
            right.append(self.to_img(ch[1]))
            sim.append([1])
        # impostor pairs 0-1 & 1-0
        left.append(self.to_img(choice(self.num_idx[0])))
        right.append(self.to_img(choice(self.num_idx[1])))
        sim.append([0])
        left.append(self.to_img(choice(self.num_idx[1])))
        right.append(self.to_img(choice(self.num_idx[0])))
        sim.append([0])
    print("left, right, sim: ", left, right, sim)
    return left, right, sim
For the InvalidArgumentError, I need more info:
- What is the loss during training? Is it going down?
- Have you normalized the pixels? TensorFlow models usually expect pixel values in the range 0-1, not 0-255 (see the sketch after this list).
- You can also try modifying the weight initializer in https://github.com/ardiya/siamesenetwork-tensorflow/blob/master/model.py#L11-L34. My code uses the Xavier initializer, but you can try others, such as tf.truncated_normal_initializer(stddev=0.01).
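A rough sketch of those last two suggestions (illustration only; model.py builds its conv layers its own way, so treat the layer call below as an example, not the repo's actual code):

```python
# Illustration only, not the exact repo code.
import numpy as np
import tensorflow as tf

# 1. Normalize pixel values into the 0-1 range before feeding them to the network.
def to_img(raw_image):
    return raw_image.astype(np.float32) / 255.0

# 2. Try a different weight initializer than Xavier, e.g. a truncated normal
#    (tf.layers.conv2d is used here only to show where the initializer plugs in).
inputs = tf.placeholder(tf.float32, [None, 256, 256, 3])
net = tf.layers.conv2d(inputs, filters=32, kernel_size=3,
                       kernel_initializer=tf.truncated_normal_initializer(stddev=0.01),
                       name='conv1')
```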