siamesenetwork-tensorflow

Running on non-MNIST greyscale images

Open lillythomas opened this issue 8 years ago • 3 comments

Hi ardiya!

Thanks for putting together this repository. Could you tell me what would need to be done to run this on non-MNIST greyscale images?

Thanks so much in advance! Lilly

lillythomas, Nov 13 '17 06:11

Hi @lillythomas, to train on your own dataset you'll need to change dataset.py, specifically https://github.com/ardiya/siamesenetwork-tensorflow/blob/master/dataset.py#L31-L48. The next_batch function needs to return a list of image pairs and a list of similar/dissimilar labels. You will also need to change the proportion of similar to dissimilar cases. In my case, for every batch I generate a similar pair for every digit (1-1, 2-2, 3-3, ...), then I generate every combination of dissimilar digits (1-2, 1-3, 1-4, ..., 2-3, 2-4, 2-5, ...).
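The pairing scheme described above can be sketched as a standalone function. This is only a minimal illustration under assumptions, not the repository's code: `make_pair_batch`, `images_by_class`, and the random toy data are hypothetical names and values, and the function takes a plain list of per-class image arrays rather than the repository's dataset object.

```python
from itertools import combinations

import numpy as np


def make_pair_batch(images_by_class, rng):
    """Return (left, right, sim) arrays with one genuine pair per class
    and one impostor pair per unordered pair of classes."""
    left, right, sim = [], [], []

    # Genuine pairs: two distinct images drawn from the same class.
    for images in images_by_class:
        a, b = rng.choice(len(images), size=2, replace=False)
        left.append(images[a])
        right.append(images[b])
        sim.append([1])

    # Impostor pairs: one image from each of two different classes.
    for i, j in combinations(range(len(images_by_class)), 2):
        left.append(images_by_class[i][rng.integers(len(images_by_class[i]))])
        right.append(images_by_class[j][rng.integers(len(images_by_class[j]))])
        sim.append([0])

    return np.array(left), np.array(right), np.array(sim)


# Hypothetical toy data: 3 classes, 5 random 8x8 single-channel images each.
rng = np.random.default_rng(0)
images_by_class = [rng.random((5, 8, 8, 1)) for _ in range(3)]
left, right, sim = make_pair_batch(images_by_class, rng)
```

With N classes this yields N genuine pairs and N*(N-1)/2 impostor pairs per call, so the similar/dissimilar proportion shifts as the number of classes changes.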

To change this from b/w images to RGB images, you'll need to change the dimension from (28,28,1) to (WIDTH, HEIGHT, 3) in:

  • https://github.com/ardiya/siamesenetwork-tensorflow/blob/master/dataset.py#L20
  • https://github.com/ardiya/siamesenetwork-tensorflow/blob/master/train.py#L15
  • https://github.com/ardiya/siamesenetwork-tensorflow/blob/master/train.py#L19-L20
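For greyscale images, the original question, the channel dimension presumably stays at 1 and only the spatial size changes. A small sketch of getting arrays into a (rows, cols, channels) layout follows. `add_channel_axis` is a hypothetical helper, not part of the repository, and the 256x256 size is only an example.

```python
import numpy as np


def add_channel_axis(image):
    """Return a greyscale image as (rows, cols, 1); leave (rows, cols, C) untouched."""
    image = np.asarray(image)
    if image.ndim == 2:
        return image[..., np.newaxis]
    return image


grey = np.zeros((256, 256))      # hypothetical greyscale image
rgb = np.zeros((256, 256, 3))    # hypothetical RGB image
grey_shape = add_channel_axis(grey).shape
rgb_shape = add_channel_axis(rgb).shape
```

Whatever shape comes out here is the one the three linked locations would need to agree on.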

ardiya, Nov 13 '17 07:11

Hi @ardiya thank you for responding. I believe that I have made the changes you suggested, but I am a bit confused as to what this line (https://github.com/ardiya/siamesenetwork-tensorflow/blob/f81f2bee37a150391e2f895391390f59c4f4f5f2/dataset.py#L37) refers to.

My data is (256, 256, 3) in shape and is distributed between two classes. As such, my code for that function is currently:

def next_batch(self, batch_size):
    left = []
    right = []
    sim = []
    # genuine
    for i in range(2):
        n = 1
        l = []
        ch = choice(self.num_idx[i], n*2, replace=False)
        l = ch.tolist()
        left.append(self.to_img(l.pop()))
        right.append(self.to_img(l.pop()))
        sim.append([1])
    # impostor
    for i, j in combinations(range(2), 2):
        left.append(self.to_img(choice(self.num_idx[i])))
        right.append(self.to_img(choice(self.num_idx[j])))
        sim.append([0])
    print("left, right, sim: ", left, right, sim)

Does this look about right to you? With some other changes made, I am able to execute the training script but with this error returned:

InvalidArgumentError (see above for traceback): Nan in summary histogram for: conv3/weights_1 [[Node: conv3/weights_1 = HistogramSummary[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](conv3/weights_1/tag, conv3/weights/read/_19)]]

I reduced the learning rate to 0.0001 and the momentum to 0.01.

lillythomas, Nov 16 '17 18:11

Ah yeah, sorry @lillythomas. At that time I wanted the ratio between positive and negative examples to be 1:1. That means repeating the positive examples 45 times (nCk with n=10 and k=2) and repeating the negative examples 10 times. But the training works even without that ratio. As I'm a bad programmer, I forgot to clean up that messy code.
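The arithmetic behind that 1:1 ratio can be checked in a few lines. This is just a worked example of the counts mentioned above, assuming one genuine pair per digit and one impostor pair per unordered digit combination. The variable names are illustrative.

```python
from math import comb

num_classes = 10
genuine_pairs = num_classes              # 0-0, 1-1, ..., 9-9
impostor_pairs = comb(num_classes, 2)    # unordered pairs of different digits

# Repeat each side by the other side's count so both totals match.
genuine_total = genuine_pairs * impostor_pairs
impostor_total = impostor_pairs * genuine_pairs
```

Both totals come out to 450, which is where the repeat counts of 45 and 10 come from.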

For your case with only 2 classes, it should be simpler. I would approach it this way:

def next_batch(self, batch_size):
    left = []
    right = []
    sim = []
    # each iteration adds 4 pairs: 2 genuine and 2 impostor
    for _ in range(batch_size):
        # genuine pairs 0-0 & 1-1
        for i in range(2):
            ch = choice(self.num_idx[i], 2, replace=False)
            left.append(self.to_img(ch[0]))
            right.append(self.to_img(ch[1]))
            sim.append([1])
        # impostor pairs 0-1 & 1-0
        left.append(self.to_img(choice(self.num_idx[0])))
        right.append(self.to_img(choice(self.num_idx[1])))
        sim.append([0])
        left.append(self.to_img(choice(self.num_idx[1])))
        right.append(self.to_img(choice(self.num_idx[0])))
        sim.append([0])
    print("left, right, sim: ", left, right, sim)

For the InvalidArgumentError, I need more info:

  • What is the loss during training? Is it going down?
  • Have you normalized the pixels? Usually, pixel values fed to TensorFlow are in the range 0-1, not 0-255.
  • You can also try modifying the weight initializer in https://github.com/ardiya/siamesenetwork-tensorflow/blob/master/model.py#L11-L34. My code uses the Xavier initializer, but you can try others such as tf.truncated_normal_initializer(stddev=0.01).
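The pixel normalization point can be sketched with NumPy alone. This is a minimal illustration, assuming the images are loaded as 8-bit arrays. `normalize_pixels` and `check_finite` are hypothetical helpers, not functions from the repository, and the random batch stands in for real data.

```python
import numpy as np


def normalize_pixels(batch):
    """Scale 8-bit pixel values from [0, 255] to floats in [0, 1]."""
    return np.asarray(batch, dtype=np.float32) / 255.0


def check_finite(name, array):
    """Raise early if an array contains NaN or infinity."""
    if not np.all(np.isfinite(array)):
        raise ValueError(f"{name} contains NaN or infinite values")


# Hypothetical batch of four 256x256 RGB images with 8-bit pixels.
raw = np.random.default_rng(0).integers(0, 256, size=(4, 256, 256, 3), dtype=np.uint8)
batch = normalize_pixels(raw)
check_finite("batch", batch)
```

Running a check like `check_finite` on the inputs before training helps rule out bad data as the source of NaN values, leaving the learning rate and weight initialization as the remaining suspects.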

ardiya, Nov 16 '17 20:11