
Make dsnt() and _normalise_heatmap() accept multi-channel tensors

Open offchan42 opened this issue 5 years ago • 9 comments

Both functions now accept a 4D tensor of shape [batch_size, height, width, channels] instead of [batch_size, height, width, 1]. This lets the user predict more than one (x, y) coordinate by feeding in a tensor with multiple activation maps. The output has shape [batch_size, channels, 2], where channels is the number of output coordinates.

Also, allow the user to choose the output range: either -1 to 1 or 0 to 1.
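
To make the shapes concrete, here is a minimal NumPy sketch of a multi-channel DSNT. It is not the code from this branch: the function name `dsnt_multichannel`, the `zero_to_one` flag, and the pixel-centre coordinate grid are assumptions made for illustration only.

```python
import numpy as np

def dsnt_multichannel(heatmaps, zero_to_one=False):
    """Illustrative DSNT over [batch, height, width, channels] logits.

    Hypothetical helper, not the repository's API.
    Returns (x, y) coordinates of shape [batch, channels, 2].
    """
    b, h, w, c = heatmaps.shape

    # Softmax over the spatial axes, independently for each channel.
    flat = heatmaps.reshape(b, h * w, c)
    flat = flat - flat.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(flat)
    probs /= probs.sum(axis=1, keepdims=True)
    probs = probs.reshape(b, h, w, c)

    # Assumed coordinate grids: pixel centres spread over [-1, 1].
    xs = (2 * np.arange(w) + 1) / w - 1
    ys = (2 * np.arange(h) + 1) / h - 1

    # Expected x and y under each channel's probability map.
    x = np.einsum('bhwc,w->bc', probs, xs)
    y = np.einsum('bhwc,h->bc', probs, ys)
    coords = np.stack([x, y], axis=-1)  # [batch, channels, 2]

    if zero_to_one:
        coords = (coords + 1) / 2  # map [-1, 1] onto [0, 1]
    return coords

coords = dsnt_multichannel(np.random.rand(2, 32, 32, 3))
print(coords.shape)  # (2, 3, 2)
```

Each channel is normalised and reduced on its own, so adding channels only adds rows to the output and never mixes heatmaps.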

offchan42 avatar Oct 01 '19 09:10 offchan42

I have tested the code on dummy data containing two circles, training a model to predict their positions, and it works. (image) The red and green dots are the model's predictions. The red dot is supposed to be inside the bigger circle and the green dot inside the smaller circle. I obtained subpixel accuracy using the following model:

import dsnt  # the dsnt module from this repository
from tensorflow import keras as kr

# x_train holds the training images, shape [n, 32, 32, 1] (see the dataset note below)
model = kr.Sequential([
    kr.layers.Conv2D(16, 5, strides=1, padding='same', activation='relu', input_shape=x_train.shape[1:]),
    kr.layers.Dropout(0.25),
    kr.layers.Conv2D(16, 5, strides=1, padding='same', activation='relu'),
    kr.layers.Dropout(0.25),
    kr.layers.Conv2D(16, 5, strides=1, padding='same', activation='relu'),
    kr.layers.Dropout(0.25),
    kr.layers.Conv2D(16, 5, strides=1, padding='same', activation='relu'),
    kr.layers.Dropout(0.25),
    # Two output channels: one heatmap per circle
    kr.layers.Conv2D(2, 5, padding='same'),
    # dsnt.dsnt() returns (normalised heatmaps, coordinates); keep only the coordinates
    kr.layers.Lambda(lambda x: dsnt.dsnt(x, 'softmax')[1]),
])
model.compile(kr.optimizers.Adam(0.001), loss='mse', metrics=['mae'])

Dataset: 5,000 training images of shape 32x32x1 and 5,000 training labels of shape 2x2 (two points, each an x, y pair)

offchan42 avatar Oct 01 '19 09:10 offchan42

But I'm not sure whether js_reg_loss will work with multi-channel input, because I haven't tried it yet; it's not trivial for me to use a custom loss of this type in Keras. If you could give some insight, that would be great.

offchan42 avatar Oct 01 '19 10:10 offchan42

Are there any updates on this branch? I'm interested in reviewing it because I also need this feature.

ysyyork avatar Nov 21 '19 22:11 ysyyork

You can use it fine, except that there is no regularization loss yet because I'm not familiar with it. So don't use regularization such as js_reg_loss. But I'm sure the multi-channel feature is working.

offchan42 avatar Nov 22 '19 07:11 offchan42

great thx!

ysyyork avatar Nov 22 '19 16:11 ysyyork

I found this example model really useful. I think it would make sense to add the code you used to generate that artificial dataset together with the code to fit this model.

hjpulkki avatar Apr 15 '20 05:04 hjpulkki

@hjpulkki The dataset can be created simply by making a black image and then drawing a circle at a random location on it with the cv2.circle() function. Use that random location as Y to train the model. The random location must be divided by the image size so that the values range from 0 to 1. The training code is just a call to fit for a few epochs. The hyperparameters and learning rate are already shown in the model.compile call above. I've lost the original code.
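
Since the original script is lost, the following is only a rough reconstruction of the recipe described above. To stay dependency-free it draws the circles with a NumPy mask instead of cv2.circle(). The function name `make_two_circle_dataset`, the radii, the default sample count, and the choice of filled white circles are all assumptions.

```python
import numpy as np

def make_two_circle_dataset(n=100, size=32, radii=(6, 3), seed=0):
    """Hypothetical generator: one big and one small circle per image.

    Returns images of shape [n, size, size, 1] and labels of shape
    [n, 2, 2], where each label row is (x, y) divided by the image size.
    """
    rng = np.random.default_rng(seed)
    yy, xx = np.mgrid[0:size, 0:size]
    images = np.zeros((n, size, size, 1), dtype=np.float32)
    labels = np.zeros((n, len(radii), 2), dtype=np.float32)
    for i in range(n):
        for k, r in enumerate(radii):
            # Random centre, chosen so the whole circle stays inside the image.
            cx, cy = rng.integers(r, size - r, size=2)
            # Filled disc; stands in for cv2.circle() with a negative thickness.
            mask = (xx - cx) ** 2 + (yy - cy) ** 2 <= r ** 2
            images[i, ..., 0][mask] = 1.0
            # Normalise the centre to the 0 to 1 range.
            labels[i, k] = (cx / size, cy / size)
    return images, labels

x_train, y_train = make_two_circle_dataset(n=100)
print(x_train.shape, y_train.shape)  # (100, 32, 32, 1) (100, 2, 2)
```

This sketch lets the two circles overlap. Whether the original data did the same is unknown, so rejecting overlapping samples may be worth trying if training is unstable.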

offchan42 avatar Apr 15 '20 14:04 offchan42

it works, thanks

guker avatar Aug 07 '20 13:08 guker

Regarding the missing regularization loss for multi-channel heatmaps: I merged the channels with the batches. What do you think of this solution?

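import tensorflow as tf  # meant to live in dsnt.py, next to _make_gaussians and _js_2d

def js_reg_loss(heatmaps, centres, fwhm=1):
    '''
    Calculates and returns the average Jensen-Shannon divergence between heatmaps and target Gaussians.
    Arguments:
        heatmaps - Heatmaps generated by the model, shape [batch, height, width, channels]
        centres - Centres of the target Gaussians (in normalized units), shape [batch, channels, 2]
        fwhm - Full-width-half-maximum for the drawn Gaussians, which can be thought of as a radius.
    '''
    # Use dynamic shapes so that an unknown (None) batch size still works
    shape = tf.shape(heatmaps)
    batch, height, width, channels = shape[0], shape[1], shape[2], shape[3]
    # [batch, height, width, channels] -> [batch, channels, height, width]
    # (tf.transpose takes `perm`, not `axes`)
    heatmaps_transposed = tf.transpose(heatmaps, perm=[0, 3, 1, 2])
    # Fold the channels into the batch axis so each heatmap is treated as one sample
    heatmaps_reshape = tf.reshape(heatmaps_transposed, (batch * channels, height, width))
    centres_reshape = tf.reshape(centres, (batch * channels, 2))
    gauss = _make_gaussians(centres_reshape, height, width, fwhm)
    divergences = _js_2d(heatmaps_reshape, gauss)
    return tf.reduce_mean(divergences)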
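
A quick way to sanity-check the merge is to replay the same transpose and reshape in NumPy and confirm that every heatmap still lines up with its centre. The snippet below is only an illustration with made-up shapes and values; it does not call the library.

```python
import numpy as np

batch, height, width, channels = 2, 3, 4, 5
heatmaps = np.arange(batch * height * width * channels, dtype=np.float32)
heatmaps = heatmaps.reshape(batch, height, width, channels)
centres = np.arange(batch * channels * 2, dtype=np.float32).reshape(batch, channels, 2)

# Move the channels next to the batch axis, then fold the two together.
merged = heatmaps.transpose(0, 3, 1, 2).reshape(batch * channels, height, width)
centres_merged = centres.reshape(batch * channels, 2)

# Sample b, channel c ends up at row b * channels + c in both arrays.
for b in range(batch):
    for c in range(channels):
        row = b * channels + c
        assert np.array_equal(merged[row], heatmaps[b, :, :, c])
        assert np.array_equal(centres_merged[row], centres[b, c])

# Reshaping without the transpose would mix pixels from different channels.
naive = heatmaps.reshape(batch * channels, height, width)
assert not np.array_equal(naive, merged)
```

The transpose is the important step: it makes each channel's pixels contiguous before the reshape, which is why the heatmaps and the centres can be flattened with the same row order.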

kbamps avatar Jun 23 '21 17:06 kbamps