mnistCRNN

How I can modify your code to support Masking?

Open o0windseed0o opened this issue 8 years ago • 5 comments

Hi James, I found your code https://github.com/jamesmf/mnistCRNN/blob/master/scripts/addMNISTrnn.py helpful for sequence tagging, since it lets us add more complicated layers at each timestep. I am wondering how I can add a Masking layer when my batches have variable length, say, each batch has no more than maxToAdd pictures. A direct approach is to pad the shorter batches with zero matrices so that the input shape to the CNN is fixed. However, I find that Masking only makes sense before the RNN layer, not before the CNN layer.
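The zero-padding approach mentioned above can be sketched roughly as follows. This is a minimal NumPy illustration, not code from the repository: the helper name `pad_image_sequences`, the 28x28 image shape, and the returned boolean mask are all assumptions made for the example.

```python
import numpy as np

def pad_image_sequences(sequences, max_to_add, image_shape=(28, 28)):
    """Zero-pad variable-length image sequences to max_to_add timesteps.

    Hypothetical helper: returns the padded batch and a boolean mask
    that is True for real timesteps and False for padding.
    """
    batch = np.zeros((len(sequences), max_to_add) + image_shape, dtype=np.float32)
    mask = np.zeros((len(sequences), max_to_add), dtype=bool)
    for i, seq in enumerate(sequences):
        n = len(seq)
        if n > max_to_add:
            raise ValueError("sequence %d has %d images, more than max_to_add" % (i, n))
        if n:
            batch[i, :n] = np.asarray(seq, dtype=np.float32)
        mask[i, :n] = True
    return batch, mask

# Two sequences with 2 and 4 images, padded to 4 timesteps.
sequences = [np.ones((2, 28, 28)), np.ones((4, 28, 28))]
batch, mask = pad_image_sequences(sequences, max_to_add=4)
```

Keeping the mask alongside the padded batch records which timesteps are real, which is the information a masking step before the RNN would need.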

Do you have any ideas on how to mask the input layer? Without masking there would be a lot of wasted computation, and the padding might also have side effects on the optimization, right?

o0windseed0o, Jun 16 '16 03:06

I haven't tried to add masking yet, but there have been a number of questions asked about it in the Keras issues section.

As this issue remains open, I'm not sure if it's supported yet. I have only ever used the zero-padding technique.

jamesmf, Jun 16 '16 12:06

Thanks for your quick reply. I added the Masking layer before the RNN layer and it compiles. It seems that masking is well suited to RNN layers but not to convolutional layers, since convolutional layers rely on input images of a fixed shape. I will keep following the issue.
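To illustrate why masking fits the recurrent part, here is a small NumPy sketch of what a mask does inside a recurrent loop: on masked timesteps the hidden state is simply carried forward. It is a conceptual illustration, not Keras code; the function `masked_simple_rnn`, the weight shapes, and the feature vectors (standing in for per-timestep CNN outputs) are assumptions for the example.

```python
import numpy as np

def masked_simple_rnn(inputs, mask, w_in, w_rec):
    """Run a tanh RNN over (timesteps, features) inputs.

    Where mask[t] is False the hidden state is carried forward unchanged,
    so padded timesteps do not influence the final state.
    """
    state = np.zeros(w_rec.shape[0])
    for x_t, keep in zip(inputs, mask):
        candidate = np.tanh(x_t @ w_in + state @ w_rec)
        state = np.where(keep, candidate, state)
    return state

rng = np.random.default_rng(0)
w_in = rng.normal(size=(3, 4))   # hypothetical input weights
w_rec = rng.normal(size=(4, 4))  # hypothetical recurrent weights

real = rng.normal(size=(2, 3))                     # two real timesteps
padded = np.vstack([real, np.full((3, 3), 7.0)])   # padding values are arbitrary on purpose
mask = np.array([True, True, False, False, False])

final_padded = masked_simple_rnn(padded, mask, w_in, w_rec)
final_real = masked_simple_rnn(real, np.array([True, True]), w_in, w_rec)
```

The final state for the padded sequence matches the one for the unpadded sequence, whatever the padding values are. The convolutional layers still have to process every padded frame, because the mask only takes effect in the recurrent loop.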

o0windseed0o, Jun 16 '16 13:06

Hi @o0laika0o, would you mind describing how you achieved masking? I would like to use TimeDistributed layers with variable-length sequences of video frames and want to figure out how to accomplish masking/padding.

Thanks

oakkas, Sep 15 '16 14:09

Hi James, what does the parameter "maxToAdd" mean, and how should I choose its value?

Thanks

Mark0908, Jan 25 '18 07:01

maxToAdd is the number of MNIST digits to add together in this example.

If you're repurposing this code, it would represent the time or sequence dimension. So if you're processing the frames of a video, it would be the frame count.
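As a rough illustration of maxToAdd acting as the sequence dimension, the sketch below builds one training example by stacking `max_to_add` images and using the sum of their labels as the target. It is not the repository's code: `make_sum_example` is a hypothetical helper, and random arrays stand in for the MNIST images and labels.

```python
import numpy as np

def make_sum_example(images, labels, max_to_add, rng):
    """Pick max_to_add random images; the target is the sum of their labels.

    Hypothetical helper: the first axis of the returned array is the
    time/sequence dimension of length max_to_add.
    """
    idx = rng.integers(0, len(images), size=max_to_add)
    return images[idx], int(labels[idx].sum())

rng = np.random.default_rng(0)
images = rng.random((100, 28, 28)).astype(np.float32)  # stand-in for MNIST images
labels = rng.integers(0, 10, size=100)                 # stand-in for digit labels

x, y = make_sum_example(images, labels, max_to_add=8, rng=rng)
```

For video, the same axis would hold the frames of one clip, so `max_to_add` would be set to the frame count.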

jamesmf, Jan 25 '18 14:01