Recurrent-Independent-Mechanisms: questions about performance of sequential MNIST experiment

Open ziyuwwang opened this issue 4 years ago • 2 comments

Hello, I ran the code with the setting "num_units=6, k=4", but I cannot reach the accuracy reported in readme.md. Could you provide the exact hyperparameters for the sequential MNIST experiment? I also noticed that the sequential MNIST experiment applies no embedding to the input, and the input size is set to 1. Is that OK? For example, a pixel with value 0 is indistinguishable from the null input. Looking forward to your reply!

Aug 04 '20 07:08 ziyuwwang

Hi, could you share the results that you are currently getting? I agree that such pixels are not distinguishable from the null input. In the official implementation, a linear transformation is applied to the input first, and the null vector is appended afterwards. You can try that too, but I was able to get the reported results without using a transformation.
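To make the point concrete, here is a minimal sketch in plain Python. It is not the repository's code or the official implementation: `make_linear`, `apply_linear`, and `with_null` are hypothetical helpers, and the embedding size of 8 is an arbitrary choice for illustration. It assumes the linear transformation includes a bias term; without a bias, a pixel of value 0 would still map to the all-zero vector.

```python
import random

def make_linear(in_dim, out_dim, seed=0):
    """Return (weight, bias) for a hypothetical, randomly initialised affine map."""
    rng = random.Random(seed)
    weight = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(out_dim)]
    bias = [rng.uniform(-1, 1) for _ in range(out_dim)]
    return weight, bias

def apply_linear(params, x):
    """Compute weight @ x + bias for a single input vector x."""
    weight, bias = params
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]

def with_null(x):
    """Pair the input with an all-zero null vector of the same size."""
    return [x, [0.0] * len(x)]

# Raw pixel with input size 1: a black pixel equals the null vector.
raw_black = with_null([0.0])
raw_identical = raw_black[0] == raw_black[1]

# Embedded pixel: the bias makes the embedding of a black pixel nonzero,
# so it no longer coincides with the appended null vector.
params = make_linear(1, 8)
emb_black = with_null(apply_linear(params, [0.0]))
emb_identical = emb_black[0] == emb_black[1]

print("raw black pixel equals null:", raw_identical)
print("embedded black pixel equals null:", emb_identical)
```

In this sketch the separation between a black pixel and the null input comes entirely from the bias, which is why the order of operations (transform first, then append the null vector) matters.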

Aug 05 '20 10:08 dido1998

It's strange: with exactly the code you released, I get about "0.77, 0.55, 0.31" for test resolutions 16*16, 19*19, 24*24. After I add an embedding layer that transforms the input into 600-dimensional vectors and change the learning rate to 0.0001, the results become around "0.88, 0.70, 0.44", which is close to the results you report in readme.md.

Aug 06 '20 08:08 ziyuwwang
