Questions about performance of the sequential MNIST experiment

Open ziyuwwang opened this issue 4 years ago • 2 comments

Hello, I ran the code with the setting "num_units=6, k=4" and I cannot reach the accuracy reported in readme.md. Could you provide the exact hyper-parameters for the sequential MNIST experiment? Also, I noticed that the input is not embedded in the sequential MNIST experiment and the input size is set to 1. Is that okay? For example, a pixel with value 0 is indistinguishable from the null input. Looking forward to your reply!

ziyuwwang avatar Aug 04 '20 07:08 ziyuwwang

Hi, could you share the results you are currently getting? I agree that a zero-valued pixel is not distinguishable from the null input. In the official implementation, a linear transformation is applied to the input first, and the null vector is appended afterwards. You can try that as well, but I was able to get these results without the transformation.
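
To illustrate the point about zero pixels and the null input, here is a minimal sketch in plain Python. It is not code from this repository or from the official implementation: the helper names `embed` and `build_attention_input` are hypothetical, the embedding size of 8 is arbitrary, and the weights are random. It assumes the transformation includes a bias term, since a bias-free linear map would still send a pixel value of 0 to the zero vector.

```python
import random


def embed(pixel, weights, bias):
    """Hypothetical affine embedding of a scalar pixel: weights * pixel + bias."""
    return [w * pixel + b for w, b in zip(weights, bias)]


def build_attention_input(x_vec):
    """Stack the real input row with an all-zero null row of the same width."""
    null_row = [0.0] * len(x_vec)
    return [x_vec, null_row]


# Case 1: raw pixel with input size 1. A black pixel (0.0) produces a row
# that is identical to the null row.
raw_rows = build_attention_input([0.0])
raw_collides = raw_rows[0] == raw_rows[1]

# Case 2: the pixel is embedded first (assumed: affine map with nonzero bias),
# and the null row is appended afterwards.
rng = random.Random(0)
dim = 8  # arbitrary size chosen for this sketch
weights = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
bias = [rng.uniform(0.1, 1.0) for _ in range(dim)]  # strictly nonzero
emb_rows = build_attention_input(embed(0.0, weights, bias))
emb_collides = emb_rows[0] == emb_rows[1]

print("raw zero pixel equals null row:", raw_collides)
print("embedded zero pixel equals null row:", emb_collides)
```

With the embedding in place, a zero pixel maps to the bias vector rather than to the zero vector, so it no longer coincides with the null row.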

dido1998 avatar Aug 05 '20 10:08 dido1998

It's strange: with exactly the code you released, I get about "0.77, 0.55, 0.31" for test resolutions 16*16, 19*19, 24*24. After I added an embedding layer that transforms the input into 600-dimensional vectors and changed the learning rate to 0.0001, the results became around "0.88, 0.70, 0.44", which is close to the results you report in readme.md.

ziyuwwang avatar Aug 06 '20 08:08 ziyuwwang