MatchingNetworks
Clarification about implementation
Sorry to use the bug tracker for this; it's actually more of a question. How did you interpret the concatenation of the hidden state and the readout in equation 3 of the paper? It seems to me the state has twice the required shape after the concatenation; how is one supposed to handle that?
Your initial state should contain twice as many zeros. Then you can concatenate directly and get the expected size.
It appears I have to change my implementation a bit: I just noticed a minor difference, which shouldn't affect results too much. This only concerns the full context embeddings case.
Actually, scratch what I said before. In practice it's not working as intended, since the size keeps increasing at every step. For now I have implemented the concatenation as a summation instead.
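For concreteness, here is a quick shape check (the embedding size of 64 is an assumption for illustration, not the repo's value) showing why a literal concatenation grows without bound, and why the summation workaround keeps the sizes fixed:

```python
import numpy as np

d = 64  # embedding size (assumed for illustration)

# Reading eq. 3 literally: concatenating the readout into the hidden
# state makes it grow by d at every processing step.
h = np.zeros(d)
for k in range(3):
    r = np.zeros(d)               # attention readout, always size d
    h = np.concatenate([h, r])    # hidden state grows by d each step
h_concat = h                      # shape (256,): size keeps increasing

# The workaround from this thread: replace the concatenation with a
# summation, which keeps every vector at size d.
h = np.zeros(d)
for k in range(3):
    r = np.zeros(d)
    h = h + r
h_sum = h                         # shape (64,): size stays fixed
```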
Thanks for looking into it. I think the paper is lacking some details needed for a faithful reimplementation.
For what it's worth, the paper by H. Altae-Tran, B. Ramsundar, A. S. Pappu, and V. Pande, "Low Data Drug Discovery with One-Shot Learning," ACS Central Science, vol. 3, no. 4, pp. 283–293, 2017, offers an interpretation of how f should work that I think makes sense. They propose a refined version, but I imagine the vanilla matching network would have an equation 3 like:
Basically, the hidden state/output of the LSTM is an additive correction over the original input vector (as implied by eq. 4).
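A minimal NumPy sketch of this reading follows. Everything here is an illustrative assumption rather than the repo's actual code: the sizes, the random weights, and the toy single-matrix LSTM cell. It combines the summation workaround from earlier in this thread (readout folded into the LSTM input instead of concatenated into the hidden state) with the additive skip connection of eq. 4, so every vector stays at size d:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64          # embedding size (assumed)
n_support = 5   # support set size (assumed)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def lstm_cell(x, h, c, W):
    # Toy LSTM cell with one stacked weight matrix W of shape (4d, 2d);
    # illustrative only, no biases or learned parameters.
    z = W @ np.concatenate([x, h])
    i, f, o, g = np.split(z, 4)
    sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))
    c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h = sigmoid(o) * np.tanh(c)
    return h, c

fx = rng.standard_normal(d)                       # f'(x): query embedding
g_support = rng.standard_normal((n_support, d))   # g(x_i): support embeddings
W = rng.standard_normal((4 * d, 2 * d)) * 0.1

h = np.zeros(d)
c = np.zeros(d)
for k in range(3):  # K processing steps
    # Attention readout over the support set (eq. 5).
    a = softmax(g_support @ h)
    r = a @ g_support
    # Feed the readout additively instead of concatenating,
    # so the hidden state stays at size d throughout.
    h_hat, c = lstm_cell(fx + r, h, c, W)
    # Additive correction over the original input vector (eq. 4).
    h = h_hat + fx
```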
I have put an implementation of this method here if you want to try it out. I haven't run it on Omniglot, but on my data the fully conditional embedding has no benefit whatsoever.
Sorry to bump this, but have you had any time to look into it?