Jittor-MLP
question
Thank you for writing this code! I would like to know why this line is written this way and what the padding operation acts on. https://github.com/liuruiyang98/Jittor-MLP/blob/b86656b65cf5f18ba9eb760d1f7565ed95e7e96e/models_pytorch/morph_mlp.py#L23
Thanks for your interest.
As of now, MorphMLP has not been officially open-sourced. I implemented it based on Figures 3 and 4 of the paper, according to my own understanding.
As for the padding operation, the code in lines 51-54 handles the case where H or W is not divisible by L. This detail is not discussed in the paper, so I added it myself.
I will keep an eye on MorphMLP's open-source status and update this implementation to match the official version once it is released, to ensure correctness.
```python
P_l, P_r = (self.L - W % self.L) // 2, (self.L - W % self.L) - (self.L - W % self.L) // 2
P_t, P_b = (self.L - H % self.L) // 2, (self.L - H % self.L) - (self.L - H % self.L) // 2
```
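To make the split concrete, here is a minimal sketch of that padding arithmetic in plain Python. The function name `symmetric_pad_amounts` is illustrative, not from the repo; note that I add an extra `% L` so an already-divisible size yields zero padding, whereas the expression quoted above would pad a full `L` in that case (in the repo it is presumably guarded by a divisibility check):

```python
def symmetric_pad_amounts(size: int, L: int) -> tuple[int, int]:
    """Padding for one spatial axis so that `size` becomes divisible by L.

    (L - size % L) % L is the total padding needed; it is split as evenly
    as possible, with any extra pixel going to the second (right/bottom) side.
    """
    total = (L - size % L) % L  # 0 when size is already a multiple of L
    first = total // 2          # P_l (or P_t)
    second = total - first      # P_r (or P_b)
    return first, second

# Example: W = 10, L = 4 -> 2 extra columns needed, split 1 left / 1 right.
print(symmetric_pad_amounts(10, 4))  # (1, 1)
# Example: H = 7, L = 4 -> 1 extra row, placed at the bottom.
print(symmetric_pad_amounts(7, 4))   # (0, 1)
```

The padded height and width are then multiples of `L`, so the feature map splits cleanly into length-`L` chunks.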
Thank you for your answer. My understanding of L is that the input is flattened by row or column and then split into chunks of length L. Do you agree?
We set chunk length to L and thus obtain X_i ∈ R^{L×C}, where i ∈ {1, ..., HW/L}.
Yes, you are right. When I read the paper, I was quite confused here. From Figure 3, L is the group size along the H and W directions, but given Table 7 and the network configuration, how can L = 49 be achieved in stage 4? That is only possible if the 7×7 feature map is completely flattened.
Maybe the meaning of L is easier to understand if the feature map is represented as (B, HW, C), as in MLP-Mixer, instead of (B, C, H, W) as in Figure 3. 😂
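That (B, HW, C) reading can be sketched in a few lines of plain Python. This is an illustration of the flatten-then-chunk interpretation discussed above, not code from the repo; a single channel is used for simplicity and the function name is made up:

```python
def flatten_and_chunk(feature_map, L):
    """feature_map: H x W nested list (one channel, for illustration).
    Flatten row-major to length H*W, then split into chunks of length L,
    giving HW/L chunks, matching X_i in R^{L x C} with i in {1, ..., HW/L}."""
    flat = [v for row in feature_map for v in row]
    assert len(flat) % L == 0, "HW must be divisible by L (hence the padding)"
    return [flat[i:i + L] for i in range(0, len(flat), L)]

# A 4x4 "feature map" with L = 8 yields HW/L = 16/8 = 2 chunks.
fmap = [[r * 4 + c for c in range(4)] for r in range(4)]
chunks = flatten_and_chunk(fmap, L=8)
print(len(chunks))   # 2
print(chunks[0])     # [0, 1, 2, 3, 4, 5, 6, 7]
```

With H = W = 7 and L = 49, this gives exactly one chunk, i.e. the stage-4 feature map is completely flattened, consistent with the observation above.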