LoFTR
About the structure of encoder layer
Thanks for your great work!
In the supplementary material, in place of the standard Post-LN layer, you propose an encoder-layer variant that converges quickly without extensive hyperparameter tuning; the difference between your layer and the Post-LN layer is depicted in Figure 1 of the supplementary material. I noticed that your proposed layer also differs from the well-known Pre-LN encoder layer introduced in "On Layer Normalization in the Transformer Architecture", where Figure 1 depicts the difference between the Pre-LN and Post-LN layers.
Have you tried the Pre-LN layer? May I ask why you use your proposed layer instead of the Pre-LN one?
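For context, here is a minimal NumPy sketch of the Post-LN vs. Pre-LN block structure I am referring to (this is my own illustration of the two well-known variants, not LoFTR's actual code; `sublayer` stands in for attention or the feed-forward network):

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Plain LayerNorm over the last (feature) dimension, without learned scale/shift.
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def post_ln_block(x, sublayer):
    # Post-LN (original Transformer): normalize AFTER the residual addition,
    # so LayerNorm sits on the main residual path.
    return layer_norm(x + sublayer(x))

def pre_ln_block(x, sublayer):
    # Pre-LN: normalize only the sublayer's input; the residual path itself
    # stays an identity, which is what makes training easier to stabilize.
    return x + sublayer(layer_norm(x))
```

With a zero sublayer, the Pre-LN block is exactly the identity, while the Post-LN block still renormalizes its input, which makes the structural difference easy to see.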