
About the structure of the encoder layer

Open wtyuan96 opened this issue 2 years ago • 0 comments

Thanks for your great work!

In the supplementary material, you propose an encoder-layer variant that, unlike the Post-LN layer, converges quickly without extensive hyperparameter tuning. The difference between your layer and the Post-LN layer is depicted in Figure 1 of the supplementary material. I noticed that your proposed layer also differs from the well-known Pre-LN encoder layer introduced in "On Layer Normalization in the Transformer Architecture"; the difference between the Pre-LN and Post-LN layers is depicted in Figure 1 of that paper.
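For readers unfamiliar with the two standard orderings being contrasted, here is a minimal NumPy sketch of where LayerNorm sits in each. This is only an illustration of the generic Post-LN/Pre-LN structure, not code from LoFTR; `sublayer` stands in for attention or the feed-forward network, and the helper names are mine:

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    # Normalize over the last (feature) dimension; no learned scale/shift for brevity.
    mu = x.mean(-1, keepdims=True)
    sigma = x.std(-1, keepdims=True)
    return (x - mu) / (sigma + eps)

def post_ln_block(x, sublayer):
    # Post-LN (original Transformer): LayerNorm AFTER the residual addition.
    return layer_norm(x + sublayer(x))

def pre_ln_block(x, sublayer):
    # Pre-LN: LayerNorm on the sublayer input; the residual path stays unnormalized.
    return x + sublayer(layer_norm(x))
```

Because Pre-LN keeps an identity residual path, it is known to train stably without warmup, which is why it is the natural baseline to compare the proposed variant against.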

Have you tried the Pre-LN layer, and may I ask why you use the proposed layer instead of the Pre-LN layer?

wtyuan96 — Apr 08 '22 09:04