Positional encoding
Hello, thank you for your great work. I am studying your code in transformer.py: you only use encoder_layers, and the normalize_before flag decides whether positional encoding is applied. You set normalize_before to False everywhere. So, do you actually use positional encoding? If I have misunderstood, please let me know. Thank you.
Thanks for pointing out this issue. In our released code, we found that positional encoding affects the final performance insignificantly (about 0.5%) but can cause instability during training and across other datasets. We set the flag to False in this released version; the original code with True is commented out, as you point out. Given your concern, we have rolled this setting back to make it consistent with the paper.
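For readers unfamiliar with how such a flag typically works, here is a minimal sketch of an encoder layer where a single boolean decides whether sinusoidal positional encoding is added before attention. This is not the repository's actual transformer.py; the class names, the use_pos_enc flag, and the module layout are illustrative assumptions standing in for the normalize_before switch discussed above.

```python
# Illustrative sketch only: all names (SinusoidalPositionalEncoding,
# EncoderLayerSketch, use_pos_enc) are hypothetical, not the repo's API.
import math
import torch
import torch.nn as nn


class SinusoidalPositionalEncoding(nn.Module):
    """Fixed sine/cosine positional encoding (Vaswani et al., 2017)."""

    def __init__(self, d_model: int, max_len: int = 5000):
        super().__init__()
        position = torch.arange(max_len).unsqueeze(1)  # (max_len, 1)
        div_term = torch.exp(
            torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model)
        )
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        self.register_buffer("pe", pe)  # fixed table, not a learned parameter

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model); add encodings for the first seq_len positions
        return x + self.pe[: x.size(1)].unsqueeze(0)


class EncoderLayerSketch(nn.Module):
    """Toy encoder layer where one flag gates positional encoding."""

    def __init__(self, d_model: int = 256, nhead: int = 8, use_pos_enc: bool = False):
        super().__init__()
        self.use_pos_enc = use_pos_enc  # analogous to the flag discussed in this issue
        self.pos_enc = SinusoidalPositionalEncoding(d_model)
        self.layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.use_pos_enc:
            x = self.pos_enc(x)  # inject positions only when the flag is True
        return self.layer(x)


if __name__ == "__main__":
    x = torch.randn(2, 10, 256)  # (batch, seq_len, d_model)
    print(EncoderLayerSketch(use_pos_enc=False)(x).shape)  # torch.Size([2, 10, 256])
    print(EncoderLayerSketch(use_pos_enc=True)(x).shape)   # torch.Size([2, 10, 256])
```

Flipping the flag changes only whether the position table is added to the input; the output shape and the rest of the layer are unchanged, which is why toggling it has a small effect on accuracy but can still affect training stability.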
Thanks again for the detailed questions about our work. If you run into any other problems, feel free to email me.