DPTNet
[Question] Did you evaluate the performance gain of using the improved Transformer layer instead of the standard Transformer layer?
Hi, I am curious about how important the proposed improved Transformer layer is compared to the standard one (without positional encoding), but I couldn't find related information in the paper.
I think I have the answer now. I tried replacing the RNN with a feedforward layer, and it seems to converge very slowly.
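For concreteness, here is a minimal PyTorch sketch of the idea behind the improved layer: an RNN replaces the first linear of the feed-forward block, which injects order information and removes the need for positional encoding. All dimensions and hyperparameters below are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class ImprovedTransformerLayer(nn.Module):
    """Sketch of an improved Transformer encoder layer in the DPTNet style:
    the first linear of the feed-forward block is swapped for a bidirectional
    LSTM. Hyperparameters are placeholders for illustration."""

    def __init__(self, d_model=64, nhead=4, hidden=128):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        # RNN in place of the usual Linear -> ReLU; it models sequence order,
        # so no positional encoding is added to the input.
        self.rnn = nn.LSTM(d_model, hidden, batch_first=True, bidirectional=True)
        self.ff = nn.Linear(2 * hidden, d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):  # x: (batch, time, d_model)
        a, _ = self.attn(x, x, x)
        x = self.norm1(x + a)          # residual + layer norm after attention
        h, _ = self.rnn(x)
        x = self.norm2(x + self.ff(torch.relu(h)))  # residual feed-forward path
        return x

layer = ImprovedTransformerLayer()
out = layer(torch.randn(2, 100, 64))
print(out.shape)  # torch.Size([2, 100, 64])
```

Replacing `self.rnn` with a plain `nn.Linear(d_model, 2 * hidden)` reduces this to a standard Transformer layer without positional encoding, which is the slow-converging variant described above.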