Does using GELU or RELU have a critical impact on the performance of the T0 model?

Open • MenSanYan opened this issue 1 year ago • 1 comment

I noticed that you use GELU in the small models like T0 and T1 but RELU in larger models like T2. Is this intentional, or just an oversight?

MenSanYan, Mar 22 '23 09:03