piaohe20221128
Results: 2 issues of piaohe20221128
Hi, thank you for releasing the code! I have a question about the difference in pipeline between training and inference. The paper says that in the inference stage the predicted output of every decoder layer...
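If the branches of the multi-branch structure contain nonlinear operations such as activation functions or gating, can they still be merged? Thanks in advance for your answer!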