ERNIE
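If the general-purpose model keeps being pre-trained, do the task-specific layers for downstream tasks need to be retrained, even briefly?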
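This occurred to me while reading the paper. If retraining is not needed, why not? How much does the task layer's performance change, and could the downstream task get worse?
If retraining is needed, how long would another round take?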
Fine-tuning on the downstream task works better than using the model zero-shot.
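To make the question concrete, here is a toy sketch in plain Python, not ERNIE code. Everything in it is an assumption made for illustration: `encode` stands in for a pretrained encoder, `encoder_v1` and `encoder_v2` are hypothetical checkpoints before and after continued pre-training, and the representation shift between them is deliberately exaggerated (a sign flip). It only shows why a task layer fitted to the old features may stop matching the new ones, and that refitting it on the new features recovers the task.

```python
import math
import random

def encode(x, weights):
    # Toy stand-in for a pretrained encoder: a linear map followed by tanh.
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]

def score(head, feature):
    w, b = head
    return sum(wi * fi for wi, fi in zip(w, feature)) + b

def train_head(features, labels, epochs=50, lr=0.5):
    # Logistic-regression task layer trained with plain SGD on frozen features.
    head = ([0.0] * len(features[0]), 0.0)
    for _ in range(epochs):
        for f, y in zip(features, labels):
            p = 1.0 / (1.0 + math.exp(-score(head, f)))
            grad = p - y
            w, b = head
            head = ([wi - lr * grad * fi for wi, fi in zip(w, f)], b - lr * grad)
    return head

def accuracy(head, features, labels):
    hits = sum((score(head, f) > 0) == (y == 1) for f, y in zip(features, labels))
    return hits / len(labels)

# Synthetic binary task: the label depends only on the sign of the first input.
rng = random.Random(0)
inputs = [[rng.choice([-1, 1]) * rng.uniform(0.2, 1.0), rng.uniform(-1.0, 1.0)]
          for _ in range(200)]
labels = [1 if x[0] > 0 else 0 for x in inputs]

# Hypothetical checkpoints; v2 flips the sign of every feature (exaggerated drift).
encoder_v1 = [[1.0, 0.0], [0.0, 1.0]]
encoder_v2 = [[-1.0, 0.0], [0.0, -1.0]]

feats_v1 = [encode(x, encoder_v1) for x in inputs]
feats_v2 = [encode(x, encoder_v2) for x in inputs]

old_head = train_head(feats_v1, labels)            # task layer fitted to v1 features
stale_acc = accuracy(old_head, feats_v2, labels)   # old task layer reused on v2

new_head = train_head(feats_v2, labels)            # task layer refitted on v2
fresh_acc = accuracy(new_head, feats_v2, labels)

print(f"old head on new encoder: {stale_acc:.2f}")
print(f"refitted head on new encoder: {fresh_acc:.2f}")
```

In practice the shift from continued pre-training is likely much milder than a sign flip, and fine-tuning normally updates the encoder as well as the task layer, so how much the old task layer degrades is something to measure on a held-out set rather than assume. The sketch also evaluates on its own training data, which is fine for a toy but not for a real comparison.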