DeepCTR-Torch
In the MoE method, do the experts have to be trained, or can a frozen model be used as an expert?
Describe the question
Paper: Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts (MMoE)
In the MoE method, do the experts have to be trained, or can a frozen pretrained model, such as GPT-3 or BERT, be used as an expert?
Thank you very much.