MoEfication

Results 3 MoEfication issues

After downloading the Hugging Face T5 base model, running adj.py raises an error. The error persists even after renaming the weights as follows:

In the paper: ``For the MoE layers, we set the number of experts N to 32 for MoE-Dropout and SSD. MoE-Dropout linearly increases the number of selected experts K from 6...

Hello, I am wondering whether you have any plans to release the scripts for sparsifying and training Persimmon-8B. I only see T5, BERT, and GPT in this repo.