Liger-Kernel
MoE kernel for the latest transformers v5
🐛 Describe the bug
In transformers v5, experts will no longer be just a `ModuleList` of MLP layers. It's time to write a MoE kernel for MoE layers! See the change here:
https://github.com/huggingface/transformers/pull/41580/files#diff-0855b77fc27ad9449158a1c74953f909b011c00de7125f7c8e68d0ff209c092aR218
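To make the layout change concrete, here is a minimal sketch comparing a `ModuleList` of per-expert MLPs with a stacked-weight layout. It assumes the new experts module holds one 3D tensor per projection; that assumption, the SwiGLU expert, and all names (`ExpertMLP`, `gate_w`, `up_w`, `down_w`, `moe_module_list`, `moe_stacked`) are hypothetical illustrations, not the actual transformers v5 or Liger-Kernel code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
num_experts, hidden, inter, tokens, top_k = 4, 8, 16, 5, 2


class ExpertMLP(nn.Module):
    """One expert as a standalone SwiGLU MLP (per-expert ModuleList layout)."""

    def __init__(self):
        super().__init__()
        self.gate_proj = nn.Linear(hidden, inter, bias=False)
        self.up_proj = nn.Linear(hidden, inter, bias=False)
        self.down_proj = nn.Linear(inter, hidden, bias=False)

    def forward(self, x):
        return self.down_proj(F.silu(self.gate_proj(x)) * self.up_proj(x))


experts = nn.ModuleList(ExpertMLP() for _ in range(num_experts))

# Assumed stacked layout: one 3D tensor per projection, indexed by expert.
gate_w = torch.stack([e.gate_proj.weight for e in experts])  # (E, inter, hidden)
up_w = torch.stack([e.up_proj.weight for e in experts])      # (E, inter, hidden)
down_w = torch.stack([e.down_proj.weight for e in experts])  # (E, hidden, inter)

# Toy inputs and a toy top-k router.
x = torch.randn(tokens, hidden)
router_logits = torch.randn(tokens, num_experts)
weights, idx = torch.topk(router_logits.softmax(dim=-1), top_k, dim=-1)


@torch.no_grad()
def moe_module_list(x):
    """Dispatch tokens by calling each expert module in turn."""
    out = torch.zeros_like(x)
    for e, expert in enumerate(experts):
        token_ids, slot = torch.where(idx == e)
        if token_ids.numel():
            scale = weights[token_ids, slot].unsqueeze(-1)
            out.index_add_(0, token_ids, expert(x[token_ids]) * scale)
    return out


@torch.no_grad()
def moe_stacked(x):
    """Same math, reading slices of the stacked weight tensors."""
    out = torch.zeros_like(x)
    for e in range(num_experts):
        token_ids, slot = torch.where(idx == e)
        if token_ids.numel():
            h = x[token_ids]
            act = F.silu(h @ gate_w[e].T) * (h @ up_w[e].T)
            scale = weights[token_ids, slot].unsqueeze(-1)
            out.index_add_(0, token_ids, (act @ down_w[e].T) * scale)
    return out


same = torch.allclose(moe_module_list(x), moe_stacked(x), atol=1e-5)
```

Both paths still loop over experts in Python. With stacked weights, a dedicated MoE kernel could presumably replace that loop with a single grouped matmul over all experts, which is the motivation for this issue.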
Reproduce
No response
Versions
transformers v5