FasterTransformer
T5 MoE docs need updates
There is no description of T5 MoE support in docs/t5_guide.md; updates are needed, thanks!

Another question: is the MoE support in examples/pytorch/t5/translate_example.py thoroughly tested? I found some suspicious bugs in it.
We don't have a public checkpoint to demo.
I see, thank you. But it's very important for us users to be able to follow this work. I'd appreciate it if you could provide some checkpoints and detailed docs about T5 MoE support.
Do the kernels even work? I set random weights to set up an MoE T5, but I keep getting internal errors from the CUTLASS MoE GEMM kernel. Any thoughts?
Also, Switch Transformers is pretty much the MoE version of T5. The weights are publicly available, notably on Hugging Face.
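For reference, a minimal sketch of loading one of those public MoE T5-style checkpoints instead of random weights (assuming a recent `transformers` version with Switch Transformers support and the public `google/switch-base-8` checkpoint; this is just a smoke test that the weights load and run, not an FT conversion example):

```python
# Sketch: load a public Switch Transformers (MoE T5) checkpoint from Hugging Face
# and run a single span-corruption-style generation as a sanity check.
# Assumes: transformers >= 4.25 (SwitchTransformersForConditionalGeneration) and torch.
from transformers import AutoTokenizer, SwitchTransformersForConditionalGeneration

model_name = "google/switch-base-8"  # public 8-expert checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = SwitchTransformersForConditionalGeneration.from_pretrained(model_name)

# T5-style span corruption input with a sentinel token.
inputs = tokenizer("The <extra_id_0> walks in <extra_id_1> park", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```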
Hello, has this problem been fixed? How can I use FT to run T5-MoE? Please~
+1