WasmEdge
feat: Support `--tensor-split` in the ggml plugin
Summary
When running a large MoE model, its large tensors should be split across multiple GPUs. This is especially helpful when the GPUs have different VRAM sizes.
Details
- Support `--tensor-split`.
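
To illustrate the requested option, here is a minimal sketch of how a split specification could be interpreted. It assumes the value is a comma-separated list of proportions (for example `3,1`), as llama.cpp's `--tensor-split` flag accepts. The helper name `parseTensorSplit` is hypothetical and is not part of WasmEdge or llama.cpp; how the ggml plugin would actually expose the option is not specified in this issue.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (an assumption for illustration only).
// Parses a comma-separated list of proportions such as "3,1" and
// returns the fraction of the model assigned to each GPU.
std::vector<float> parseTensorSplit(const std::string &Spec) {
  std::vector<float> Parts;
  std::stringstream Stream(Spec);
  std::string Item;
  float Total = 0.0f;

  while (std::getline(Stream, Item, ',')) {
    // std::stof throws std::invalid_argument for empty or non-numeric items.
    const float Value = std::stof(Item);
    if (Value < 0.0f) {
      throw std::invalid_argument("tensor split values must be non-negative");
    }
    Parts.push_back(Value);
    Total += Value;
  }

  if (Parts.empty() || Total <= 0.0f) {
    throw std::invalid_argument("tensor split needs a positive proportion");
  }

  // Normalize so the fractions sum to 1.
  for (float &Part : Parts) {
    Part /= Total;
  }
  return Parts;
}
```

With this sketch, a specification of `3,1` on two GPUs would place three quarters of the model on the first device and one quarter on the second, which suits a pair of GPUs with unequal VRAM.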
Appendix
No response