feat: Support `--tensor-split` in the ggml plugin

Open hydai opened this issue 1 year ago • 0 comments

Summary

When running a large MoE model, its large tensors should be split across multiple GPUs. This is especially helpful when the GPUs have different VRAM sizes.

Details

  • Support `--tensor-split`.
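
For illustration only: in llama.cpp, which the ggml plugin wraps, `--tensor-split` takes a comma-separated list of proportions, one per GPU (for example `3,1`). The sketch below shows how such a value could be parsed and normalized into per-GPU fractions. The function name `parse_tensor_split` is hypothetical and is not part of WasmEdge or llama.cpp, and how the plugin would actually expose the option is not specified in this issue.

```python
def parse_tensor_split(spec: str) -> list[float]:
    """Turn a comma-separated proportion string into per-GPU fractions.

    Hypothetical helper for illustration; assumes the llama.cpp-style
    "a,b,c" format where each entry is a non-negative weight.
    """
    parts = [p.strip() for p in spec.split(",")]
    if any(p == "" for p in parts):
        raise ValueError(f"empty entry in tensor split: {spec!r}")

    weights = [float(p) for p in parts]  # raises ValueError on non-numeric input
    if any(w < 0 for w in weights):
        raise ValueError("tensor split weights must be non-negative")

    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one tensor split weight must be positive")

    # Normalize so the fractions sum to 1.0 across all GPUs.
    return [w / total for w in weights]


if __name__ == "__main__":
    # Two GPUs with 24 GiB and 8 GiB of VRAM, weighted by memory size.
    print(parse_tensor_split("24,8"))
```

With GPUs of different VRAM sizes, one reasonable choice is to pass the memory sizes themselves as weights, so that a 24 GiB card and an 8 GiB card receive three quarters and one quarter of the tensors respectively.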

Appendix

No response

hydai, Feb 16 '24 08:02