ktransformers
AMX only? re: KTransformers+SGLang Inference Deployment
Hi, is the KTransformers+SGLang Inference Deployment supported only on AMX CPUs? Or can we try this with EPYC as well?
Yes, you can try AMD CPUs with AVX512 support right now. PR #1600 will improve the performance for AMD int8 (it isn't finished yet, but you can try it if you want). We also support AMD through the llamafile implementation; see the Q&A link for the kt-kernel README, #1608.
Do Zen4 CPUs with AVX2 have any chance?
I haven't tested that. After the PR is merged, you can give it a try to see the int8 prefill performance.
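For anyone unsure which of the instruction sets mentioned in this thread their machine reports, here is a minimal sketch that reads the CPU flags on Linux. The flag names are the ones the Linux kernel prints in `/proc/cpuinfo`; the helper names `parse_cpu_flags` and `report` are made up for this example, and nothing here reflects how kt-kernel itself picks a backend.

```python
from pathlib import Path

# Flag names as the Linux kernel reports them in /proc/cpuinfo (x86 only).
# Which of these kt-kernel actually requires is NOT stated in this thread.
FLAGS_OF_INTEREST = (
    "amx_tile", "amx_int8", "amx_bf16",  # Intel AMX
    "avx512f", "avx512_vnni",            # AVX512 foundation and VNNI
    "avx2",
)


def parse_cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line, or an empty set."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            return set(line.split(":", 1)[1].split())
    return set()


def report(cpuinfo_text):
    """Map each flag of interest to True/False depending on whether it is present."""
    flags = parse_cpu_flags(cpuinfo_text)
    return {name: name in flags for name in FLAGS_OF_INTEREST}


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    text = path.read_text() if path.exists() else ""
    for name, present in report(text).items():
        print(f"{name:12s} {'yes' if present else 'no'}")
```

Run it on the target host; a CPU that shows `avx512f` but no `amx_*` flags falls under the AVX512 case discussed above, while one showing only `avx2` is the untested case.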