mistral.rs Need parallel linears

Open EricLBuehler opened this issue 10 months ago • 4 comments

Apr 01 '24 12:04 EricLBuehler

Is the plan to implement tensor sharding for both quantized and non-quantized versions?

Apr 03 '24 00:04 hugoabonizio

Yes, that is the plan.

Apr 03 '24 11:04 EricLBuehler

I'm quite interested in CUDA parallelization of the quantized models!

Apr 03 '24 12:04 hugoabonizio

Yes! We are beginning work on this topic now.

Apr 04 '24 14:04 EricLBuehler