zhouheyun

2 issues by zhouheyun

I have a few questions about the inference efficiency of DeepSeek-V2. 1. > In order to efficiently deploy DeepSeek-V2 for service, we first convert its parameters into the precision...

Are there any plans to support BF16 inference? Our model ran into FP16 overflow after deployment.
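
For context, here is a minimal sketch of why a value can overflow in FP16 but still be representable in BF16. It uses only the Python standard library and is not tied to any particular inference framework. The helper names `fits_fp16` and `to_bf16` are hypothetical, and the BF16 conversion is emulated by truncating a float32 to its top 16 bits (real hardware typically rounds instead of truncating).

```python
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def fits_fp16(x: float) -> bool:
    """Return True if x can be packed as a finite half-precision float."""
    try:
        struct.pack("<e", x)  # "e" is the struct format code for binary16
        return True
    except OverflowError:
        return False


def to_bf16(x: float) -> float:
    """Emulate bfloat16 by keeping the top 16 bits of the float32 encoding.

    Assumption: truncation rather than round-to-nearest, for simplicity.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]


activation = 70000.0  # illustrative value just above the FP16 range
print(fits_fp16(FP16_MAX))    # within FP16 range
print(fits_fp16(activation))  # outside FP16 range
print(to_bf16(activation))    # finite in BF16, at reduced precision
```

BF16 keeps the 8-bit exponent of float32, so its range is far wider than FP16's, at the cost of fewer mantissa bits. That trade-off is why BF16 is often requested when activations exceed 65504.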