sugunav14
> @sugunav14, what's the issue with fp8 weight loading? DeepseekV3 weights are in fp8 on huggingface. Since we have the load_state_dict() patch in place now it loads the weights in...
Merged in this [MR](https://github.com/nv-auto-deploy/TensorRT-LLM/pull/10)
num_heads_q is 32 and num_heads_kv is 8
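For context, these head counts look like a grouped-query attention (GQA) setup, where several query heads share one KV head. A minimal sketch, assuming that interpretation (the variable names here are illustrative, not from the codebase):

```python
# Sketch: how 32 query heads would map onto 8 KV heads under GQA.
num_heads_q = 32
num_heads_kv = 8

# Query heads must divide evenly into KV-head groups.
assert num_heads_q % num_heads_kv == 0
group_size = num_heads_q // num_heads_kv  # 4 query heads per KV head

# Map each query head index to the KV head it shares.
q_to_kv = [q // group_size for q in range(num_heads_q)]
print(group_size)   # 4
print(q_to_kv[:8])  # [0, 0, 0, 0, 1, 1, 1, 1]
```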
Another observation: I hit this error only when I set `fuse_rope = True`.