DeepSeek-Coder-V2
Add comprehensive quantization guide for DeepSeek-Coder-V2
- Addresses Issue #79: How to quantize DeepSeek-Coder-V2 for vLLM inference
- Provides detailed quantization methods for vLLM, SGLang, llama.cpp, and AutoGPTQ
- Includes performance comparisons and memory requirements
- Adds troubleshooting section for common issues
- Updates README.md with reference to the quantization guide
This guide helps users efficiently deploy DeepSeek-Coder-V2 models with reduced memory usage while maintaining high code generation quality.
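
As a quick illustration of the vLLM path covered in the guide, here is a minimal sketch of loading a GPTQ-quantized checkpoint with vLLM's offline `LLM` API. The checkpoint name is hypothetical; substitute the quantized model you produce (an AWQ checkpoint works the same way with `quantization="awq"`).

```python
# Minimal sketch, assuming vLLM is installed and a GPTQ-quantized
# DeepSeek-Coder-V2 checkpoint is available (the model name below is hypothetical).
from vllm import LLM, SamplingParams

llm = LLM(
    model="your-org/DeepSeek-Coder-V2-Lite-Instruct-GPTQ",  # hypothetical quantized checkpoint
    quantization="gptq",        # tell vLLM the weights are GPTQ-quantized
    trust_remote_code=True,     # DeepSeek models ship custom modeling code
    tensor_parallel_size=1,     # increase for multi-GPU deployments
)

params = SamplingParams(temperature=0.0, max_tokens=256)
outputs = llm.generate(["Write a Python function that reverses a linked list."], params)
print(outputs[0].outputs[0].text)
```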