Results: 2 issues of Ming Lin
Hello! I followed the D+S packing instructions and stored the packed .pt file in "~/models/${model_name}-squeezellm/packed_weight", where model_name="Llama-2-7b-chat-hf". When I load this model in vLLM: ``` python examples/llm_engine_example.py --dtype float16 --model ~/models/${model_name}-squeezellm/packed_weight...
This is really nice work! I followed the instructions to quantize Llama-2-7b-chat-hf. At the k-means clustering step, I ran the following command: ``` python nuq.py --bit 4 --model_type llama --model...