Thank you for your great work. We have released the 4-bit GPTQ quantized LLaDA model on Hugging Face.
Using the published evaluation code, we have evaluated the quantized base model. The results are as follows:
| Dataset | GPTQ-4bit | FP16 |
| --- | --- | --- |
| MMLU | 65.20 | 65.90 |
| CMMLU | 69.23 | 69.90 |
| ARC-Challenge | 45.48 | 47.90 |
I am so sorry for the late response. Thank you for your excellent work! May I ask what the accuracy of the quantized LLaDA model is on GSM8K?
Hi,
Firstly, thanks for the great work!
I am trying to replicate these results with the quantization code.
Is the environment setup any different from the eval.sh in the original LLaDA code? Also, I'm getting an error saying `llada` isn't supported yet:
```
[rank3]: Traceback (most recent call last):
[rank3]:   File "/nvme-data2/atharvchagi/LLaDA/quantization/eval_llada.py", line 580, in <module>
[rank3]:     cli_evaluate()
[rank3]:   File "/data/atharvchagi/miniforge/envs/llm-inf/lib/python3.10/site-packages/lm_eval/__main__.py", line 389, in cli_evaluate
[rank3]:     results = evaluator.simple_evaluate(
[rank3]:   File "/data/atharvchagi/miniforge/envs/llm-inf/lib/python3.10/site-packages/lm_eval/utils.py", line 422, in _wrapper
[rank3]:     return fn(*args, **kwargs)
[rank3]:   File "/data/atharvchagi/miniforge/envs/llm-inf/lib/python3.10/site-packages/lm_eval/evaluator.py", line 209, in simple_evaluate
[rank3]:     lm = lm_eval.api.registry.get_model(model).create_from_arg_string(
[rank3]:   File "/data/atharvchagi/miniforge/envs/llm-inf/lib/python3.10/site-packages/lm_eval/api/model.py", line 151, in create_from_arg_string
[rank3]:     return cls(**args, **args2)
[rank3]:   File "/nvme-data2/atharvchagi/LLaDA/quantization/eval_llada.py", line 278, in __init__
[rank3]:     self.model = GPTQModel.load(model_path, device='cuda', trust_remote_code=True)
[rank3]:   File "/data/atharvchagi/miniforge/envs/llm-inf/lib/python3.10/site-packages/gptqmodel/models/auto.py", line 293, in load
[rank3]:     m = cls.from_quantized(
[rank3]:   File "/data/atharvchagi/miniforge/envs/llm-inf/lib/python3.10/site-packages/gptqmodel/models/auto.py", line 365, in from_quantized
[rank3]:     model_type = check_and_get_model_type(model_id_or_path, trust_remote_code)
[rank3]:   File "/data/atharvchagi/miniforge/envs/llm-inf/lib/python3.10/site-packages/gptqmodel/models/auto.py", line 239, in check_and_get_model_type
[rank3]:     raise TypeError(f"{config.model_type} isn't supported yet.")
[rank3]: TypeError: llada isn't supported yet.
```
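For context, the traceback shows that `gptqmodel` dispatches on the `model_type` field in the checkpoint's `config.json` and raises before any weights are loaded, so a stock `gptqmodel` install without LLaDA support will always fail at this point regardless of environment setup. Below is a minimal sketch of the kind of check that produces this error; the function name mirrors `check_and_get_model_type` from the traceback, but the signature and the supported-type set here are illustrative, not gptqmodel's actual tables:

```python
import json
import tempfile
from pathlib import Path

# Illustrative subset only; the real registry in gptqmodel/models/auto.py is
# much larger, but (per the traceback) does not include "llada".
SUPPORTED_MODEL_TYPES = {"llama", "qwen2", "mistral"}


def check_and_get_model_type(model_path: str) -> str:
    """Read model_type from the checkpoint's config.json and reject
    architectures that are not in the supported registry."""
    config = json.loads((Path(model_path) / "config.json").read_text())
    model_type = config["model_type"]
    if model_type not in SUPPORTED_MODEL_TYPES:
        raise TypeError(f"{model_type} isn't supported yet.")
    return model_type


# Demo: a checkpoint directory whose config.json declares model_type "llada"
# triggers the same TypeError seen in the traceback above.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "config.json").write_text(json.dumps({"model_type": "llada"}))
    try:
        check_and_get_model_type(tmp)
    except TypeError as e:
        print(e)  # llada isn't supported yet.
```

So replicating the results likely requires a gptqmodel build that registers the `llada` model type (e.g. the authors' fork or patched package), not just the eval.sh environment from the original LLaDA repo.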