mistral3
Can someone reply here when/if mistral3 support is added? (Not sure how ISTA-DASLab/Mistral-Small-3.1-24B-Instruct-2503-GPTQ-4b-128g was made.)
In the meantime, if anyone else is looking for a GPTQ Mistral Small, jeffcookio/Mistral-Small-3.1-24B-Instruct-2503-HF-gptqmodel-4b-128g worked for me.
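For anyone who wants to sanity-check that checkpoint, here is a minimal sketch. Note the runtime is an assumption on my part (the post above doesn't say what it was loaded with); vLLM is one option that can serve GPTQ checkpoints, and the prompt/sampling settings are just placeholders:

# Minimal sketch: load the already-quantized checkpoint and run one prompt.
# Assumption: vLLM as the runtime; model id taken from the post above.
from vllm import LLM, SamplingParams

llm = LLM(model="jeffcookio/Mistral-Small-3.1-24B-Instruct-2503-HF-gptqmodel-4b-128g")
sampling = SamplingParams(temperature=0.7, max_tokens=64)
outputs = llm.generate(["Explain GPTQ quantization in one sentence."], sampling)
print(outputs[0].outputs[0].text)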
@ewof What error do you get with Mistral3?
INFO ENV: Auto setting PYTORCH_CUDA_ALLOC_CONF='expandable_segments:True' for memory saving.
INFO ENV: Auto setting CUDA_DEVICE_ORDER=PCI_BUS_ID for correctness.
Traceback (most recent call last):
File "/home/ubuntu/GPTQModel/examples/quantization/basic_usage_wikitext2.py", line 93, in <module>
main()
File "/home/ubuntu/GPTQModel/examples/quantization/basic_usage_wikitext2.py", line 65, in main
model = GPTQModel.load(pretrained_model_id, quantize_config)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/GPTQModel/venv/lib/python3.12/site-packages/gptqmodel/models/auto.py", line 261, in load
return cls.from_pretrained(
^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/GPTQModel/venv/lib/python3.12/site-packages/gptqmodel/models/auto.py", line 289, in from_pretrained
model_type = check_and_get_model_type(model_id_or_path, trust_remote_code)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/GPTQModel/venv/lib/python3.12/site-packages/gptqmodel/models/auto.py", line 198, in check_and_get_model_type
raise TypeError(f"{config.model_type} isn't supported yet.")
TypeError: mistral3 isn't supported yet.
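For reference, the failing call boils down to something like this (a sketch of what basic_usage_wikitext2.py does at line 65; the base model id and quant settings are my assumptions, matching the 4-bit / group-size-128 setup):

# Minimal repro sketch of the TypeError above.
from gptqmodel import GPTQModel, QuantizeConfig

quantize_config = QuantizeConfig(bits=4, group_size=128)
# Raises TypeError("mistral3 isn't supported yet.") because "mistral3"
# is not a registered model type in GPTQModel's loader.
model = GPTQModel.load("mistralai/Mistral-Small-3.1-24B-Instruct-2503", quantize_config)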
@wemoveon2 Please check out PR/branch https://github.com/ModelCloud/GPTQModel/pull/1563 and recompile gptqmodel using
git clone https://github.com/ModelCloud/GPTQModel
cd GPTQModel
git checkout Qubitium-patch-1
pip install -e . --no-build-isolation -v
and check if mistral3 is fixed.
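A quick sanity check after installing the branch (the traceback above shows the loader dispatches on MODEL_MAP in gptqmodel/models/auto.py, so this assumes the PR registers "mistral3" as a key there):

# Confirm the patched build registers the new model type.
from gptqmodel.models.auto import MODEL_MAP

print("mistral3" in MODEL_MAP)  # expect True once the PR branch is installed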
INFO ENV: Auto setting PYTORCH_CUDA_ALLOC_CONF='expandable_segments:True' for memory saving.
INFO ENV: Auto setting CUDA_DEVICE_ORDER=PCI_BUS_ID for correctness.
Using the latest cached version of the dataset since wikitext couldn't be found on the Hugging Face Hub
WARNING:datasets.load:Using the latest cached version of the dataset since wikitext couldn't be found on the Hugging Face Hub
Found the latest cached dataset configuration 'wikitext-2-raw-v1' at /home/ubuntu/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/0.0.0/b08601e04326c79dfdd32d625aee71d232d685c3 (last modified on Sun Apr 27 03:53:56 2025).
WARNING:datasets.packaged_modules.cache.cache:Found the latest cached dataset configuration 'wikitext-2-raw-v1' at /home/ubuntu/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/0.0.0/b08601e04326c79dfdd32d625aee71d232d685c3 (last modified on Sun Apr 27 03:53:56 2025).
INFO Estimated Quantization BPW (bits per weight): 4.2875 bpw, based on [bits: 4, group_size: 128]
INFO Loader: Auto dtype (native bfloat16): `torch.bfloat16`
Traceback (most recent call last):
File "/home/ubuntu/GPTQModel/examples/quantization/basic_usage_wikitext2.py", line 93, in <module>
main()
File "/home/ubuntu/GPTQModel/examples/quantization/basic_usage_wikitext2.py", line 65, in main
model = GPTQModel.load(pretrained_model_id, quantize_config)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/GPTQModel/gptqmodel/models/auto.py", line 262, in load
return cls.from_pretrained(
^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/GPTQModel/gptqmodel/models/auto.py", line 291, in from_pretrained
return MODEL_MAP[model_type].from_pretrained(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/GPTQModel/gptqmodel/models/loader.py", line 190, in from_pretrained
model = cls.loader.from_pretrained(model_local_path, config=config, **model_init_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/GPTQModel/venv/lib/python3.12/site-packages/transformers/models/auto/auto_factory.py", line 574, in from_pretrained
raise ValueError(
ValueError: Unrecognized configuration class <class 'transformers.models.mistral3.configuration_mistral3.Mistral3Config'> for this kind of AutoModel: AutoModelForCausalLM.
Model type should be one of AriaTextConfig, BambaConfig, BartConfig, BertConfig, BertGenerationConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig, BlenderbotConfig, BlenderbotSmallConfig, BloomConfig, CamembertConfig, LlamaConfig, CodeGenConfig, CohereConfig, Cohere2Config, CpmAntConfig, CTRLConfig, Data2VecTextConfig, DbrxConfig, DeepseekV3Config, DiffLlamaConfig, ElectraConfig, Emu3Config, ErnieConfig, FalconConfig, FalconMambaConfig, FuyuConfig, GemmaConfig, Gemma2Config, Gemma3Config, Gemma3TextConfig, GitConfig, GlmConfig, Glm4Config, GotOcr2Config, GPT2Config, GPT2Config, GPTBigCodeConfig, GPTNeoConfig, GPTNeoXConfig, GPTNeoXJapaneseConfig, GPTJConfig, GraniteConfig, GraniteMoeConfig, GraniteMoeSharedConfig, HeliumConfig, JambaConfig, JetMoeConfig, LlamaConfig, Llama4Config, Llama4TextConfig, MambaConfig, Mamba2Config, MarianConfig, MBartConfig, MegaConfig, MegatronBertConfig, MistralConfig, MixtralConfig, MllamaConfig, MoshiConfig, MptConfig, MusicgenConfig, MusicgenMelodyConfig, MvpConfig, NemotronConfig, OlmoConfig, Olmo2Config, OlmoeConfig, OpenLlamaConfig, OpenAIGPTConfig, OPTConfig, PegasusConfig, PersimmonConfig, PhiConfig, Phi3Config, Phi4MultimodalConfig, PhimoeConfig, PLBartConfig, ProphetNetConfig, QDQBertConfig, Qwen2Config, Qwen2MoeConfig, Qwen3Config, Qwen3MoeConfig, RecurrentGemmaConfig, ReformerConfig, RemBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoCBertConfig, RoFormerConfig, RwkvConfig, Speech2Text2Config, StableLmConfig, Starcoder2Config, TransfoXLConfig, TrOCRConfig, WhisperConfig, XGLMConfig, XLMConfig, XLMProphetNetConfig, XLMRobertaConfig, XLMRobertaXLConfig, XLNetConfig, XmodConfig, ZambaConfig, Zamba2Config.
I'm on transformers 4.51.3.
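For what it's worth, mistral3 is a multimodal architecture in transformers, so it isn't registered under AutoModelForCausalLM at all, which is why the ValueError above lists it as unsupported. If I'm reading the 4.51-era auto-class mappings right, plain transformers loads it via AutoModelForImageTextToText (a sketch only; the base checkpoint id is assumed, and this is unrelated to GPTQModel's own loader):

# Sketch: Mistral3Config is registered to the multimodal auto class,
# not AutoModelForCausalLM. Base model id assumed.
from transformers import AutoModelForImageTextToText

model = AutoModelForImageTextToText.from_pretrained(
    "mistralai/Mistral-Small-3.1-24B-Instruct-2503", torch_dtype="auto"
)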
@ewof Is Mistral3 a visual (hybrid) model that takes both text and visual input?
Yes, and vision_config.model_type is pixtral in the model's config.json.
@ewof Ugh... hybrid models need manual quantization support, since each hybrid model has its own standard or non-standard way of defining the secondary model (multiple models inside one model config).
We will try to tackle this with manual Mistral3 support first, then create generic code so that future multi-modal models can work without too much integration work. Right now, hybrid models are a pain since no one has settled on how the modeling code (preprocessing) and the forward hand-offs should work internally. Wild wild west.
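To make the "multiple models inside one model config" point concrete, the nesting looks roughly like this when you inspect the config (base model id assumed; vision_config is the field mentioned above, and text_config is how these hybrid configs typically nest the language model, the part a weight-only quantizer actually targets):

# Sketch: a hybrid config carries one sub-config per modality.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("mistralai/Mistral-Small-3.1-24B-Instruct-2503")
print(cfg.model_type)                # "mistral3" -- the wrapper config
print(cfg.vision_config.model_type)  # "pixtral"  -- the vision tower, per the config.json noted above
print(cfg.text_config.model_type)    # the inner text model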