mistral3
Can someone reply here when/if mistral3 support is added? (Not sure how ISTA-DASLab/Mistral-Small-3.1-24B-Instruct-2503-GPTQ-4b-128g was made.)
In the meantime, if anyone else is looking for a GPTQ Mistral Small, jeffcookio/Mistral-Small-3.1-24B-Instruct-2503-HF-gptqmodel-4b-128g worked for me.
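For anyone who wants to sanity-check that checkpoint, here is a minimal sketch. Note the runtime is an assumption on my part (the post above doesn't say what it was loaded with); vLLM is one option that can serve GPTQ checkpoints, and the prompt/sampling settings are just placeholders:

# Minimal sketch: load the already-quantized checkpoint and run one prompt.
# Assumption: vLLM as the runtime; model id taken from the post above.
from vllm import LLM, SamplingParams

llm = LLM(model="jeffcookio/Mistral-Small-3.1-24B-Instruct-2503-HF-gptqmodel-4b-128g")
sampling = SamplingParams(temperature=0.7, max_tokens=64)
outputs = llm.generate(["Explain GPTQ quantization in one sentence."], sampling)
print(outputs[0].outputs[0].text)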
@ewof What error do you get with Mistral3?
INFO ENV: Auto setting PYTORCH_CUDA_ALLOC_CONF='expandable_segments:True' for memory saving.
INFO ENV: Auto setting CUDA_DEVICE_ORDER=PCI_BUS_ID for correctness.
Traceback (most recent call last):
File "/home/ubuntu/GPTQModel/examples/quantization/basic_usage_wikitext2.py", line 93, in <module>
main()
File "/home/ubuntu/GPTQModel/examples/quantization/basic_usage_wikitext2.py", line 65, in main
model = GPTQModel.load(pretrained_model_id, quantize_config)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/GPTQModel/venv/lib/python3.12/site-packages/gptqmodel/models/auto.py", line 261, in load
return cls.from_pretrained(
^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/GPTQModel/venv/lib/python3.12/site-packages/gptqmodel/models/auto.py", line 289, in from_pretrained
model_type = check_and_get_model_type(model_id_or_path, trust_remote_code)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/GPTQModel/venv/lib/python3.12/site-packages/gptqmodel/models/auto.py", line 198, in check_and_get_model_type
raise TypeError(f"{config.model_type} isn't supported yet.")
TypeError: mistral3 isn't supported yet.
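For reference, the failing call boils down to something like this (a sketch of what basic_usage_wikitext2.py does at line 65; the base model id and quant settings are my assumptions, matching the 4-bit / group-size-128 setup):

# Minimal repro sketch of the TypeError above.
from gptqmodel import GPTQModel, QuantizeConfig

quantize_config = QuantizeConfig(bits=4, group_size=128)
# Raises TypeError("mistral3 isn't supported yet.") because "mistral3"
# is not a registered model type in GPTQModel's loader.
model = GPTQModel.load("mistralai/Mistral-Small-3.1-24B-Instruct-2503", quantize_config)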
@wemoveon2 Please check out PR/branch https://github.com/ModelCloud/GPTQModel/pull/1563 and recompile gptqmodel using
git clone https://github.com/ModelCloud/GPTQModel
cd GPTQModel
git checkout Qubitium-patch-1
pip install -e . --no-build-isolation -v
and check if mistral3 is fixed.
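A quick sanity check after installing the branch (the traceback above shows the loader dispatches on MODEL_MAP in gptqmodel/models/auto.py, so this assumes the PR registers "mistral3" as a key there):

# Confirm the patched build registers the new model type.
from gptqmodel.models.auto import MODEL_MAP

print("mistral3" in MODEL_MAP)  # expect True once the PR branch is installed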
INFO ENV: Auto setting PYTORCH_CUDA_ALLOC_CONF='expandable_segments:True' for memory saving.
INFO ENV: Auto setting CUDA_DEVICE_ORDER=PCI_BUS_ID for correctness.
Using the latest cached version of the dataset since wikitext couldn't be found on the Hugging Face Hub
WARNING:datasets.load:Using the latest cached version of the dataset since wikitext couldn't be found on the Hugging Face Hub
Found the latest cached dataset configuration 'wikitext-2-raw-v1' at /home/ubuntu/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/0.0.0/b08601e04326c79dfdd32d625aee71d232d685c3 (last modified on Sun Apr 27 03:53:56 2025).
WARNING:datasets.packaged_modules.cache.cache:Found the latest cached dataset configuration 'wikitext-2-raw-v1' at /home/ubuntu/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/0.0.0/b08601e04326c79dfdd32d625aee71d232d685c3 (last modified on Sun Apr 27 03:53:56 2025).
INFO Estimated Quantization BPW (bits per weight): 4.2875 bpw, based on [bits: 4, group_size: 128]
INFO Loader: Auto dtype (native bfloat16): `torch.bfloat16`
Traceback (most recent call last):
File "/home/ubuntu/GPTQModel/examples/quantization/basic_usage_wikitext2.py", line 93, in <module>
main()
File "/home/ubuntu/GPTQModel/examples/quantization/basic_usage_wikitext2.py", line 65, in main
model = GPTQModel.load(pretrained_model_id, quantize_config)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/GPTQModel/gptqmodel/models/auto.py", line 262, in load
return cls.from_pretrained(
^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/GPTQModel/gptqmodel/models/auto.py", line 291, in from_pretrained
return MODEL_MAP[model_type].from_pretrained(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/GPTQModel/gptqmodel/models/loader.py", line 190, in from_pretrained
model = cls.loader.from_pretrained(model_local_path, config=config, **model_init_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/GPTQModel/venv/lib/python3.12/site-packages/transformers/models/auto/auto_factory.py", line 574, in from_pretrained
raise ValueError(
ValueError: Unrecognized configuration class <class 'transformers.models.mistral3.configuration_mistral3.Mistral3Config'> for this kind of AutoModel: AutoModelForCausalLM.
Model type should be one of AriaTextConfig, BambaConfig, BartConfig, BertConfig, BertGenerationConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig, BlenderbotConfig, BlenderbotSmallConfig, BloomConfig, CamembertConfig, LlamaConfig, CodeGenConfig, CohereConfig, Cohere2Config, CpmAntConfig, CTRLConfig, Data2VecTextConfig, DbrxConfig, DeepseekV3Config, DiffLlamaConfig, ElectraConfig, Emu3Config, ErnieConfig, FalconConfig, FalconMambaConfig, FuyuConfig, GemmaConfig, Gemma2Config, Gemma3Config, Gemma3TextConfig, GitConfig, GlmConfig, Glm4Config, GotOcr2Config, GPT2Config, GPT2Config, GPTBigCodeConfig, GPTNeoConfig, GPTNeoXConfig, GPTNeoXJapaneseConfig, GPTJConfig, GraniteConfig, GraniteMoeConfig, GraniteMoeSharedConfig, HeliumConfig, JambaConfig, JetMoeConfig, LlamaConfig, Llama4Config, Llama4TextConfig, MambaConfig, Mamba2Config, MarianConfig, MBartConfig, MegaConfig, MegatronBertConfig, MistralConfig, MixtralConfig, MllamaConfig, MoshiConfig, MptConfig, MusicgenConfig, MusicgenMelodyConfig, MvpConfig, NemotronConfig, OlmoConfig, Olmo2Config, OlmoeConfig, OpenLlamaConfig, OpenAIGPTConfig, OPTConfig, PegasusConfig, PersimmonConfig, PhiConfig, Phi3Config, Phi4MultimodalConfig, PhimoeConfig, PLBartConfig, ProphetNetConfig, QDQBertConfig, Qwen2Config, Qwen2MoeConfig, Qwen3Config, Qwen3MoeConfig, RecurrentGemmaConfig, ReformerConfig, RemBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoCBertConfig, RoFormerConfig, RwkvConfig, Speech2Text2Config, StableLmConfig, Starcoder2Config, TransfoXLConfig, TrOCRConfig, WhisperConfig, XGLMConfig, XLMConfig, XLMProphetNetConfig, XLMRobertaConfig, XLMRobertaXLConfig, XLNetConfig, XmodConfig, ZambaConfig, Zamba2Config.
I'm on transformers 4.51.3.
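For what it's worth, mistral3 is a multimodal architecture in transformers, so it isn't registered under AutoModelForCausalLM at all, which is why the ValueError above lists it as unsupported. If I'm reading the 4.51-era auto-class mappings right, plain transformers loads it via AutoModelForImageTextToText (a sketch only; the base checkpoint id is assumed, and this is unrelated to GPTQModel's own loader):

# Sketch: Mistral3Config is registered to the multimodal auto class,
# not AutoModelForCausalLM. Base model id assumed.
from transformers import AutoModelForImageTextToText

model = AutoModelForImageTextToText.from_pretrained(
    "mistralai/Mistral-Small-3.1-24B-Instruct-2503", torch_dtype="auto"
)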
@ewof Is Mistral3 a visual (hybrid) model that takes both text and visual input?
Yes, and vision_config.model_type is pixtral in the model's config.json.
@ewof Ugh... hybrid models need manual quantization support, since each hybrid model has its own standard or non-standard way of defining the secondary model (multiple models inside one model config).
We will try to tackle this with manual Mistral3 support first, then create generic code so that future multi-modal models can work without too much integration work. Right now, hybrid models are a pain since no one has settled on how the modeling code (preprocessing) and the forward hand-offs should work internally. Wild wild west.
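To make the "multiple models inside one model config" point concrete, the nesting looks roughly like this when you inspect the config (base model id assumed; vision_config is the field mentioned above, and text_config is how these hybrid configs typically nest the language model, the part a weight-only quantizer actually targets):

# Sketch: a hybrid config carries one sub-config per modality.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("mistralai/Mistral-Small-3.1-24B-Instruct-2503")
print(cfg.model_type)                # "mistral3" -- the wrapper config
print(cfg.vision_config.model_type)  # "pixtral"  -- the vision tower, per the config.json noted above
print(cfg.text_config.model_type)    # the inner text model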