[BUG] Opening of model from HF fails with ValueError: Missing parameters: lm_head.weight.
### Describe the bug

Loading mlx-community/Qwen3-Embedding-8B-4bit-DWQ from Hugging Face fails with `ValueError: Missing parameters: lm_head.weight.`
The error output is:

```
Traceback (most recent call last):
  File "/Users/guy/Library/Application Support/CodeRunner/Debuggers/pdb.crDebugger/pdb_server.py", line 40, in main
    p._run(pdb._ScriptTarget(p.mainpyfile))
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/pdb.py", line 1754, in _run
    self.run(target.code)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/bdb.py", line 627, in run
    exec(cmd, globals, locals)
  File "<string>", line 1, in <module>
  File "/Users/guy/Library/Application Support/CodeRunner/Unsaved/Untitled.py", line 5, in <module>
    model, tokenizer = load("/Users/guy/.lmstudio/models/mlx-community/Qwen3-Embedding-8B-4bit-DWQ")
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/guy/Library/Python/3.12/lib/python/site-packages/mlx_lm/utils.py", line 241, in load
    model, config = load_model(model_path, lazy)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/guy/Library/Python/3.12/lib/python/site-packages/mlx_lm/utils.py", line 202, in load_model
    model.load_weights(list(weights.items()), strict=strict)
  File "/Users/guy/Library/Python/3.12/lib/python/site-packages/mlx/nn/layers/base.py", line 181, in load_weights
    raise ValueError(f"Missing parameters: {missing}.")
ValueError: Missing parameters: lm_head.weight.
```
### To Reproduce

```python
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Qwen3-Embedding-8B-4bit-DWQ")

prompt = "hello"
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```
### Expected behavior

Some kind of generated output, with no error.
### Desktop (please complete the following information)

- OS Version: macOS 15.5
- mlx: 0.26.1
- mlx-lm: 0.25.2
---

@workflowsguy See https://huggingface.co/mlx-community/Qwen3-Embedding-8B-4bit-DWQ/discussions/2

mlx-lm doesn't support embedding models (at least not for qwen2 and qwen3). I have been struggling to convert nomic-embed-code (based on qwen2) to MLX myself.
---

Thank you for pointing this out. As a novice user, I was simply following the "MLX" tag, which is evidently misleading (or premature?) here.