transformers.js
404 when trying Qwen in V3
Question
This is probably just because V3 is a work in progress, but I wanted to make sure.
When trying to run Qwen 1.5 - 0.5B it works with the V2 script, but when swapping to V3 I get a 404 not found.
dtype not specified for model. Using the default dtype: q8.
GET https://huggingface.co/Xenova/Qwen1.5-0.5B-Chat/resolve/main/onnx/model_quantized.onnx 404 (Not Found)
It seems V3 is looking for a file that was renamed 3 months ago: in the repo, onnx/model_quantized.onnx was renamed to onnx/decoder_model_merged_quantized.onnx.
I've tried setting dtype to 16 and 32, which does change the URL it tries to fetch, but those URLs also do not exist :-D
e.g. https://huggingface.co/Xenova/Qwen1.5-0.5B-Chat/resolve/main/onnx/model_fp16.onnx when using dtype: 'fp16'.
Is there something I can do to make V3 find the correct files?
(I'm still trying to find that elusive small model with a large context size to do document summarization with)
#745
Hi there 👋 v3 will use the name model instead of decoder_model_merged, as the latter is the result of a legacy conversion process which created multiple versions of the model (with and without past key value inputs). So, this change isn't needed.
If you want to override the behaviour yourself, you can use the model_file_name option when loading the model.
Hello! Just a beginner here: could someone demonstrate, with example code, how to override the behaviour using the model_file_name option when loading the model?
@JohnReginaldShutler
model: the default filename prefix, which can be changed using the model_file_name option.
_quantized.onnx: the default filename suffix, which cannot be changed and depends on the precision used.
Example:
// assuming the v2 package name; adjust the import for your installed version
import { pipeline, AutoModel } from '@xenova/transformers';

// using the pipeline function
let pipe = await pipeline('text-generation', 'Xenova/Qwen1.5-0.5B-Chat', { model_file_name: 'decoder_model_merged' });

// using the AutoModel class
let model = await AutoModel.from_pretrained('Xenova/Qwen1.5-0.5B-Chat', { model_file_name: 'decoder_model_merged' });

// both will fetch decoder_model_merged_quantized.onnx
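To make the prefix/suffix split concrete, here is a minimal sketch of the file-naming scheme described above. The helper function is hypothetical (not part of the library), and the dtype-to-suffix table is an assumption inferred from the URLs seen in this thread (q8 → _quantized, fp16 → _fp16):

```javascript
// Hypothetical helper illustrating the naming scheme discussed above:
// <model_file_name><dtype suffix>.onnx under the onnx/ folder.
// The suffix table is an assumption based on the 404 URLs in this thread.
function onnxFileName(modelFileName = 'model', dtype = 'q8') {
  const suffixes = { fp32: '', fp16: '_fp16', q8: '_quantized' };
  if (!(dtype in suffixes)) throw new Error(`Unknown dtype: ${dtype}`);
  return `onnx/${modelFileName}${suffixes[dtype]}.onnx`;
}

console.log(onnxFileName());                       // onnx/model_quantized.onnx
console.log(onnxFileName('model', 'fp16'));        // onnx/model_fp16.onnx
console.log(onnxFileName('decoder_model_merged')); // onnx/decoder_model_merged_quantized.onnx
```

This also shows why setting dtype alone could not fix the 404 above: it only changes the suffix, while the repo's files differed in the prefix, which is what model_file_name overrides.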