Update examples on Hugging Face that don't use an example image

Open jrp2014 opened this issue 1 year ago • 1 comments

The Hugging Face examples now have a default prompt ("what are these?"), but no default image. (The results are quite amusing in some cases.)

I think that, where the model requires an image, generate should bail out with a message saying so, plus a bit of help on what is expected (e.g., which parameters to pass).
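A minimal sketch of the kind of check I mean. The function name `require_image` and the `model_requires_image` argument are hypothetical and not part of mlx_vlm; the `--image` flag named in the message is likewise an assumption about what the CLI expects.

```python
def require_image(images, model_requires_image=True):
    """Hypothetical pre-flight check: stop early if no image was supplied.

    `images` is whatever the CLI collected for its image argument
    (None or an empty list when the user passed nothing).
    """
    if model_requires_image and not images:
        # SystemExit with a string prints the message to stderr and exits non-zero.
        raise SystemExit(
            "error: this model requires an image, but none was given.\n"
            "Pass one with, e.g., --image <path-or-url> (flag name assumed)."
        )
    return list(images or [])
```

Called before generation starts, this would replace the current behaviour of running the prompt against an empty image list.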

As a separate issue, it should be possible (e.g., via a parameter) to run these models offline, or at least with less noise. (I know that I have verbose set to true.)
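As a possible workaround in the meantime, here is a sketch using environment variables that `huggingface_hub` and `transformers` document. Whether mlx_vlm honours all of them is an assumption I have not verified, and the helper name `quiet_offline_env` is made up for illustration.

```python
import os


def quiet_offline_env():
    """Set Hugging Face environment variables for offline, quieter runs.

    These must be set before huggingface_hub / transformers are imported,
    because the libraries read them at import time.
    """
    settings = {
        "HF_HUB_OFFLINE": "1",                # use the local cache only, no network calls
        "HF_HUB_DISABLE_PROGRESS_BARS": "1",  # hide download progress bars
        "TRANSFORMERS_VERBOSITY": "error",    # silence transformers warnings
    }
    os.environ.update(settings)
    return settings


quiet_offline_env()
# Import and run the model only after this point.
```

Offline mode only works once the model files are already in the local cache, so the first download still needs a network connection.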

python -m mlx_vlm.generate --model mlx-community/deepseek-vl2-8bit --max-tokens 100 --temp 0.0

config.json: 100%|█████████████████████████████████████████████████████████████████| 2.89k/2.89k [00:00<00:00, 15.4MB/s]
model-00006-of-00006.safetensors: 100%|████████████████████████████████████████████| 3.58G/3.58G [37:42<00:00, 1.58MB/s]
model-00003-of-00006.safetensors: 100%|████████████████████████████████████████████| 5.17G/5.17G [52:13<00:00, 1.65MB/s]
model-00002-of-00006.safetensors: 100%|████████████████████████████████████████████| 5.11G/5.11G [52:40<00:00, 1.62MB/s]
model-00005-of-00006.safetensors: 100%|████████████████████████████████████████████| 5.11G/5.11G [55:04<00:00, 1.55MB/s]
model-00004-of-00006.safetensors: 100%|████████████████████████████████████████████| 5.11G/5.11G [55:18<00:00, 1.54MB/s]
model-00001-of-00006.safetensors: 100%|████████████████████████████████████████████| 5.25G/5.25G [55:18<00:00, 1.58MB/s]
Fetching 13 files: 100%|███████████████████████████████████████████████████████████████| 13/13 [55:18<00:00, 255.30s/it]
Some kwargs in processor config are unused and will not have any effect: image_mean, add_special_token, patch_size, pad_token, sft_format, downsample_ratio, candidate_resolutions, mask_prompt, image_token, image_std, ignore_id, normalize. ]
Add pad token = ['<｜▁pad▁｜>'] to the tokenizer
<｜▁pad▁｜>:2
Add image token = ['<image>'] to the tokenizer
<image>:128815
Added grounding-related tokens
Added chat tokens
==========
Image: [] 

Prompt: <|User|>: What are these?

<|Assistant|>:
It looks like you've included an image, but I can't see it. Could you please describe it to me?
==========
Prompt: 9 tokens, 18.235 tokens-per-sec
Generation: 25 tokens, 77.318 tokens-per-sec
Peak memory: 29.499 GB

Jan 02 '25 21:01 jrp2014

Thanks!

I have fixed the model card in the utils and will manually update all model cards.

Jan 03 '25 03:01 Blaizzy