
encode_image() broken after latest commits

Open kiriri opened this issue 1 year ago • 1 comments

On Ubuntu 22.04, a GTX 2080 and an M40 both produce an all-NaN tensor when running the encode_image() function.

device, dtype = detect_device()
model_id = "vikhyatk/moondream1"
tokenizer = Tokenizer.from_pretrained(model_id)
moondream = Moondream.from_pretrained(model_id).to(device=device, dtype=dtype)
moondream.eval()
image = Image.open("/test.png")
image_embeds = moondream.encode_image(image)

A temporary fix is to pin a specific model revision:

Moondream.from_pretrained(model_id, revision="1e62d51745be03c0d3a5c582afcada6c1f98f454")

No errors were printed. Packages are installed according to requirements.txt.
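Since the failure is silent, a small sanity check on the returned embeddings can make it visible. The following is only a sketch: the helper name all_nan is hypothetical, and it assumes the embeddings have first been converted to nested Python lists (for a PyTorch tensor, via its tolist() method).

```python
import math


def all_nan(values):
    """Return True if every number in a (possibly nested) list is NaN.

    Hypothetical helper, not part of the moondream repository.
    An empty input returns False, since there is nothing to judge.
    """
    flat = []
    stack = [values]
    while stack:
        item = stack.pop()
        if isinstance(item, (list, tuple)):
            stack.extend(item)  # descend into nested rows
        else:
            flat.append(float(item))
    return bool(flat) and all(math.isnan(x) for x in flat)


# Assumed usage with the snippet above (requires the model to be loaded):
#   if all_nan(image_embeds.tolist()):
#       raise RuntimeError("encode_image() returned only NaN values")
```

Raising early like this turns a quiet numerical failure into an explicit error at the call site.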

kiriri avatar Feb 07 '24 18:02 kiriri

oh no! looking into this

vikhyat avatar Feb 07 '24 18:02 vikhyat

After no one else chimed in with the same problem, I got suspicious. And lo and behold, the issue was mostly just me being dumb. Moondream.from_pretrained(model_id) always fetches the newest model, and apparently the new model has a different encoder architecture. So it REQUIRED the most up-to-date version of the repository, with the new encode_image function, which I had not git pulled. I think in the future it would be helpful if breaking changes to the model were published under a different versioned name, so existing code doesn't break from one day to the next. (Some people may use modified forks of this repo, for example, and Hugging Face will download the new model without asking.) But at the same time, I really should have tried pulling before opening an issue :sweat_smile:
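For anyone on a fork or an older checkout, one way to avoid this kind of surprise is to always pass an explicit revision rather than relying on the default. A minimal sketch, assuming a model class that exposes a from_pretrained(model_id, revision=...) classmethod as in the snippet from the original report; load_pinned and PINNED_REVISION are hypothetical names, and the hash is the one quoted earlier in this thread:

```python
# Commit hash quoted in the original report; replace it with whichever
# revision matches the code you actually have checked out.
PINNED_REVISION = "1e62d51745be03c0d3a5c582afcada6c1f98f454"


def load_pinned(model_cls, model_id, revision=PINNED_REVISION, **kwargs):
    """Load a model at a fixed revision instead of the newest upload.

    Hypothetical wrapper: it only forwards `revision` to the class's
    own from_pretrained() and adds no behaviour of its own.
    """
    return model_cls.from_pretrained(model_id, revision=revision, **kwargs)


# Assumed usage, mirroring the original snippet:
#   moondream = load_pinned(Moondream, "vikhyatk/moondream1")
```

Keeping the hash in one constant means a deliberate upgrade is a one-line change made alongside a git pull of the matching code.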

kiriri avatar Feb 08 '24 09:02 kiriri

Apologies for this, lesson learned on my part! Will make sure to bump the version if there's ever a non-backward-compatible change.

vikhyat avatar Feb 23 '24 01:02 vikhyat