moondream It won't roast me. Remove censorship.

If I ask it to roast me (my picture), it just says "no". Even Twitter's Grok allows roasting.

Jan 28 '24 23:01 fastrocket

Sounds annoying. What was the prompt? "Say something offensive about the image" worked for me, FWIW.

Jan 29 '24 02:01 vikhyat

The prompt was "Roast me". It kept saying "No" even when I tried different face images I found on Google. Interestingly, your prompt does work. It doesn't roast at Grok's level and tends to be somewhat diplomatic, but at least it's not as censored as I thought.

By the way, I'm fairly impressed with how capable Moondream is. I tested some basic memes on MoE LLaVA at https://huggingface.co/spaces/LanguageBind/MoE-LLaVA and it could not read the text as well as Moondream. Moondream also seems to handle the "man ironing behind a taxi" image better than MoE LLaVA.

Jan 30 '24 07:01 fastrocket

If I had to guess, it has probably never seen the phrase "roast me" during training. The text model (phi-1.5) was trained mostly on synthetic data from ChatGPT, so that's where a lot of the cautious behavior comes from, and the phrase is definitely not in the training data I used to add vision capabilities. It would probably work better with a model like StableVLM, but I don't like the license on that one.

And thank you! I put a lot of work into making it actually work instead of benchmark hacking. This type of feedback is super useful so I know what types of data to add in future training runs.

Jan 30 '24 07:01 vikhyat