Dr Sujit Vasanth comments

Results: 35 comments of Dr Sujit Vasanth

Multiple image embeds in one prompt?

It took a bit of prompt engineering and image stitching, but I got moondream1 comparing rudimentary images. My prompts: - here are 2 webcam images, upper and lower; is the...

Training Code

Why not run on the cloud? I have 1x3090 but realised my investment was more about convenience than cost efficiency.

Memory/continuity?

I wonder if at present you can use OpenCV to stitch the 2 images and ask moondream to find differences between, say, the left and right images. Programmatically you can...
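
The stitching idea above can be sketched roughly as follows. This is a minimal illustration, not the code used in the original comment: the function name `stitch_pair` and its parameters are hypothetical, and it assumes both images are already loaded as H x W x 3 uint8 arrays (the layout `cv2.imread` returns). NumPy is used for the canvas so that images of different sizes can be padded rather than resized.

```python
import numpy as np


def stitch_pair(first, second, axis="horizontal", pad_value=0):
    """Place two H x W x 3 uint8 images on one canvas.

    Hypothetical helper for illustration. The smaller image is padded
    with pad_value so that neither image is rescaled.
    """
    if axis not in ("horizontal", "vertical"):
        raise ValueError("axis must be 'horizontal' or 'vertical'")

    h1, w1 = first.shape[:2]
    h2, w2 = second.shape[:2]

    if axis == "horizontal":
        # Left/right layout: widths add up, height is the larger of the two.
        canvas = np.full((max(h1, h2), w1 + w2, 3), pad_value, dtype=np.uint8)
        canvas[:h1, :w1] = first
        canvas[:h2, w1:w1 + w2] = second
    else:
        # Upper/lower layout: heights add up, width is the larger of the two.
        canvas = np.full((h1 + h2, max(w1, w2), 3), pad_value, dtype=np.uint8)
        canvas[:h1, :w1] = first
        canvas[h1:h1 + h2, :w2] = second
    return canvas
```

With OpenCV installed, the inputs could come from `cv2.imread("left.jpg")` and `cv2.imread("right.jpg")` (placeholder file names), and the stitched array could be saved with `cv2.imwrite` before being passed to the model together with a prompt that refers to the left and right halves.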

Please consider creating a quantized version or even better a CoreML model

Hi, I've got it working with transformers so you can show it items on webcam and it will automatically detect on scene change ... my prompt is "Ignore mouse, pad,...
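
The comment does not say how scene changes are detected. One simple possibility, shown here only as an assumption, is to compare consecutive webcam frames and trigger the model when the mean absolute pixel difference exceeds a threshold. The name `scene_changed` and the default threshold are hypothetical and would need tuning for a real camera.

```python
import numpy as np


def scene_changed(prev_frame, frame, threshold=12.0):
    """Return True when two same-sized uint8 frames differ noticeably.

    Hypothetical helper: uses the mean absolute per-pixel difference,
    which is one common heuristic, not necessarily the author's method.
    """
    # Cast to a signed type first so the subtraction cannot wrap around.
    diff = np.abs(prev_frame.astype(np.int16) - frame.astype(np.int16))
    return float(diff.mean()) > threshold
```

In a capture loop, the previous frame would be kept between iterations and the vision model called only when `scene_changed` returns True, which avoids running inference on every frame.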

Please consider creating a quantized version or even better a CoreML model

@Yazorp presumably that gives a speed advantage? Can you quantify it? Also, I have posted a model request on TheBloke's Discord server to quantize the model. Please upvote the...

Please consider creating a quantized version or even better a CoreML model

@vikhyat thanks for moving the custom model code to Hugging Face. As you rightly say, it is much better and lighter weight... I was able to almost switch in and out of...

Please consider creating a quantized version or even better a CoreML model

@oliverbob I think the reason is that the Hugging Face model and GitHub repos have been updated since I posted the original code, for the better I may add, as the...

Please consider creating a quantized version or even better a CoreML model

There is a Gradio example in the original repo: https://github.com/vikhyat/moondream/blob/main/gradio_demo.py It might be out of date now but can be amended.

Try openchat LLM instead of Phi1.5

@axrwl the latest openchat, quantised, takes only 4 GB: https://huggingface.co/openchat/openchat-3.5-0106 The main problem is that the only working quantised versions of vision LLMs I've seen use bitsandbytes; the transformers library does not seem to support...

Openchat, quantisation, multiimage

@LinB203 the original openchat7b v1102 is only 18 GB https://huggingface.co/openchat/openchat-3.5-1210/tree/main so why is your vision model 30+ GB? It won't fit on my RTX 3090 even for inference, so it will only be usable...