[IDEA] Add a setting to automatically resize vision input images
Is your feature request related to a problem? Please describe.
Ollama recently added support for Qwen VL, so I tried it out. When I turn on screen sharing, Ollama more or less crashes: the Ollama process is still running and ollama ps shows the model as still loaded, but my VRAM has been emptied. Calling the model with a lower-resolution image works fine, and so does using LM Studio (which resizes every input image to 500px wide) as the LLM backend for Open LLM VTuber. I use a 4K display, so I suspect the cause is that neither Open LLM VTuber nor Ollama resizes the image.
Describe the solution you'd like
Add a setting entry for vision input resolution.
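As a rough illustration of what such a setting might do, here is a minimal sketch that computes the downscaled size for a captured frame. The function name scaled_size and the max_width parameter are hypothetical and are not actual Open-LLM-VTuber options; the sketch only assumes that the setting caps the image width and preserves the aspect ratio.

```python
def scaled_size(width: int, height: int, max_width: int) -> tuple[int, int]:
    """Return (width, height) shrunk to fit max_width, keeping the aspect ratio.

    max_width stands in for a hypothetical user setting; images that are
    already narrow enough are returned unchanged (no upscaling).
    """
    if width <= 0 or height <= 0 or max_width <= 0:
        raise ValueError("dimensions must be positive")
    if width <= max_width:
        return width, height
    # Integer arithmetic with rounding to the nearest pixel.
    new_height = (height * max_width + width // 2) // width
    return max_width, max(1, new_height)


if __name__ == "__main__":
    # A 4K screen capture limited to 1280 pixels wide.
    print(scaled_size(3840, 2160, 1280))  # (1280, 720)
```

The resulting tuple could then be handed to an image library's resize call (for example Pillow's Image.resize) before the frame is sent to the LLM backend. Going from 3840x2160 to 1280x720 cuts the pixel count by a factor of 9.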
Why is this important for Open-LLM-VTuber?
Even if a high-resolution image doesn't crash the model, it still slows things down significantly. Users should be able to choose their own balance between speed and the accuracy of vision input.
Describe alternatives you've considered
The only alternative is to choose an LLM backend that does the resizing itself. LM Studio is hard-coded to 500px wide, which is far too small to make out a computer screen.
Update: You can use the camera option with OBS virtual camera instead, and set the output resolution in OBS.
Would you like to work on this issue?
No.
Additional context
N/A
Thank you for your feedback. We will fix it soon.
Thanks!
While this is being fixed, I found a workaround: do not use the built-in screen capture option, but instead use the camera option with OBS virtual camera, so you can set the output resolution in OBS. This will probably mislead the LLM into thinking it's a camera image rather than a screen capture, though.
Fixed by adding image compression settings. The fix will be released in the new version soon, so we will close this issue.