
Investigate open source models

Open theoparis opened this issue 2 years ago • 12 comments

This project looks amazing!! Unfortunately, I can't self-host it without paying for OpenAI models. Is there a chance you could look into alternative open-source models that use ggml and/or llama.cpp? So far I've found LLaVA, but I'm not sure whether it would work for this project.

theoparis avatar Nov 17 '23 22:11 theoparis

Yes! I'd definitely like to support LLaVA. I don't think it's going to be as good, but it's worth supporting. Related: https://github.com/abi/screenshot-to-code/issues/15

abi avatar Nov 18 '23 16:11 abi

If we use open-source models, it would be very "satisfying" for the whole project to be independent, with no financial roadblock (to an extent). So yes, what kinds of models would be suitable for this?

Kishlay-notabot avatar Nov 18 '23 18:11 Kishlay-notabot

LLaVA and CogVLM are worth experimenting with.

abi avatar Nov 18 '23 20:11 abi

Maybe the image processing could be done by LLaVA v1.5, and the code generation passed to DeepSeek Coder?
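The two-stage idea above could be wired up roughly like this. This is only a sketch: `describe_screenshot` and `generate_html` are hypothetical stand-ins for a LLaVA call and a DeepSeek Coder call, not real APIs from either project.

```python
# Sketch of a two-stage pipeline: a vision model describes the
# screenshot, then a code model turns that description into HTML.
# Both model calls are stubbed; swap in real LLaVA / DeepSeek Coder
# clients to try it for real.

def describe_screenshot(image_path: str) -> str:
    """Stage 1 (hypothetical LLaVA call): return a layout description."""
    return f"A page with a centered heading and two buttons ({image_path})"

def generate_html(description: str) -> str:
    """Stage 2 (hypothetical DeepSeek Coder call): description -> HTML."""
    return f"<!-- generated from: {description} -->\n<html><body></body></html>"

def screenshot_to_code(image_path: str) -> str:
    # Chain the two stages: vision output becomes the codegen prompt.
    return generate_html(describe_screenshot(image_path))

print(screenshot_to_code("screenshot.png"))
```

The appeal of splitting the task this way is that each half can use the strongest open model for that modality, at the cost of losing visual detail that the intermediate text description fails to capture.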

AltayYuzeir avatar Nov 19 '23 20:11 AltayYuzeir

I decided to merge #62 locally and tried self-hosting the backend/frontend along with LLaVA via `python3 -m llama_cpp.server --model ./2ab9be51b7dc737136b38093316a4d3577d1fb96281f1589adac7841f5b81c43 --clip_model_path ./mmproj.gguf --chat_format llava-1-5 --n_gpu_layers 35`. I specified the OpenAI base URL as `http://localhost:8000/v1` and it seems to work 🚀

The issues I encountered were:

  1. I'll have to experiment more, but the end result obviously isn't as good as OpenAI's GPT-3.5/GPT-4.
  2. I need a GPU with more VRAM to run the larger versions of LLaVA: the larger model needs 13 GB and I only have 8 GB of VRAM.
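For anyone reproducing this setup: once `llama_cpp.server` is running with a LLaVA chat format, it accepts requests in the standard OpenAI chat-completions shape, with the image inlined as a base64 data URL. A minimal sketch of building such a payload (the prompt, model name, and image bytes here are placeholders; sending it is an ordinary HTTP POST to `http://localhost:8000/v1/chat/completions`):

```python
import base64
import json

def build_vision_request(image_bytes: bytes, prompt: str, model: str = "llava") -> str:
    """Build an OpenAI-style chat-completions payload with an inline image."""
    data_url = "data:image/png;base64," + base64.b64encode(image_bytes).decode("ascii")
    payload = {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": data_url}},
                ],
            }
        ],
    }
    return json.dumps(payload)

# Example with a few fake bytes, just to show the request shape.
body = build_vision_request(b"\x89PNG", "Recreate this screenshot as HTML/CSS.")
print(body[:80])
```

Because the endpoint speaks the OpenAI wire format, the same payload works whether the base URL points at `api.openai.com` or the local server, which is what makes the `OPENAI_BASE_URL` swap in this thread possible.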

theoparis avatar Nov 27 '23 20:11 theoparis

Nice! Can you share some results? Screenshot and clone pairs.

Also, you could try running LLaVA 13B with OpenRouter: https://openrouter.ai/models/haotian-liu/llava-13b?tab=stats

abi avatar Nov 27 '23 20:11 abi

Also worth trying: https://twitter.com/Teknium1/status/1731369031918293173

abi avatar Dec 04 '23 02:12 abi

Will this work? https://github.com/vikhyat/moondream

bluusun avatar Jan 30 '24 21:01 bluusun

We tested CogVLM2 and the results are excellent.

ccly1996 avatar May 24 '24 23:05 ccly1996