InternVL
Great work, but inference is too slow
Great work with this. The OCR on Latin text is really good. Can you suggest how to run this faster? I'm already running it on an A100 80GB using transformers, but it's really slow. Could I maybe use something like vLLM, SGLang, or TGI (Hugging Face) to make inference faster?
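For context, here is roughly how I'm loading the model now (a sketch following the InternVL quickstart; the checkpoint name and the `use_flash_attn` flag are assumptions, adjust to whatever your setup actually uses):

```python
# Rough sketch of a standard transformers setup for InternVL.
# Checkpoint name and use_flash_attn are assumptions -- adjust as needed.
import torch
from transformers import AutoModel, AutoTokenizer

path = "OpenGVLab/InternVL-Chat-V1-5"  # assumed checkpoint
model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,  # bf16 halves memory vs fp32 and is fast on A100
    low_cpu_mem_usage=True,
    use_flash_attn=True,         # flash attention, if the remote code supports it
    trust_remote_code=True,
).eval().cuda()
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
```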
Did you find anything? I'm looking for the same.
Hello, we would also very much like this model to run faster and have recently been working on improvements in this area. I will keep you updated on any progress.
@czczup How can I run inference with this model? I can't find it on vLLM or TGI. Any suggestions on how to deploy it as an endpoint?
@Iven2132 I created a FastAPI + Uvicorn endpoint for now.
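A minimal sketch of such an endpoint, in case it helps. It assumes the standard InternVL remote-code loading path and its `model.chat()` API; the checkpoint name, image size, and generation settings are placeholders to adapt:

```python
# Minimal FastAPI + Uvicorn endpoint sketch for InternVL inference.
# Assumptions: OpenGVLab/InternVL-Chat-V1-5 checkpoint, 448x448 input with
# ImageNet normalization, and the chat() signature from the InternVL examples.
import io

import torch
import torchvision.transforms as T
import uvicorn
from fastapi import FastAPI, File, Form, UploadFile
from PIL import Image
from transformers import AutoModel, AutoTokenizer

MODEL_PATH = "OpenGVLab/InternVL-Chat-V1-5"  # assumed checkpoint

model = AutoModel.from_pretrained(
    MODEL_PATH,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True,
).eval().cuda()
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)

# ImageNet normalization at 448x448, as in the InternVL examples.
transform = T.Compose([
    T.Resize((448, 448)),
    T.ToTensor(),
    T.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])

app = FastAPI()

@app.post("/generate")
async def generate(image: UploadFile = File(...), question: str = Form(...)):
    pil = Image.open(io.BytesIO(await image.read())).convert("RGB")
    pixel_values = transform(pil).unsqueeze(0).to(torch.bfloat16).cuda()
    generation_config = dict(max_new_tokens=512, do_sample=False)
    # The InternVL examples prepend an <image> placeholder to the prompt.
    response = model.chat(
        tokenizer, pixel_values, "<image>\n" + question, generation_config
    )
    return {"response": response}

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)
```

Run it with `python server.py`, then POST an image file and a `question` form field to `/generate`. This serializes requests through one GPU worker; it won't batch, but it's enough to expose the model as an endpoint.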
Hope the guide in this PR helps.