API Endpoint Recommendation

Open RonanKMcGovern opened this issue 10 months ago • 3 comments

Would you have a recommendation on how to most easily set up an API endpoint that can dynamically batch requests (e.g. like vLLM)?

I realise this is probably quite involved, but perhaps you have some suggestions on quickest paths to hack a working solution.
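To make "dynamically batch requests" concrete, here is a minimal sketch of micro-batching in plain asyncio. It is not how vLLM does it (vLLM schedules at the token level); it only shows the basic idea of holding requests for a short window and running them through the model in one call. The names `MicroBatcher`, `batch_fn`, `max_batch_size`, `max_wait_s` and `fake_model` are all hypothetical and stand in for a real batched model call.

```python
import asyncio


class MicroBatcher:
    """Collect requests for a short window, then run them as one batch.

    Assumption: batch_fn takes a list of inputs and returns a list of
    outputs in the same order. It is called synchronously, so a slow
    model call blocks the event loop; a real server would offload it.
    Create the batcher inside a running event loop.
    """

    def __init__(self, batch_fn, max_batch_size=8, max_wait_s=0.01):
        self.batch_fn = batch_fn
        self.max_batch_size = max_batch_size
        self.max_wait_s = max_wait_s
        self.queue = asyncio.Queue()

    async def submit(self, item):
        # Each request gets a future that the worker resolves later.
        fut = asyncio.get_running_loop().create_future()
        await self.queue.put((item, fut))
        return await fut

    async def run(self):
        loop = asyncio.get_running_loop()
        while True:
            # Block until at least one request arrives.
            batch = [await self.queue.get()]
            deadline = loop.time() + self.max_wait_s
            # Keep collecting until the batch is full or the window closes.
            while len(batch) < self.max_batch_size:
                timeout = deadline - loop.time()
                if timeout <= 0:
                    break
                try:
                    batch.append(await asyncio.wait_for(self.queue.get(), timeout))
                except asyncio.TimeoutError:
                    break
            try:
                outputs = self.batch_fn([item for item, _ in batch])
            except Exception as exc:
                # Propagate the failure to every caller in this batch.
                for _, fut in batch:
                    if not fut.done():
                        fut.set_exception(exc)
                continue
            for (_, fut), out in zip(batch, outputs):
                if not fut.done():
                    fut.set_result(out)


async def demo():
    calls = []  # records the size of each batch the fake model saw

    def fake_model(prompts):
        # Hypothetical stand-in for a batched model forward pass.
        calls.append(len(prompts))
        return [p.upper() for p in prompts]

    batcher = MicroBatcher(fake_model, max_batch_size=4, max_wait_s=0.05)
    worker = asyncio.create_task(batcher.run())
    results = await asyncio.gather(*(batcher.submit(f"q{i}") for i in range(4)))
    worker.cancel()
    return results, calls


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

An HTTP handler would call `await batcher.submit(request_payload)` per request; the trade-off is between `max_wait_s` (added latency) and `max_batch_size` (throughput).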

RonanKMcGovern • Apr 17 '24 13:04

I'm wondering the same thing, and have started looking into writing a handler.py for deployment on Hugging Face Inference Endpoints.

Bedrovelsen • Apr 19 '24 02:04

Just created a pull request to add support to vLLM: https://github.com/vllm-project/vllm/pull/4228

vikhyat • Apr 20 '24 23:04

That’s great, thanks

RonanKMcGovern • Apr 21 '24 09:04