swift
API support for multi-modal model inference
The current code supports only single and batch inference for multi-modal models (LLaVA-1.6, CogVLM, etc.) because vLLM does not support them. Are there any plans to add API support for these models? Maybe with something like https://github.com/sgl-project/sglang?