llm-foundry
inference - is it optimized for api usage?
❓ Question
Does the inference command expose a public API endpoint or support batching of requests?
Additional context
I was wondering whether this could be deployed at production scale.
@mantrakp2004: Inference doesn't have public endpoints. The only public way to interact with these models is through the HF interface. For example, https://huggingface.co/spaces/mosaicml/mpt-30b-chat
For private, production-scale usage, please get in touch with our team: https://docs.mosaicml.com/en/latest/inference.html
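Since the question asks about batching of requests, here is a minimal, hypothetical sketch of the micro-batching a production inference server would typically do in front of a model: incoming prompts are queued and drained in groups so one batched forward pass serves many callers. All names here are illustrative (llm-foundry's inference scripts do not ship this server), and the model call is stubbed out.

```python
import queue
import threading

class MicroBatcher:
    """Hypothetical request micro-batcher for an inference server.

    Collects incoming prompts and hands them to the model in groups,
    amortizing per-forward-pass overhead across concurrent callers.
    """

    def __init__(self, max_batch_size=8, max_wait_s=0.05):
        self.max_batch_size = max_batch_size
        self.max_wait_s = max_wait_s
        self._queue = queue.Queue()

    def submit(self, prompt):
        """Enqueue a prompt; returns a ticket the caller can wait on."""
        ticket = {"prompt": prompt, "done": threading.Event(), "result": None}
        self._queue.put(ticket)
        return ticket

    def _run_model(self, prompts):
        # Stand-in for a real batched generate() call on the model.
        return [f"echo: {p}" for p in prompts]

    def flush_once(self):
        """Drain up to max_batch_size queued prompts, run one batched
        model call, and fill in each caller's ticket."""
        batch = []
        try:
            # Wait briefly for the first request, then grab whatever
            # else is already queued, up to the batch-size cap.
            batch.append(self._queue.get(timeout=self.max_wait_s))
            while len(batch) < self.max_batch_size:
                batch.append(self._queue.get_nowait())
        except queue.Empty:
            pass
        if not batch:
            return 0
        outputs = self._run_model([t["prompt"] for t in batch])
        for ticket, out in zip(batch, outputs):
            ticket["result"] = out
            ticket["done"].set()
        return len(batch)
```

In a real deployment, `flush_once` would run in a background loop and `_run_model` would be a batched `generate()` call; callers would block on `ticket["done"].wait()`.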