Call Remote LLM API Instead of Inferencing Locally
Currently, byaldi only supports local inference, e.g. RAG = RAGMultiModalModel.from_pretrained("vidore/colpali-v1.2"). Could the RAGMultiModalModel class be refactored to also call a remote vLLM API, e.g. RAGMultiModalModel.from_api("https://localhost:3000/v1/completions")?
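
A minimal sketch of what this could look like, assuming an OpenAI-compatible vLLM completions endpoint. The RemoteRAGMultiModalModel class, the from_api constructor, and the complete method are all hypothetical; nothing like this exists in byaldi today, and the request payload just follows the standard /v1/completions schema:

```python
import requests


class RemoteRAGMultiModalModel:
    """Hypothetical client mirroring RAGMultiModalModel, but delegating
    inference to a remote vLLM server instead of loading weights locally."""

    def __init__(self, endpoint: str, timeout: float = 60.0):
        # Full URL of the completions endpoint, as in the example above.
        self.endpoint = endpoint
        self.timeout = timeout

    @classmethod
    def from_api(cls, endpoint: str) -> "RemoteRAGMultiModalModel":
        # Mirrors the proposed RAGMultiModalModel.from_api(...) constructor.
        return cls(endpoint)

    def complete(self, prompt: str, model: str = "vidore/colpali-v1.2") -> str:
        # POST the prompt to the remote server rather than running the
        # model locally; assumes an OpenAI-style completions response.
        resp = requests.post(
            self.endpoint,
            json={"model": model, "prompt": prompt, "max_tokens": 256},
            timeout=self.timeout,
        )
        resp.raise_for_status()
        return resp.json()["choices"][0]["text"]


# Usage, matching the example in the request:
RAG = RemoteRAGMultiModalModel.from_api("https://localhost:3000/v1/completions")
```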