Decouple inference and indexing
We have infra to host the model, and I'm looking to host just an "indexing service" that calls the hosted model instead of running inference "locally". This decouples the stateless service (the model) from the stateful service (storing/loading indexes) and has other benefits (e.g. it's easier to scale the model to more replicas).
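To make the split concrete, here is a rough sketch of the shape I have in mind. Everything in it is hypothetical: `IndexingService`, `make_remote_embedder`, and the `{"inputs": ...}` / `{"embeddings": ...}` JSON payload are names I made up for illustration, not byaldi's API or any real endpoint contract. The point is only that the indexing side takes an "embedder" callable and never loads the model itself.

```python
import json
import urllib.request
from typing import Callable, Dict, List

# Hypothetical type: an embedder maps a batch of items to one vector per item.
Embedder = Callable[[List[str]], List[List[float]]]


def make_remote_embedder(url: str, timeout: float = 30.0) -> Embedder:
    """Return an embedder that calls a hosted model over HTTP.

    Assumption (not a real contract): the endpoint accepts
    {"inputs": [...]} and responds with {"embeddings": [[...], ...]}.
    """
    def embed(items: List[str]) -> List[List[float]]:
        body = json.dumps({"inputs": items}).encode("utf-8")
        request = urllib.request.Request(
            url, data=body, headers={"Content-Type": "application/json"}
        )
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.loads(response.read())["embeddings"]

    return embed


class IndexingService:
    """Stateful side: owns the index, delegates inference to `embedder`."""

    def __init__(self, embedder: Embedder) -> None:
        self._embedder = embedder
        self._index: Dict[str, List[float]] = {}

    def add(self, doc_ids: List[str], items: List[str]) -> None:
        # Inference happens wherever the embedder runs (e.g. a remote replica).
        vectors = self._embedder(items)
        if len(vectors) != len(doc_ids):
            raise ValueError("embedder returned a different number of vectors")
        for doc_id, vector in zip(doc_ids, vectors):
            self._index[doc_id] = vector

    def get(self, doc_id: str) -> List[float]:
        return self._index[doc_id]

    def save(self, path: str) -> None:
        # Persist only the index; no model weights are involved.
        with open(path, "w", encoding="utf-8") as f:
            json.dump(self._index, f)

    @classmethod
    def load(cls, path: str, embedder: Embedder) -> "IndexingService":
        service = cls(embedder)
        with open(path, "r", encoding="utf-8") as f:
            service._index = json.load(f)
        return service
```

With this shape, swapping local inference for the hosted model is just a matter of passing a different callable, e.g. `IndexingService(make_remote_embedder(url))` for whatever URL the model is served at, and the model replicas can scale independently of the index storage.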
I'd love to reuse byaldi for the indexing logic (otherwise I'd have to write my own from scratch), but the current implementation couples indexing pretty tightly to running the model locally.
Do you see this as something byaldi would support, or is this out of scope?
I have a bit of a cop-out answer: this is something byaldi will eventually support (I'm hoping to do it whenever I have time), but I'm not sure it will happen very soon. I'll keep this issue open for now, as we'll eventually get around to it!
Hi, let me know if #33 helps.