Decouple inference and indexing

Open dconathan opened this issue 1 year ago • 2 comments

We have infra to host the model, and I’m looking to host just an “indexing service” that calls the hosted model instead of running inference “locally”. This decouples the stateless service (the model) from the stateful service (storing/loading indexes) and has other benefits (e.g. it is easier to scale the model to more replicas).

I’d love to reuse byaldi for the indexing logic (otherwise I have to write my own from scratch), but the current implementation couples inference and indexing pretty tightly.
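
A minimal sketch of the split described above, to make the request concrete. Everything here is hypothetical: `Embedder`, `RemoteEmbedder`, `IndexingService`, the endpoint URL, and the JSON schema (`inputs` / `embeddings`) are assumptions for illustration and are not byaldi APIs. Inputs are plain strings for simplicity (for page images this could be a path or an encoded image), and scoring assumes ColBERT-style late interaction (MaxSim) over multi-vector embeddings.

```python
import json
import urllib.request
from dataclasses import dataclass, field
from typing import Protocol

# One multi-vector embedding: a list of token/patch vectors.
Vectors = list[list[float]]


class Embedder(Protocol):
    """Stateless side: anything that turns inputs into multi-vector embeddings."""

    def embed(self, items: list[str]) -> list[Vectors]: ...


@dataclass
class RemoteEmbedder:
    """Calls a hosted model over HTTP. The URL and JSON schema are hypothetical."""

    url: str
    timeout: float = 30.0

    def embed(self, items: list[str]) -> list[Vectors]:
        body = json.dumps({"inputs": items}).encode("utf-8")
        req = urllib.request.Request(
            self.url, data=body, headers={"Content-Type": "application/json"}
        )
        with urllib.request.urlopen(req, timeout=self.timeout) as resp:
            return json.load(resp)["embeddings"]


def maxsim(query: Vectors, doc: Vectors) -> float:
    """For each query vector, take its best dot product against the doc, then sum."""
    return sum(
        max(sum(q * d for q, d in zip(qv, dv)) for dv in doc) for qv in query
    )


@dataclass
class IndexingService:
    """Stateful side: owns the stored embeddings and never loads a model."""

    embedder: Embedder
    doc_ids: list[str] = field(default_factory=list)
    doc_vectors: list[Vectors] = field(default_factory=list)

    def add(self, doc_id: str, content: str) -> None:
        # Inference happens wherever the embedder lives, not in this process.
        self.doc_ids.append(doc_id)
        self.doc_vectors.append(self.embedder.embed([content])[0])

    def search(self, query: str, k: int = 3) -> list[tuple[str, float]]:
        q = self.embedder.embed([query])[0]
        scored = [
            (doc_id, maxsim(q, vecs))
            for doc_id, vecs in zip(self.doc_ids, self.doc_vectors)
        ]
        return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

    def save(self, path: str) -> None:
        with open(path, "w", encoding="utf-8") as f:
            json.dump({"ids": self.doc_ids, "vectors": self.doc_vectors}, f)

    def load(self, path: str) -> None:
        with open(path, encoding="utf-8") as f:
            data = json.load(f)
        self.doc_ids, self.doc_vectors = data["ids"], data["vectors"]
```

With this shape, the indexing service would be constructed as something like `IndexingService(embedder=RemoteEmbedder(url=...))`, and a local in-process model could satisfy the same `Embedder` interface, so the existing "local" behavior would not need to go away.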

Do you see this as something byaldi would support, or is this out of scope?

Sep 29 '24 14:09 dconathan

I have a bit of a cop-out answer: this is something byaldi will eventually support (I'm hoping to do it whenever I have time), but I'm not sure it will happen very soon. I'll keep this issue open for now, as we'll eventually get around to it!

Oct 03 '24 07:10 bclavie

Hi, let me know if #33 helps.

Oct 07 '24 00:10 jdchawla29