llama-coder Prevent checking if model exists on every autocompletion


Open dhuertas opened this issue 1 year ago • 0 comments

Hi!

Thanks for coding this little wonder of an extension. Kudos! I've been using it for a bit, and I have noticed that every autocompletion generates an extra request to the /api/tags endpoint in Ollama.

I suspect it comes from the call to ollamaCheckModel() in provideInlineCompletionItems():

https://github.com/ex3ndr/llama-coder/blob/996ac715cb722ab7253b217576c66a6311fbd32e/src/prompts/provider.ts#L89

In my view, it should not be necessary to send a request to the /api/tags endpoint every time. I am aware that the latency it introduces is orders of magnitude lower than that of the /api/generate call, but still ... it's extra work that (in my view) the extension does not need to do.

I'd suggest a different strategy 🤔 Perhaps do the check once and save the list of available models, so that later checks can be done locally. Then refresh the list whenever the configuration changes, or every now and then.
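
To make the idea concrete, here is a rough sketch of what such a cache could look like. It is not taken from the llama-coder code base: `ModelCache`, `listOllamaModels`, the 60-second refresh interval and the injected clock are all hypothetical names and values chosen for illustration. The only thing assumed about Ollama is that GET /api/tags returns a JSON body with a `models` array whose entries have a `name` field.

```typescript
// Hypothetical sketch, not the extension's actual code.
type ListModels = () => Promise<string[]>;

// Assumed shape of Ollama's /api/tags response: { models: [{ name: string }, ...] }.
async function listOllamaModels(endpoint: string): Promise<string[]> {
    const res = await fetch(endpoint + '/api/tags');
    if (!res.ok) {
        throw new Error('Unexpected status from /api/tags: ' + res.status);
    }
    const body = (await res.json()) as { models: { name: string }[] };
    return body.models.map((m) => m.name);
}

class ModelCache {
    private models: Set<string> | null = null;
    private fetchedAt = 0;
    private listModels: ListModels;
    private ttlMs: number;
    private now: () => number;

    // ttlMs and now are illustrative knobs; now is injectable to simplify testing.
    constructor(listModels: ListModels, ttlMs: number = 60_000, now: () => number = Date.now) {
        this.listModels = listModels;
        this.ttlMs = ttlMs;
        this.now = now;
    }

    // Drop the cached list, e.g. when the extension configuration changes.
    invalidate(): void {
        this.models = null;
    }

    // Answer from the cached list; refetch only if it is missing or older than ttlMs.
    async has(model: string): Promise<boolean> {
        const expired = this.now() - this.fetchedAt >= this.ttlMs;
        if (this.models === null || expired) {
            this.models = new Set(await this.listModels());
            this.fetchedAt = this.now();
        }
        return this.models.has(model);
    }
}
```

The completion provider could then call something like `cache.has(modelName)` instead of hitting /api/tags each time, and `cache.invalidate()` could be wired to VS Code's `vscode.workspace.onDidChangeConfiguration` event. One open question with this approach is what to do right after a model has been downloaded; invalidating the cache at that point would probably be enough.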

Thanks!

Mar 02 '24 13:03 dhuertas
