Prevent checking if model exists on every autocompletion

Open dhuertas opened this issue 1 year ago • 0 comments

Hi!

Thanks for coding this little wonder of an extension. Kudos! I've been using it for a bit, and I have noticed that every autocompletion generates an extra request to the /api/tags endpoint in Ollama:

image

I suspect it comes from the call to ollamaCheckModel() in provideInlineCompletionItems():

https://github.com/ex3ndr/llama-coder/blob/996ac715cb722ab7253b217576c66a6311fbd32e/src/prompts/provider.ts#L89

In my view, it should not be necessary to send a request to the /api/tags endpoint every time. I am aware the latency it introduces is orders of magnitude lower than that of the /api/generate call, but still ... it's extra work that (in my view) the extension does not need to do.

I'd suggest going for a different strategy 🤔 Perhaps do the check once and cache the list of available models, so that later checks can be done locally. Then refresh the list whenever the configuration changes, or every now and then.
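To make the idea concrete, here is a rough sketch of what I have in mind. Everything in it is hypothetical: `ModelListCache`, `ListModels`, the TTL default and the injected clock are names I made up for illustration, not anything that exists in llama-coder today. The function that actually fetches the model list (for example from /api/tags) is passed in, so the cache itself does no network I/O.

```typescript
// Hypothetical sketch: none of these names exist in llama-coder.
// `ListModels` stands for whatever function fetches the model names
// (e.g. one request to Ollama's /api/tags endpoint).
type ListModels = () => Promise<string[]>;

class ModelListCache {
    private models: Set<string> | null = null;
    private fetchedAt = 0;
    private readonly listModels: ListModels;
    private readonly ttlMs: number;
    private readonly now: () => number;

    constructor(
        listModels: ListModels,
        ttlMs: number = 60000,              // assumed refresh interval
        now: () => number = () => Date.now() // injectable clock, handy for tests
    ) {
        this.listModels = listModels;
        this.ttlMs = ttlMs;
        this.now = now;
    }

    // Returns true if `model` is in the cached list.
    // Only calls `listModels` when the cache is empty or older than `ttlMs`.
    async has(model: string): Promise<boolean> {
        let models = this.models;
        if (models === null || this.now() - this.fetchedAt >= this.ttlMs) {
            models = new Set(await this.listModels());
            this.models = models;
            this.fetchedAt = this.now();
        }
        return models.has(model);
    }

    // Drops the cached list so the next `has` call fetches it again,
    // e.g. after the extension configuration changes.
    invalidate(): void {
        this.models = null;
    }
}
```

With something like this, provideInlineCompletionItems() would call `cache.has(modelName)` instead of hitting the endpoint directly, and `cache.invalidate()` could be wired to a configuration change event such as `vscode.workspace.onDidChangeConfiguration`. Note that this sketch does not deduplicate concurrent refreshes: two completions arriving while the cache is stale would each trigger a fetch, which is probably acceptable but could be tightened by caching the in-flight promise.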

Thanks!

dhuertas avatar Mar 02 '24 13:03 dhuertas