llama-coder
Prevent checking if model exists on every autocompletion
Hi!
Thanks for coding this little wonder of an extension. Kudos! I've been using it for a bit, and I have noticed that every autocompletion generates an extra request to the /api/tags endpoint in Ollama.
I suspect it comes from the call to ollamaCheckModel() in provideInlineCompletionItems():
https://github.com/ex3ndr/llama-coder/blob/996ac715cb722ab7253b217576c66a6311fbd32e/src/prompts/provider.ts#L89
In my view it should not be necessary to send a request to the /api/tags endpoint every time. I am aware the latency it introduces is orders of magnitude lower than that of the /api/generate call, but still ... it's extra work that the extension (in my view) does not need to do.
I'd suggest a different strategy 🤔 Perhaps do the check once and save the list of available models so later checks can be done locally. Then refresh the list whenever the configuration changes, or every now and then.
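To make the idea concrete, here is a rough sketch of what such a cache could look like. It is not taken from the extension's code: `ModelListCache`, `ModelLister`, the `ttlMs` default and the injected `now` clock are all hypothetical names and values I made up for illustration. The `listModels` callback is assumed to be whatever function currently queries /api/tags and returns the model names.

```typescript
// Hypothetical helper, not part of llama-coder: remembers the model list so
// the lister (assumed to wrap the /api/tags request) only runs when the
// cache is empty, older than ttlMs, or explicitly invalidated.
type ModelLister = () => Promise<string[]>;

class ModelListCache {
    private models: Set<string> | null = null;
    private fetchedAt = 0;
    private readonly listModels: ModelLister;
    private readonly ttlMs: number;
    private readonly now: () => number;

    // ttlMs covers the "every now and then" refresh; 60 s is an arbitrary default.
    constructor(listModels: ModelLister, ttlMs: number = 60_000, now: () => number = Date.now) {
        this.listModels = listModels;
        this.ttlMs = ttlMs;
        this.now = now;
    }

    // Returns true if the model is in the cached list, refreshing it first if needed.
    async has(model: string): Promise<boolean> {
        const stale = this.now() - this.fetchedAt > this.ttlMs;
        if (this.models === null || stale) {
            this.models = new Set(await this.listModels());
            this.fetchedAt = this.now();
        }
        return this.models.has(model);
    }

    // Intended to be called when the configuration changes (endpoint, model name, ...).
    invalidate(): void {
        this.models = null;
    }
}
```

In the extension, `invalidate()` could presumably be hooked into VS Code's `workspace.onDidChangeConfiguration` event, and `has()` would replace the per-completion check. Note that this sketch does not deduplicate concurrent refreshes: two completions hitting a cold cache at the same moment would each trigger one request.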
Thanks!