Enumerate Ollama models on the server pointed to by :host directive
Presently, I have to add/remove all Ollama models by manually modifying the ":models" string. Would it be possible for gptel to enumerate the existing models on the fly when the transient menu opens, as is done for Anthropic, Gemini and OpenAI models? Thanks for considering my request, and apologies if I am missing something.
This has come up before, see #394. Fetching the list is easy enough; the question is when it should be done.
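For reference, the fetch itself is straightforward: Ollama exposes its local models at the GET /api/tags endpoint, which returns JSON of the form {"models": [{"name": "llama3:latest", ...}, ...]}. A minimal sketch (the function name is hypothetical, not part of gptel; this is a synchronous, error-unaware illustration only):

```elisp
(require 'url)
(require 'json)

(defun my/ollama-model-names (host)
  "Return a list of model name strings from the Ollama server at HOST.
HOST is e.g. \"localhost:11434\".  Hypothetical sketch, not gptel code."
  (with-temp-buffer
    ;; Fetch the model listing from Ollama's /api/tags endpoint.
    (url-insert-file-contents (format "http://%s/api/tags" host))
    ;; `json-read-from-string' returns alists for objects and vectors
    ;; for arrays by default.
    (let ((data (json-read-from-string (buffer-string))))
      (mapcar (lambda (m) (alist-get 'name m))
              ;; Convert the JSON vector of model objects into a list.
              (append (alist-get 'models data) nil)))))
```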
We don't want gptel to make network requests at the time (gptel-make-ollama ...) is evaluated, because this code lives in the user's configuration. I would be quite annoyed if evaluating my settings for a package in a use-package block caused network requests to be made.
I understand. Perhaps in the transient menu there could be an option under the "models" submenu to populate the Ollama list? The way things stand now, as I am sure you understand, we have to modify :models every time a model is added or removed. Some kind of dynamic polling for Ollama models would therefore make this consistent with how other services are enumerated (Anthropic, Gemini, OpenAI, etc.). Thank you for considering this request.
as I am sure you understand, we have to modify the :models every time a model is added or removed.
Do you add and remove models from Ollama often enough that this is a concern? I'm curious. I can't run Ollama right now, but back when I had an Ollama-capable PC, I changed models maybe twice in three months.
there could be an option under the "models" submenu to populate the Ollama list
There is no "models" submenu.
Therefore, some kind of dynamic polling for Ollama models would make this a consistent behavior with how other services are enumerated
True. I just don't know where to put this code in the usual course of using gptel.
There is one additional concern. How should conflicts between the :models field of the backend definition and the polled list of models be resolved? A simple merge will not work, here's an example of why:
Models explicitly defined for gptel:
:models
'(model1
  (model2
   :capabilities (media nosystem)
   :description "description2"
   :mime-types ("image/jpeg" "image/png"))
  model3)
Models returned from API call, after processing:
:models
'((model1
   :description "description1"
   :context-window 32)
  (model2
   :capabilities (media)
   :description "description2_alternate"
   :mime-types ("image/png" "image/heic")))
It's clear how to update model1 here. But the other differences are not simple to resolve:
- Should model3 be removed?
- model2 includes a plist with arbitrary metadata attached. How should the two plists be merged?
In the branch feature-ollama-auto-update, I've added a function to fetch and merge the available Ollama models into the list of the active backend, assuming the Ollama backend is the active one. It can be used as follows:
(gptel--ollama-fetch-models) ;; Update active backend if it's of type Ollama
(gptel--ollama-fetch-models "Ollama") ;; Update backend named "Ollama"
;; OR
(gptel--ollama-fetch-models some-backend) ;; Update backend "some-backend"
You can test it out. The merge strategy is non-destructive, and when there is a conflict it prioritizes provided metadata over metadata returned by Ollama.
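To illustrate the kind of merge strategy described above, here is a hypothetical sketch of a non-destructive plist merge that prefers user-provided metadata over the values returned by Ollama (the function name is made up; this is not the actual code in the branch):

```elisp
(require 'cl-lib)

(defun my/merge-model-plists (user-plist remote-plist)
  "Merge REMOTE-PLIST into USER-PLIST, preferring USER-PLIST on conflict.
Keys present in USER-PLIST keep their values; keys only in REMOTE-PLIST
are appended.  Neither argument is destructively modified.
Hypothetical illustration, not gptel's actual implementation."
  (let ((merged (copy-sequence user-plist)))
    (cl-loop for (key val) on remote-plist by #'cddr
             unless (plist-member merged key)
             do (setq merged (plist-put merged key val)))
    merged))

;; The user's :description wins; Ollama's :context-window is still added:
;; (my/merge-model-plists '(:description "description2")
;;                        '(:description "description2_alternate"
;;                          :context-window 32))
;; ⇒ (:description "description2" :context-window 32)
```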
The problem, as discussed, is where/when this code should run.
This seems to work fine. As to when the code should run, I am not sure I am in a position to advise. From the user's side, I think the code could be evaluated when gptel-menu is executed and the list of models is populated in the transient menu.
I have tested commit https://github.com/karthink/gptel/commit/cbe6f304e5159838686d2163ae733dbba9a7b406 and it works like a charm.
I am currently using a piece of advice around gptel-menu; the rationale is that I want Ollama to be the source of truth.
(setq gptel-model 'qwen2.5-coder:32b-instruct-q6_K
      gptel-backend (gptel-make-ollama "Ollama"
                      :host (ar-emacs-ollama-host-w-port)
                      :stream t
                      :models '()))
(advice-add 'gptel-menu :before (lambda () (gptel--ollama-fetch-models "Ollama")))
Gemini and I have prepared a function to fetch the model list from any OpenAI-compatible server (I'm using llama.cpp + llama-swap locally). Maybe it will help somebody with the same issue.
;;;###autoload
(defun vd/llama-list-remote-models (llama-api-endpoint)
  "List llama model names from a local or remote OpenAI-compatible REST API."
  (interactive "sllama API Endpoint (e.g., http://localhost:12434): ")
  (require 'json)
  (require 'url)
  (require 'cl-lib)
  (let* ((url (or llama-api-endpoint "http://localhost:12434"))
         ;; Construct the full URL for the /v1/models endpoint.
         (full-url (format "%s/v1/models" url))
         ;; Use `with-temp-buffer' for automatic cleanup, which is cleaner
         ;; than `unwind-protect' and a manual `kill-buffer'.
         (model-names
          (with-temp-buffer
            (message "Fetching llama models from %s..." full-url)
            ;; Use `url-insert-file-contents'.  A non-nil return value
            ;; indicates success.
            (if (condition-case-unless-debug err
                    (url-insert-file-contents full-url)
                  ;; Catch potential errors during the HTTP request.
                  (error (message "Error fetching URL: %s" err) nil))
                ;; --- BEGIN SUCCESSFUL FETCH ---
                (let* ((raw-response (buffer-string))
                       (json-data nil))
                  ;; Try to parse the raw response.  The API might return a
                  ;; JSON object directly, or it might wrap the JSON object
                  ;; inside a JSON string.
                  (condition-case err
                      (setq json-data (json-read-from-string raw-response))
                    (error
                     (message "Error parsing initial JSON response: %s" err)
                     (setq json-data nil)))
                  ;; If the first parse resulted in a string, the JSON was
                  ;; wrapped.  Parse the inner string to get the object.
                  (when (stringp json-data)
                    (condition-case err
                        (setq json-data (json-read-from-string json-data))
                      (error
                       (message "Error parsing unwrapped JSON string: %s" err)
                       (setq json-data nil))))
                  ;; Now `json-data' should be a parsed JSON structure.
                  ;; Process it, handling either hash-table or alist
                  ;; representations.
                  (cond
                   ;; CASE 1: Parsed as a hash-table.
                   ((hash-table-p json-data)
                    (let ((models (gethash "data" json-data)))
                      (if (vectorp models)
                          (let ((names (mapcar (lambda (model)
                                                 (when (hash-table-p model)
                                                   (gethash "id" model)))
                                               (cl-coerce models 'list))))
                            (message "Successfully fetched %d models."
                                     (length (cl-remove-if-not #'stringp names)))
                            (cl-remove-if-not #'stringp names)) ; Return names
                        (message "llama API response format unexpected ('data' key not a vector). Response: %S"
                                 json-data))))
                   ;; CASE 2: Parsed as an association list (alist).
                   ((and (listp json-data) (assoc 'data json-data))
                    (let* ((models (cdr (assoc 'data json-data)))
                           ;; The array part could be a vector or a list.
                           (model-list (if (vectorp models)
                                           (cl-coerce models 'list)
                                         models)))
                      (if (listp model-list)
                          (let ((names (mapcar (lambda (model)
                                                 (when (listp model) ; Should be an alist
                                                   (cdr (assoc 'id model))))
                                               model-list)))
                            (message "Successfully fetched %d models."
                                     (length (cl-remove-if-not #'stringp names)))
                            (cl-remove-if-not #'stringp names)) ; Return names
                        (message "llama API response format unexpected ('data' value not a list/vector). Response: %S"
                                 json-data))))
                   ;; DEFAULT: Unrecognized format.
                   (t
                    (message "llama API response was not a valid or recognized JSON object. Final parsed data: %S"
                             json-data)
                    nil))) ; Return nil
              ;; --- END SUCCESSFUL FETCH ---
              ;; --- BEGIN FAILED FETCH ---
              (message "Failed to retrieve data from llama API at %s. Check URL and server."
                       full-url)))))
    ;; The value of the `with-temp-buffer' block is the list of names
    ;; (or nil on failure).  Return it.
    model-names))
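For completeness, one hypothetical way to wire the returned names into a gptel backend (gptel-make-openai accepts model names as symbols in :models; the backend name, host and port below are assumptions matching my local llama-swap setup, not anything gptel-specific):

```elisp
;; Hypothetical usage sketch: fetch the names once, then register an
;; OpenAI-compatible backend with them.  Adjust host/port to your server.
(let ((names (vd/llama-list-remote-models "http://localhost:12434")))
  (when names
    (gptel-make-openai "llama-swap"
      :host "localhost:12434"
      :protocol "http"
      :stream t
      :models (mapcar #'intern names))))
```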