
Add ability to select models and edit system/assistant prompts

Open vmpuri opened this issue 6 months ago • 1 comment

**Goal:** Users should be able to select a model from the chat interface and receive a response from that model.

Currently, we simply send the request and take the response from whichever model the "server" process has loaded. To load a different model, the user has to stop the server and restart it with different CLI arguments.

Solution: This is a bit tricky. OpenAI can route each request to a system that already has the requested model loaded. If we tried to replicate that approach, the user's system would quickly run out of memory with even two low-parameter models loaded at once. To get around this, we'll assume the system only ever has a single model loaded.
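The single-loaded-model assumption could be sketched as below. This is illustrative only: the class and function names (`ModelManager`, the `loader` callable) are assumptions for the sketch, not torchchat's actual API, and real unloading would also need to release GPU memory.

```python
import gc


class ModelManager:
    """Holds at most one model at a time; unloads before loading another."""

    def __init__(self, loader):
        self._loader = loader   # callable: model_id -> loaded model object
        self.current_id = None
        self.model = None

    def swap(self, model_id):
        if model_id == self.current_id:
            return self.model   # already loaded: no-op
        # Drop the old model first so two models never coexist in memory.
        self.model = None
        gc.collect()
        self.model = self._loader(model_id)
        self.current_id = model_id
        return self.model
```

Because `swap` frees the old model before loading the new one, peak memory stays at roughly one model, at the cost of a multi-second gap during which the server has no model to answer with.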

Flow:

  • User will have a dropdown of model IDs (retrieved from the /models endpoint).
  • Upon selecting a different model, the user should press a confirmation button to initiate the model unload/load. Without this, someone exploring the UI could accidentally trigger a multi-second, resource-intensive operation they didn't intend.
  • Pressing this button will submit a query to the server to swap out its model.
  • First, we will validate that the model is still available on the server. It should be, since we just queried /models, and this is re-queried every time a selection is made in the dropdown. If not, the server should return an appropriate error response.
  • While no model is loaded and the query has been submitted, input elements should be disabled until a response is received.
  • The default prompts (i.e. assistant prompt, system prompt) should be enabled or disabled based on whether the model supports chat.
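The validation step in the flow above could look roughly like this sketch. The helper name and the swap payload shape are assumptions for illustration; the only part taken from the source is that /models returns the list of available model IDs (in the OpenAI-style `{"data": [{"id": ...}]}` shape) and that a stale selection should produce an error rather than a swap request.

```python
def build_swap_request(available_models, selected_id):
    """Validate the dropdown selection against a fresh /models listing
    and build the (hypothetical) swap payload, or raise if the model
    is no longer available on the server."""
    ids = {m["id"] for m in available_models}
    if selected_id not in ids:
        raise ValueError(
            f"model {selected_id!r} is no longer available on the server"
        )
    return {"model": selected_id}


# Usage: re-query /models right before submitting, then build the request.
models = [{"id": "stories15M"}, {"id": "llama3"}]
payload = build_swap_request(models, "llama3")
```

Doing this check client-side keeps the common case fast, but the server still has to re-validate, since the model list can change between the /models query and the swap request.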

**Warning:** This PR is still in draft status and does not yet include server-side changes.

vmpuri avatar Aug 22 '24 21:08 vmpuri