ragna Add a chat-identifier endpoint

Feature description

I would like to propose a new API endpoint that will only return a list of previous chat IDs. More specifically, the endpoint would return a list of mappings from chat ID to chat name for a user.

Value and/or benefit

This is useful functionality for interacting with specific historical chats. For example, if I would like to only show messages from a certain chat. Currently, it's necessary to retrieve all information about all chats to get this sort of information. This can become an issue for UI development when there are lots of users/users with a large amount of chats.

Anything else?

I have proposed an example in #338.

Mar 04 '24 01:03 nenb

I don't really understand what the issue is with the current approach. GET /chats will give you all of the chats of the user.

https://github.com/Quansight/ragna/blob/c1c159d196db99cb0372cddd4abf8ecee3ce255d/ragna/deploy/_api/core.py#L203-L206

If you know the ID upfront, you can also just get the data of this particular chat with GET /chats/{id}

https://github.com/Quansight/ragna/blob/c1c159d196db99cb0372cddd4abf8ecee3ce255d/ragna/deploy/_api/core.py#L208-L211

This can become an issue for UI development when there are lots of users/users with a large amount of chats.

How does the number of users play into this? Both endpoints listed above are user-specific.

Mar 04 '24 13:03 pmeier

I don't really understand what the issue is with the current approach. GET /chats will give you all of the chats of the user.

It gives all the messages, sources, etc. for all chats. The size of this object that is sent over the network grows over time and also requires a bunch of logic on the consumer side for extracting chat IDs and chat names.

If you know the ID upfront, you can also just get the data of this particular chat with GET /chats/{id}

That's the point of this PR - how can I know the ID (for a conversation that I created in a previous session)? I have to hit the /chats endpoint above.

How does the number of users play into this? Both endpoints listed above are user-specific.

I confused things here, apologies. There is the performance of the UI (see previous comments) and the performance of the API: If we have a central service hosting the API, then the API will need to load a user's entire chat history into memory just to get the names of previous chats. For a large number of users, this can create unnecessary load.

Although I now realise that #338 doesn't actually address this API issue, only the UI issue.
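To make the API-side concern concrete, here is a minimal sketch of a lookup that reads only chat IDs and names and never touches message data. The schema, the table and column names, and the `list_chat_identifiers` helper are assumptions made up for illustration; they are not Ragna's actual database layout or API.

```python
import sqlite3

# Hypothetical schema for illustration only; not Ragna's actual database layout.
SCHEMA = """
CREATE TABLE chats (id TEXT PRIMARY KEY, user TEXT NOT NULL, name TEXT NOT NULL);
CREATE TABLE messages (
    id INTEGER PRIMARY KEY,
    chat_id TEXT NOT NULL REFERENCES chats(id),
    content TEXT NOT NULL
);
"""


def list_chat_identifiers(conn, user):
    """Return [{"id": ..., "name": ...}, ...] for one user.

    Only the chats table is queried, so message content is never loaded.
    """
    rows = conn.execute(
        "SELECT id, name FROM chats WHERE user = ? ORDER BY name", (user,)
    ).fetchall()
    return [{"id": chat_id, "name": name} for chat_id, name in rows]


def demo():
    # Build a throwaway in-memory database with made-up data.
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO chats (id, user, name) VALUES (?, ?, ?)",
        [
            ("c1", "alice", "Budget questions"),
            ("c2", "alice", "Onboarding docs"),
            ("c3", "bob", "Misc"),
        ],
    )
    # A bulky message that the identifier lookup never has to read.
    conn.execute(
        "INSERT INTO messages (chat_id, content) VALUES (?, ?)",
        ("c1", "x" * 10_000),
    )
    return list_chat_identifiers(conn, "alice")
```

Calling `demo()` returns only the two entries belonging to the hypothetical user `alice`, regardless of how large the associated messages are.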

Ultimately, it's a QoL feature request. It's possible to get the behaviour I want with a GET to /chats, but it's not very user-friendly and has possible performance implications for anyone centrally hosting ragna.

Mar 04 '24 14:03 nenb

Ultimately, it's a QoL feature request. It's possible to get the behaviour I want with a GET to /chats, but it's not very user-friendly and has possible performance implications for anyone centrally hosting ragna.

Could you clarify the use case you have? Why do you need only this information rather than what is returned by GET /chats?

Mar 04 '24 14:03 pmeier

Could you clarify the use case you have? Why do you need only this information rather than what is returned by GET /chats?

See my initial message.

This is useful functionality for interacting with specific historical chats. For example, if I would like to only show messages from a certain chat. Currently, it's necessary to retrieve all information about all chats to get this sort of information.

Mar 04 '24 14:03 nenb

Here is how we are currently doing it in our UI:

1. When starting the UI, we hit GET /chats, store its result, and populate the left sidebar, i.e. the chat selection.

   https://github.com/Quansight/ragna/blob/be717684fd02dcf9695fe38804dc968e4e5e837c/ragna/deploy/_ui/main_page.py#L41-L60

2. When selecting a new chat, we pass the full chat object to the central view, which holds the chat interface.

   https://github.com/Quansight/ragna/blob/be717684fd02dcf9695fe38804dc968e4e5e837c/ragna/deploy/_ui/main_page.py#L29-L30

   https://github.com/Quansight/ragna/blob/be717684fd02dcf9695fe38804dc968e4e5e837c/ragna/deploy/_ui/main_page.py#L91-L96

   https://github.com/Quansight/ragna/blob/be717684fd02dcf9695fe38804dc968e4e5e837c/ragna/deploy/_ui/central_view.py#L360-L361

3. In the central view, we render the message objects from the chat object that we got passed.

   https://github.com/Quansight/ragna/blob/be717684fd02dcf9695fe38804dc968e4e5e837c/ragna/deploy/_ui/central_view.py#L402-L418

@nenb IIUC, here is what you want to do instead:

1. When starting the UI, we only pull in the chat IDs and names, which is sufficient for populating the chat selection.
2. Whenever we select a chat, e.g. on startup when at least one is available, we hit GET /chats/{id} and pass this result on to the central view.
3. Same as above: we render the message objects from the passed chat object.

Compared to our method, this has the upside that on startup we only load the data we actually require, which can be quite a bit faster if there are multiple chats with a ton of messages. However, now we need to hit GET /chats/{id} when the user switches chats, introducing a slowdown at a different place. This could be counteracted by performing the GET /chats/{id} calls in the background after the initial view is rendered and caching their results, basically building the output of GET /chats incrementally. Such a system is of course quite a bit more complex than what we have now.
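A rough sketch of that lazy-loading-plus-background-cache idea, to show where the extra complexity comes from. `LazyChatStore`, `fetch_identifiers`, and `fetch_chat` are hypothetical names: the two callables stand in for HTTP calls to an ID/name endpoint and to GET /chats/{id}, and none of this is taken from the Ragna UI code. It also assumes all methods are called from a single UI thread.

```python
from concurrent.futures import ThreadPoolExecutor


class LazyChatStore:
    """Load the chat list up front and full chats on demand, caching results.

    `fetch_identifiers` and `fetch_chat` are hypothetical callables standing in
    for requests to an ID/name endpoint and to GET /chats/{id}.
    """

    def __init__(self, fetch_identifiers, fetch_chat, max_workers=4):
        self._fetch_chat = fetch_chat
        self._executor = ThreadPoolExecutor(max_workers=max_workers)
        self._futures = {}  # chat ID -> Future holding the full chat object
        # Only IDs and names are needed to populate the chat selection.
        self.identifiers = fetch_identifiers()

    def _submit(self, chat_id):
        # Schedule each chat at most once; later calls reuse the same Future.
        if chat_id not in self._futures:
            self._futures[chat_id] = self._executor.submit(self._fetch_chat, chat_id)
        return self._futures[chat_id]

    def prefetch_all(self):
        """Start fetching every chat in the background after the first render."""
        for entry in self.identifiers:
            self._submit(entry["id"])

    def get(self, chat_id):
        """Return the full chat, blocking only if it has not arrived yet."""
        return self._submit(chat_id).result()

    def close(self):
        self._executor.shutdown(wait=True)
```

In this sketch a failed fetch stays cached as a failed `Future`, so a real implementation would also need retry and invalidation logic, which adds to the complexity mentioned above.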

Is that an accurate description of what you want to do?

Mar 05 '24 09:03 pmeier

Is that an accurate description of what you want to do?

More or less. There is a branch here. I'm not sure if it will be particularly helpful, as it has lots of changes, but the idea is that actions in the left sidebar drive the rest of the UI.

However, now we need to hit GET /chats/{id} when the user switches chats, introducing a slowdown at a different place.

My experience so far is that this is negligible compared to the time taken to actually rebuild and display the chat messages when switching between chats (which also affects the current version).

Here is a screenshot of what the branch currently looks like: [image: Screenshot 2024-03-05 at 12 50 29]

Mar 05 '24 15:03 nenb

My experience so far is that this is negligible compared to the time taken to actually rebuild and display the chat messages when switching between chats (which also affects the current version).

In that case I'm confused again. In https://github.com/Quansight/ragna/issues/339#issue-2165685516 you wrote

This can become an issue for UI development when there are lots of users/users with a large amount of chats.

I assumed the "issue" you are referencing here is performance. If that is not the problem, could you elaborate on what the issue actually is?

Mar 06 '24 12:03 pmeier

In that case I'm confused again.

GET /chats/{id} makes a request for a single chat. GET /chats makes a request for all chats. If there are a lot of chats (with a lot of bulky sources) they can be quite different.

I'm not sure if I have any more to add to this discussion (but I am still happy to open a PR if you are willing to consider the addition). I'll add an earlier message I posted here again as I think it best summarises the situation:

Ultimately, it's a QoL feature request. It's possible to get the behaviour I want with a GET to /chats, but it's not very user-friendly and has possible performance implications for anyone centrally hosting ragna.

Mar 06 '24 13:03 nenb

GET /chats/{id} makes a request for a single chat. GET /chats makes a request for all chats. If there are a lot of chats (with a lot of bulky sources) they can be quite different.

What do you mean by "quite different"? The chat content is exactly the same.

From https://github.com/Quansight/ragna/issues/339#issuecomment-1976667904

The size of this object that is sent over the network grows over time and also requires a bunch of logic on the consumer side for extracting chat IDs and chat names.

1. You ruled out I/O performance as an issue.
2. What consumer logic are you talking about here? With your sample implementation in #338, you get a JSON like [{"id": "...", "name": "..."}, ...]. With GET /chats you get [{"id": "...", "metadata": {"name": "...", ...}, ...}, ...]. You don't need any logic here. This is just plain attribute access.
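For illustration, the two payload shapes can be put side by side. The values below are made up, and the `messages` key in the full payload is an assumed placeholder for whatever else GET /chats returns; only the `id`, `name`, and `metadata` keys come from the shapes quoted above.

```python
# Illustrative payloads following the two shapes quoted above; all values are made up.
full_chats = [
    {"id": "c1", "metadata": {"name": "Budget questions"}, "messages": []},
    {"id": "c2", "metadata": {"name": "Onboarding docs"}, "messages": []},
]
identifiers = [
    {"id": "c1", "name": "Budget questions"},
    {"id": "c2", "name": "Onboarding docs"},
]


def names_from_full(chats):
    """Map chat ID to name from the GET /chats shape."""
    return {chat["id"]: chat["metadata"]["name"] for chat in chats}


def names_from_identifiers(entries):
    """Map chat ID to name from the ID/name-only shape."""
    return {entry["id"]: entry["name"] for entry in entries}
```

Both helpers are one-line lookups and produce the same mapping; the difference between the two endpoints lies in how much data is transferred, not in how the names are extracted.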

Unless I'm missing something, there is no QoL improvement here.

Mar 06 '24 13:03 pmeier