ircv3-specifications ratify CHATHISTORY

Current implementations:

Servers:

[x] Oragono
[x] UnrealIRCd

Clients:

[ ] Kiwi (modulo kiwiirc/kiwiirc#1616 and kiwiirc/kiwiirc#1397)
[x] Gamja
[x] Goguma
[x] Senpai

Bouncers (as server):

[ ] Soju (does not implement the * target; open issue)
[ ] Kiwibnc (probably with some bugs / incompatibilities)

Networks:

https://testnet.oragono.io

Where to: I'd love to see this in a desktop and a mobile client. It would also be useful to have this supported by a server implementation that currently buffers history for autoreplay on channel join: requesting the chathistory cap would suppress the auto-playback, and then CHATHISTORY queries could be answered from the buffer.
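A minimal sketch of that server-side behaviour, for illustration only. `ChannelBuffer` and `lines_on_join` are hypothetical names, not taken from any implementation, and the default buffer sizes are arbitrary.

```python
from collections import deque

CHATHISTORY_CAP = "draft/chathistory"  # draft capability name used in this thread


class ChannelBuffer:
    """Hypothetical in-memory ring buffer of recent channel messages."""

    def __init__(self, maxlen=100):
        self.messages = deque(maxlen=maxlen)

    def add(self, line):
        self.messages.append(line)

    def latest(self, limit):
        """Return up to `limit` of the most recent messages, oldest first."""
        return list(self.messages)[-limit:] if limit > 0 else []


def lines_on_join(client_caps, buffer, replay_limit=50):
    """Lines to auto-replay to a client that has just joined the channel."""
    if CHATHISTORY_CAP in client_caps:
        # The client negotiated the cap, so it is expected to send
        # CHATHISTORY queries itself; those can be answered via latest().
        return []
    return buffer.latest(replay_limit)
```

The same buffer then serves both paths: legacy clients get the replay on JOIN, while cap-aware clients get nothing until they ask.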

Feb 24 '21 18:02 slingamn

It would also be useful to have this supported by a server implementation that currently buffers history for autoreplay on channel join: requesting the chathistory cap would suppress the auto-playback, and then CHATHISTORY queries could be answered from the buffer.

FWIW, soju does exactly this. It's a bouncer though.

Feb 24 '21 18:02 emersion

@kylef notes an incompatibility with the chathistory batch spec, which seemingly forbids the use of the * target:

This parameter contains the target of messages within the batch. The target MUST either be the nick of the remote client for private messages or the name of a channel which the local client is in for public messages.

So if possible, we should amend that spec?

Feb 24 '21 19:02 slingamn

A couple of questions that remain unanswered from the initial draft PR discussion (https://github.com/ircv3/ircv3-specifications/pull/393#issuecomment-721265459)

Is there a reason TAGMSG is only included with event-playback?
Would e.g. chathistory-events be a better name than event-playback?

Feb 27 '21 19:02 jwheare

I had some skepticism about this rationale:

There's no risk of TAGMSG causing a state related desync, they are by design lossy, so can't communicate important consistent state.

How is it explicit in the design of TAGMSG that they are lossy? For example, naive implementations of +draft/react would cause a local state change.

As for the name --- what would be the rationale for changing it? To clarify that it's a sub-capability of chathistory?

Feb 28 '21 00:02 slingamn

They’re lossy because only some clients see them. And they carry no more state than a privmsg. It’s not like a quit or join that affects the channel member list.

Feb 28 '21 06:02 jwheare

The rationale for changing the events cap name is that the original one doesn’t match the base name. It’s not chat-playback. The draft stage means we can change things for the better before they become permanent.

Feb 28 '21 06:02 jwheare

I don't have very strong feelings about this either way. I might wait for other people to opine?

Mar 01 '21 18:03 slingamn

IMHO it makes sense to add the prefix, it makes it clearer that the cap belongs to chathistory. Maybe we'll have another spec in the future where "event playback" could mean something else.

Mar 01 '21 18:03 emersion

re. amending the chathistory batch type, I forgot our actual "solution" to this: for the * target, we send the user's own nickname as the batch parameter. I can still see a case for amending the other spec to allow * there.
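As a rough sketch of that workaround (function names are hypothetical; the `BATCH +<reference> <type> <parameters>` shape is the usual batch opening line):

```python
def batch_target_param(query_target, own_nick):
    """Parameter to attach to the chathistory batch for a given query target.

    For the "*" target the requesting user's own nickname is substituted,
    which stays within the current wording of the batch spec.
    """
    return own_nick if query_target == "*" else query_target


def open_batch_line(reference, query_target, own_nick):
    """Format the line that opens the chathistory batch."""
    target = batch_target_param(query_target, own_nick)
    return "BATCH +%s chathistory %s" % (reference, target)
```

For example, `open_batch_line("abc", "*", "alice")` yields `BATCH +abc chathistory alice`.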

Mar 02 '21 00:03 slingamn

From discussion, it should be OK for the draft name to remain draft/event-playback, and then for the ratified name to be chathistory-events.

Mar 09 '21 19:03 slingamn

This is additionally blocked on https://github.com/ircv3/ircv3-specifications/pull/450. I plan to take that PR out of draft in a few days, once Oragono 2.6.0 is published.

Apr 14 '21 12:04 slingamn

I have implemented CHATHISTORY in UnrealIRCd according to the current draft specification, it will be in UnrealIRCd 5.2.0. As for the CAPs: it implements draft/chathistory but not draft/event-playback. I could live with pretty much every aspect of the specification so nice work on that! :+1: I think I hit 1 or 2 small issues, maybe I should dig up my notes on that but.. they may not even be worth mentioning.

irc2.unrealircd.org is online with the current implementation (running UnrealIRCd from git) and can be used by clients for testing. Things are undertested atm as I still need to add CHATHISTORY to our test framework. There's 1 small "known issue" with CHATHISTORY TARGETS not returning a sorted list yet, but that will be fixed before release.

Jun 04 '21 08:06 syzop

I've run into an issue with unrealircd and gamja cooperation. The server has a setting to enable history on certain channels, not storing any for the rest of them. Gamja requests CHATHISTORY for any joined channel (I have not read the raw log), and if it happens to be one without history enabled, unrealircd sends FAIL INVALID_TARGET. In response, the client displays error messages (in two places) and gives up on any further requests (for different channels). I think this case should be defined in the documentation, and my suggestion is to make the IRCd act as if there's no history stored, without the hard FAIL error. INVALID_TARGET is defined by the current spec as either no permissions for the target, or a non-existing target. Neither of these is true here.

Jul 10 '21 21:07 k4bek4be

CHATHISTORY TARGETS should solve this issue once Gamja implements it, right?

Jul 10 '21 21:07 progval

gamja already uses CHATHISTORY TARGETS instead of requesting history on JOIN: https://git.sr.ht/~emersion/gamja/commit/91208a6d47d4ca5d60cad21a41018763585a2209

Jul 10 '21 22:07 emersion

Ah, but if you JOIN a channel then switch to its tab, gamja will try to request history for it to populate the scrollback. CHATHISTORY TARGETS won't help here.

Jul 10 '21 23:07 emersion

CHATHISTORY TARGETS won't offer (complete) help here. Someone can always join a channel later. It's fine if someone sends CHATHISTORY TARGETS after a (re)connect, before or after the mass re-join. But I hope nobody is suggesting to send CHATHISTORY TARGETS on each and every later JOIN that happens in a session. The TARGETS subcommand requires a lot of IRCd resources, possibly the most resources of any IRC command out there, and on the IRCd side I am likely to throttle it a bit extra because of that.

Personally, if I was a client coder, I would just try to fetch history of every channel that I join, and deal gracefully with any error conditions.

I see CHATHISTORY TARGETS mainly being useful for missed PMs and such, but implemented it anyway in UnrealIRCd.

Jul 11 '21 07:07 syzop

deal gracefully with any error conditions

Right. But I wonder what "deal gracefully" means with the existing spec. INVALID_TARGET can be a "real" error as well as a "history disabled" error. I'd rather have an explicit error code for "history disabled", so that I can properly inform the user what happened, without making it look like an error.

Jul 11 '21 07:07 emersion

INVALID_TARGET can be a "real" error

This should be covered by MESSAGE_ERROR

Jul 11 '21 08:07 progval

That's my understanding as well; MESSAGE_ERROR is a server error (like an HTTP 50x) and INVALID_TARGET is the server reporting that history is not available (covering the same cases, as, e.g., 404 Not Found, 401 Unauthorized).

INVALID_TARGET should never be straightforwardly retryable (you'd expect to get the same error again) but it's possible that it's recoverable in some other way? If so (and if there are multiple distinct ways of recovering) there might be a case for separate error codes, but right now I don't see the use case.
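To make the distinction concrete, here is a hedged client-side sketch. It assumes the standard-replies line shape `FAIL <command> <code> [<context>...] :<description>` with message tags and source prefix already stripped; the category names and the idea that MESSAGE_ERROR may be worth retrying later are assumptions drawn from the HTTP analogy, not from the spec.

```python
def parse_fail(line):
    """Split a FAIL line into (command, code, context, description)."""
    head, _, description = line.partition(" :")
    words = head.split()
    if len(words) < 3 or words[0] != "FAIL":
        raise ValueError("not a FAIL line: %r" % line)
    return words[1], words[2], words[3:], description


def classify_chathistory_failure(line):
    """Map a CHATHISTORY failure to a coarse, hypothetical category."""
    command, code, _context, _description = parse_fail(line)
    if command != "CHATHISTORY":
        return "unrelated"
    if code == "MESSAGE_ERROR":
        return "server-error"  # like HTTP 50x; assumed to be possibly transient
    if code == "INVALID_TARGET":
        return "unavailable"   # like 404/401; the same query would fail again
    return "other"
```

Either way, the result only concerns the target named in the failed query, so a client can record it per target and keep querying the others.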

Jul 12 '21 12:07 slingamn

About retrying, I see one case: the user joined a channel when history was not enabled, left, then someone enabled the history. The user should receive it after re-joining. There's also no reason to stop asking for other targets when one fails (the server should maybe use CAP DEL if the history has somehow failed completely). I don't know why gamja does it, can't find any reason browsing the source, so this may well be a bug.

Jul 12 '21 13:07 k4bek4be

Something else that came up again is: how should clients figure out that they've reached the end of the chat history (i.e., there are no more messages to fetch)?

Some servers might want to filter out messages if the user doesn't have the permission to see them.

gamja currently assumes that if the server returns fewer messages than the limit, there are no more messages. It would be possible to change this logic to something like "zero messages returned means no more history", but in pathological cases (the full page is filtered out by the server) this still breaks. msgid and timestamp based heuristics are error-prone.

References:

https://lists.sr.ht/~emersion/public-inbox/%3C20211023093721.1412757-1-progval%2Bgit%40progval.net%3E#%3CmEWJGZhApTCkyrgk3DE1L2FQ-QLERx0ACqqYjKJgUntqRUN326VNLaA5P1TqeOedjjYphtb5L91KlhCO4dSX5RSNNEA3aMfHsvorQdmy1Zg=@emersion.fr%3E
https://github.com/ergochat/ergo/issues/1676

Nov 02 '21 18:11 emersion

There was a small discussion on this in the draft PR:

Starts here https://github.com/ircv3/ircv3-specifications/pull/393#issuecomment-721265459

Is there scope for a way to indicate that you've reached the end of history either forward or backwards, or are we content for clients to just keep requesting more until they stop seeing new messages?

Nov 02 '21 21:11 jwheare

There are two slightly different issues here:

(a) If the client receives fewer messages than the requested limit, can the client assume that there are no more messages available?

(b) Assuming the answer to (a) is "yes", the case where the paging window is full is still ambiguous: the paging window could be full even though there are no more messages available. Should we amend the spec to add an explicit indication of whether there are more messages available?

I'm leaning towards yes on (a) and no on (b). If other people feel similarly, I'll amend the spec to make (a) explicit.
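A small sketch of what rule (a) means for a client paging backwards, and of the leftover ambiguity in (b). `fetch_page` and `make_server` are hypothetical stand-ins for a CHATHISTORY BEFORE round trip, with messages simplified to `(timestamp, text)` pairs.

```python
def make_server(history):
    """Fake server: history is a list of (timestamp, text), oldest first."""
    def fetch_page(before, limit):
        older = [m for m in history if before is None or m[0] < before]
        return older[-limit:]  # up to `limit` newest messages older than `before`
    return fetch_page


def fetch_all_before(fetch_page, limit):
    """Page backwards until a page comes back shorter than `limit` (rule a).

    Returns (messages oldest first, number of requests sent).
    """
    pages = []
    before = None
    requests = 0
    while True:
        page = fetch_page(before, limit)
        requests += 1
        pages.append(page)
        if len(page) < limit:
            break  # short page: assume there is nothing older
        before = page[0][0]  # continue from the oldest message seen so far
    messages = [m for page in reversed(pages) for m in page]
    return messages, requests
```

With four stored messages and a limit of 2, both pages come back full, so the client needs a third request that returns nothing before it can conclude it has reached the end. That extra round trip is the cost of answering "no" on (b).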

Nov 02 '21 22:11 slingamn

Sorry, I think I neglected two open issues on this thread. The first is the resource cost of CHATHISTORY TARGETS:

The TARGETS subcommand requires a lot of IRCd resources, possibly the most resources of any IRC command out there, and on the IRCd side I am likely to throttle it a bit extra because of that.

It seems to me that CHATHISTORY TARGETS should have a cost comparable to LIST. What makes it so expensive?

The other is the incompatibility between gamja and unrealircd. Is this still an issue? What can be done to resolve it?

Nov 03 '21 06:11 slingamn

To clarify, the intent of TARGETS was that it would typically list only the channels that the user is currently joined to, which should make it significantly cheaper than LIST.

Nov 03 '21 07:11 slingamn

Sorry, I think I neglected two open issues on this thread. The first is the resource cost of CHATHISTORY TARGETS:

The TARGETS subcommand requires a lot of IRCd resources, possibly the most resources of any IRC command out there, and on the IRCd side I am likely to throttle it a bit extra because of that.

It seems to me that CHATHISTORY TARGETS should have a cost comparable to LIST. What makes it so expensive?

The thing is that, for each channel that the user is in, it needs to do quite a bit of processing (per channel!) due to the two arguments that TARGETS supports (start and end timestamp) and the need to return something based on that. Due to the timestamps I basically have to travel line by line through the chat history of a channel (can stop on the first match though). And the thing is with TARGETS we need to do that processing * xx channels. In UnrealIRCd we only have to deal with channels and not with PMs, so the number of targets is capped at the maximum number of channels per user, which is usually 10. I just benchmarked it with a few channels of 500+ lines of history each and calculated what it would be if we had 10 such channels. It takes about 2.5 ms then, so we can handle 400 of those commands per second per server. That's something we can handle, although I will probably impose a bit more penalty/ratelimit than we have now. But yeah, if you have a server implementation that would (also) store dozens of PMs, and thus need to traverse something like 100 targets, and/or if they would not be in fast memory like in UnrealIRCd but in say.. SQL or something.. then yeah... I think you can see the problem :D
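The arithmetic behind those figures, as a back-of-the-envelope sketch. The 100-target case assumes the cost scales linearly with the number of targets, which is an extrapolation and not something that was benchmarked above.

```python
def commands_per_second(ms_per_command):
    """Commands one server can process per second at a given per-command cost."""
    return 1000.0 / ms_per_command


# Benchmarked case from the comment: 10 channels cost about 2.5 ms per command.
ms_for_10_targets = 2.5
rate_10 = commands_per_second(ms_for_10_targets)

# Hypothetical case: 100 targets, assuming cost grows linearly with targets.
ms_per_target = ms_for_10_targets / 10
rate_100 = commands_per_second(ms_per_target * 100)
```

Under that linear assumption, 100 targets would bring the rate down from 400 to 40 commands per second, before accounting for slower storage.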

Nov 10 '21 08:11 syzop

The other is the incompatibility between gamja and unrealircd. Is this still an issue? What can be done to resolve it?

I kinda lost track of this one. What do you want me to send when history is not enabled for a channel (or any target)? Let me know and if this is different than what we send now I will be happy to update UnrealIRCd.

Nov 10 '21 08:11 syzop

The other is the incompatibility between gamja and unrealircd. Is this still an issue? What can be done to resolve it?

gamja used to completely stop fetching history when a single CHATHISTORY query failed. This was a gamja bug but it's now fixed.

gamja will still print an error message to the user if the server sends FAIL INVALID_TARGET.

Nov 10 '21 10:11 emersion

The thing is that, for each channel that the user is in, it needs to do quite a bit of processing (per channel!) due to the two arguments that TARGETS supports (start and end timestamp) and the need to return something based on that. Due to the timestamps I basically have to travel line by line through the chat history of a channel (can stop on the first match though). And the thing is with TARGETS we need to do that processing * xx channels.

The intent of the specification was that the selectors would only match the time of the latest message sent in the channel:

ordered by the time of the latest message in the channel history or direct message conversation

This probably needs to be clarified explicitly in the spec, but this is the intended meaning and the behavior implemented in Ergo --- the selection parameters in TARGETS are merely pagination parameters with respect to the ordering of the targets by the time of the latest message sent.
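A sketch of that reading, assuming the server keeps a per-target "latest message time" index (the `latest_by_target` dict here is hypothetical, as are the numeric timestamps, the exclusive bounds, and the ascending order):

```python
def chathistory_targets(latest_by_target, start, end, limit):
    """Select targets whose latest message time falls between two timestamps.

    latest_by_target maps a channel name or nick to the timestamp of the
    most recent message stored for it. Only this index is consulted, so the
    cost depends on the number of targets, not on the length of each history.
    """
    lo, hi = min(start, end), max(start, end)
    matches = [(ts, target) for target, ts in latest_by_target.items()
               if lo < ts < hi]
    matches.sort()  # order targets by the time of their latest message
    return [(target, ts) for ts, target in matches[:limit]]
```

Under this interpretation no line-by-line traversal of any channel's history is needed: the two timestamps and the limit only page through the sorted list of targets.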

Nov 10 '21 20:11 slingamn
