Specify REQ replacement behavior
I noticed that many relays don't stop processing if a new REQ is received while the relay is still querying stored events. The relay ends up pushing many "EOSE"s back. This creates an issue where the client can't tell which filter set each EOSE belongs to.
Something like this:
["REQ", "SUB1", { "authors": ["11"], "limit": 500 } ]. // 1
["REQ", "SUB1", { "authors": ["22"], "limit": 2 } ] // 2
["EOSE", "SUB1"] // this could be 1 or 2, since 2 is faster.
["REQ", "SUB1", { "authors": ["33"], "limit": 500 } ] // 3
["EOSE", "SUB1"] // here we don't know which it is, if 1, 2, or 3.
["EOSE", "SUB1"] // now we know all of them have finished
["REQ", "SUB1", { "authors": ["44"], "limit": 500 } ]
["EOSE", "SUB1"] // normal
If this is the desired behavior, we should specify it in NIP-01. In that case, I will change my lib to wait for an EOSE/CLOSED before sending a new REQ with the same subscription ID.
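Roughly, a minimal sketch of that client-side wait in TypeScript (assuming a raw WebSocket; `awaitingEose` and `pendingReq` are made-up names, not from any particular lib):

```typescript
type Filter = Record<string, unknown>;

const awaitingEose = new Set<string>();          // sub IDs with a REQ still in flight
const pendingReq = new Map<string, Filter[]>();  // latest deferred filters per sub ID

function sendReq(ws: WebSocket, subId: string, filters: Filter[]) {
  if (awaitingEose.has(subId)) {
    // A REQ for this ID is still running: remember only the newest filters.
    pendingReq.set(subId, filters);
    return;
  }
  awaitingEose.add(subId);
  ws.send(JSON.stringify(["REQ", subId, ...filters]));
}

function onMessage(ws: WebSocket, raw: string) {
  const msg = JSON.parse(raw);
  if (msg[0] === "EOSE" || msg[0] === "CLOSED") {
    const subId = msg[1];
    awaitingEose.delete(subId);
    const next = pendingReq.get(subId);
    if (next) {
      // The previous query finished; now it is safe to reuse the ID.
      pendingReq.delete(subId);
      sendReq(ws, subId, next);
    }
  }
}
```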
The alternative would be to specify that if a new REQ arrives, the relay MUST kill the current query and not send an EOSE/CLOSED for the older version.
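A rough sketch of what that relay-side behavior could look like, assuming the storage query accepts an AbortSignal (all names here are hypothetical):

```typescript
// Per-connection map of active subscriptions.
const running = new Map<string, AbortController>();

async function handleReq(subId: string, filters: object[], send: (msg: unknown[]) => void) {
  // Replacing a subscription aborts the old query; crucially, the old
  // query must not emit its own EOSE after being superseded.
  running.get(subId)?.abort();
  const controller = new AbortController();
  running.set(subId, controller);

  try {
    for await (const event of queryStoredEvents(filters, controller.signal)) {
      if (controller.signal.aborted) return; // superseded: stay silent
      send(["EVENT", subId, event]);
    }
    if (!controller.signal.aborted) send(["EOSE", subId]); // exactly one EOSE
  } catch (e) {
    if (!controller.signal.aborted) throw e; // swallow errors from aborted queries
  }
}

// Placeholder for the relay's actual storage query.
declare function queryStoredEvents(
  filters: object[],
  signal: AbortSignal
): AsyncIterable<object>;
```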
To me, we need to pick a side. Leaving it undefined is a problem.
@staab @hzrd149 @tyiu how are you protecting your libs against this issue?
Simple: I don't re-use subscription IDs. Aren't subscriptions supposed to replace previous ones? So doesn't sending a new REQ imply a CLOSE for that subid?
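In practice that just means minting a fresh ID per query and closing the old one explicitly; a tiny sketch of what I mean (names are illustrative):

```typescript
let counter = 0;
let currentSubId: string | null = null;

function replaceQuery(ws: WebSocket, filters: object[]) {
  if (currentSubId) ws.send(JSON.stringify(["CLOSE", currentSubId]));
  currentSubId = `sub-${counter++}`; // unique per query, so every EOSE is unambiguous
  ws.send(JSON.stringify(["REQ", currentSubId, ...filters]));
}
```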
My interpretation has always been that reusing an ID is equivalent to cancelling that subscription and starting a new one with that ID. The relay should not send an EOSE for the first subscription after it's been cancelled, whether explicitly or through a new REQ with the same ID; a relay that does should be regarded as buggy.
Almost all relays send multiple EOSEs. I don't know why killing the current query is so hard, but maybe it's because this is not spelled out in our NIPs and devs are not even thinking about the situation.
Maybe almost all relays use the same implementation?
I agree we should specify something. I'm fine either way.
But why are you firing a ton of subscriptions and then discarding their results? This is bad: each of them implies a database query that may be relatively expensive, plus a lot of bandwidth usage for both sides. And very often it's not possible to cancel a query because the operation is synchronous. Wouldn't it be kinder to debounce these queries and send an aggregated version only once?
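Something like this on the client side, as a sketch (the 200ms window and all names are arbitrary):

```typescript
const DEBOUNCE_MS = 200;
let buffered: object[] = [];
let timer: ReturnType<typeof setTimeout> | null = null;

function queueFilter(ws: WebSocket, subId: string, filter: object) {
  buffered.push(filter);
  if (timer) clearTimeout(timer);
  timer = setTimeout(() => {
    // One REQ carrying all accumulated filters instead of many REQs.
    ws.send(JSON.stringify(["REQ", subId, ...buffered]));
    buffered = [];
    timer = null;
  }, DEBOUNCE_MS);
}
```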
Or maybe relays should instead wait a couple of seconds before actually starting the query, just to be sure it won't be canceled immediately.
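As a sketch, assuming a per-subscription generation counter on the relay (all names hypothetical):

```typescript
const GRACE_MS = 2000; // "a couple of seconds", per the suggestion above
const generation = new Map<string, number>(); // subId -> latest REQ generation

function handleReqDelayed(subId: string, filters: object[], start: () => void) {
  const gen = (generation.get(subId) ?? 0) + 1;
  generation.set(subId, gen);
  setTimeout(() => {
    // Only run the query if no newer REQ arrived for this ID meanwhile.
    if (generation.get(subId) === gen) start();
  }, GRACE_MS);
}
```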
Sure, I debounce. But it still happens. Many relays take multiple seconds to respond, and we don't know whether the relay is still processing or not. Switching queries to the most up-to-date REQ (as the user scrolls down, for instance) is just natural.
But the why is irrelevant. We need to specify the ideal flow.
The same thing happens when you send a CLOSE while the REQ is still processing past events.
> But why are you firing a ton of subscriptions and then discarding their results?
Maybe because I got the result from another relay.
Suppose you are fetching a thread (a branch from the root to one specific post). The only way to do so is to fetch one post at a time, since each post contains a reference to its parent. You may be querying multiple relays, any of which may contain the events.
As soon as you receive the event from one relay, you don't need it from any other relay, but now you need its parent. So you have to send a new query, possibly to the same relays, and cancel the previous one. You cannot request the whole thread at once because you don't know the IDs of the whole thread.
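As a sketch of that loop, assuming a hypothetical `fetchEventFromRelays` helper that races several relays and can be aborted (this simplifies NIP-10 parent resolution to the first "e" tag):

```typescript
declare function fetchEventFromRelays(
  relays: string[],
  id: string,
  signal: AbortSignal
): Promise<{ id: string; tags: string[][] }>;

async function fetchThread(relays: string[], leafId: string) {
  const thread: { id: string; tags: string[][] }[] = [];
  let nextId: string | undefined = leafId;

  while (nextId) {
    const controller = new AbortController();
    // First relay to answer wins; the abort cancels the REQs still in flight.
    const event = await fetchEventFromRelays(relays, nextId, controller.signal);
    controller.abort();
    thread.push(event);
    // An "e" tag points at an ancestor; take it as the parent to fetch next.
    nextId = event.tags.find((t) => t[0] === "e")?.[1];
  }
  return thread.reverse(); // root first
}
```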