Reactions are inefficient. There needs to be an aggregate kind.
Nostr needs a better way to handle likes/reactions. NIP-25 is a terribly inefficient way to do it. Burying reactions in the tags requires clients to do a huge amount of data gathering to properly show likes/reactions. Having an event kind that represents a tally of all likes for an event would be far more efficient; otherwise the client has to gather potentially thousands of events just to count up the likes for a note. I'll be creating a separate db table that mines and sums the reactions, but that creates a centralized solution rather than a decentralized one.
While I understand the efficiency problem, and tallying them would massively alleviate it, it breaks the trust model because clients don't trust relays.
Now, how many upvotes there are isn't something that needs high trust, IMHO, unless you want to. So here's my slightly modified proposal: We leave NIP-25 and clients continue to generate NIP-25 events, but clients that don't want to be flooded with individual reactions can instead query the relays for a total. If they instead were paranoid, they could query the kind 7 events directly and total them up for themselves.
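For concreteness, a minimal sketch of what such a relay-side total query could look like, assuming a hypothetical COUNT verb that reuses REQ's filter language (the verb name and response shape here are illustrative, not an agreed spec):

```typescript
// Hypothetical COUNT verb: same filter language as REQ, but the relay
// answers with one total instead of streaming every kind 7 event.
const subId = "tally-1";

// client -> relay
const request = JSON.stringify([
  "COUNT",
  subId,
  { kinds: [7], "#e": ["<id of the note being tallied>"] },
]);

// relay -> client: one small frame instead of thousands of events
// ["COUNT", "tally-1", { "count": 1234 }]
```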
AND while we are on the topic, if we are going to do that for the reactions performance problem, we should also do it for the "how many followers do I have" performance problem.
My point being, these would be ADDITIONAL ways to get the data in a LESS trustworthy but much more efficient manner.
The other fly in the ointment is that no relay actually has all the data. So clients will need to somehow merge totals from multiple relays, and in doing so not double-count. That might require some advanced algebra that I'm not aware of.
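To make the double-counting problem concrete: two relays may hold largely the same reaction events, so their counts can't simply be added. With opaque counts, about the best a client can do is treat each relay's number as a lower bound and take the maximum, as in this sketch (the merge helper is hypothetical):

```typescript
// Plain per-relay counts can't be summed: two relays that saw mostly
// the same reaction events would be double-counted. Without per-pubkey
// data, the safest merge is to treat each count as a lower bound and
// take the maximum.
function mergeOpaqueCounts(countsByRelay: Map<string, number>): number {
  return Math.max(0, ...countsByRelay.values());
}

// Relay A saw 120 likes, relay B saw 90, with unknown overlap:
// 120 + 90 = 210 would overcount, but 120 is a defensible lower bound.
const lowerBound = mergeOpaqueCounts(
  new Map([
    ["wss://relay-a.example", 120],
    ["wss://relay-b.example", 90],
  ])
);
```

An exact merge needs set semantics rather than bare numbers, e.g. per-pubkey reaction lists like those proposed further down.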
Just as a data point, twitter does 3 million transactions per second to calculate and display the view count, and they can centralize and aggregate: https://www.youtube.com/watch?v=2tjCPXE0_Ko
This service seems more appropriate at the user level because aggregation at the relay level just introduces the problem of reconciliation of aggregates across relays. Rather than anoint relays with this privilege/burden, anybody could instead publish a replaceable "aggregation event" as a reply to an event. Clients could subscribe to an aggregator like any other user and for any event they've replied to with an aggregation event, the aggregation event could be used to populate the reaction counts. Subscribing to multiple aggregators could help to keep them honest because significant disagreement would raise an alarm and it's easy to audit their work.
> This service seems more appropriate at the user level because aggregation at the relay level just introduces the problem of reconciliation of aggregates across relays. Rather than anoint relays with this privilege/burden, anybody could instead publish a replaceable "aggregation event" as a reply to an event.
Aggregations at the user event level are useless because they're probably outdated by the time you get them. It's better to request a COUNT from a relay; that way it's always up to date and doesn't occupy unnecessary space. Of course, the client should store and compare/merge those counts from multiple relays, update its own count number, and display that. I wouldn't trust numbers from a pubkey except maybe some specialized ones that did that as a service, but they'd be fetching that data from somewhere somehow, and the best way to do that would be with a COUNT from relays.
there could also be a NIP that implements getting stats for a set of ids via REST:
```
POST ids -> /stats

[
  { "id": "abcd...", "replies": 12, "reactions": 100, "reports": 100, "quoted_boosts": 10 },
  ...
]
```
this could be very efficient and could aggregate many relays, have caching, etc.
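A sketch of the client side of that hypothetical endpoint; the URL, request body, and response shape are assumptions taken from the example above, not an existing API:

```typescript
// Hypothetical stats service matching the example above. Nothing here
// is a standardized API.
interface EventStats {
  id: string;
  replies: number;
  reactions: number;
  reports: number;
  quoted_boosts: number;
}

async function fetchStats(ids: string[]): Promise<EventStats[]> {
  const res = await fetch("https://stats.example.com/stats", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ ids }),
  });
  if (!res.ok) throw new Error(`stats request failed: ${res.status}`);
  return (await res.json()) as EventStats[];
}
```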
> there could also be a NIP that implements getting stats for a set of ids via REST:
> POST ids -> /stats
> [ { "id": "abcd...", "replies": 12, "reactions": 100, "reports": 100, "quoted_boosts": 10 }, ... ]
+1 Even better
> Aggregations at the user event level are useless because they're probably outdated by the time you get them
The idea is that the aggregation event is replaceable (NIP-33) and periodically updated by the aggregator who receives all reaction events from all relays.
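A sketch of what such a replaceable aggregation event could look like, using NIP-33's parameterized-replaceable convention with the target event id as the "d" tag; the kind number and content layout are made-up assumptions:

```typescript
// Illustrative only: kind 30088 is an arbitrary number in NIP-33's
// parameterized-replaceable range, not an agreed assignment. The "d"
// tag is the target event id, so each new tally replaces the last.
const aggregationEvent = {
  kind: 30088,
  pubkey: "<aggregator pubkey>",
  created_at: Math.floor(Date.now() / 1000),
  tags: [
    ["d", "<id of the event being tallied>"], // replaceability key
    ["e", "<id of the event being tallied>"], // normal reference for queries
  ],
  content: JSON.stringify({ "❤️": 69, "🤙": 420 }),
};
```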
> I wouldn't trust numbers from a pubkey except maybe some specialized ones that did that as a service
The expectation is that a set of trustworthy aggregators would emerge because they effectively keep each other honest.
> there could also be a NIP that implements getting stats for a set of ids via REST
Isn't nostr the natural place to publish these results? As an aggregator, I'd prefer to not have to run a web server, and nostr relays are already set up to do the work of serving this kind of data. Clients would also automatically get updated stats over the websocket which seems nice.
I would like it if there was a solution to provide information over the existing websocket rather than having to additionally use REST, as it adds more complexity (even though pretty simple) into my application.
My naive approach would be to have additional properties added onto the event JSON which can be ignored when doing verification. If the server already has the data in memory (or a fast data store), it doesn't take too much to simply inject the extra data points, which could have their own NIPs.
Another approach would be a new kind of REQ method for data points that don't follow the event structure and are purely for service purposes.
Aggregation data changes constantly, so HTTP caching would be temporary at most. This still faces all the same problems described here, but allows the developer to continue working off the existing API approach. But perhaps this breaks things and I don't know what I'm talking about, as my studying on this is pretty minimal.
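For what it's worth, the extra-properties idea shouldn't break verification: a nostr event id is the SHA-256 of a fixed serialization of six fields per NIP-01, so an extra top-level property injected by a relay never enters the hash. A minimal sketch of that id computation:

```typescript
import { createHash } from "crypto";

// NIP-01: the event id commits to exactly these six fields in this
// array layout. An extra top-level property injected by a relay
// (e.g. "tallies") never enters the hash, so id/signature checks
// still pass as long as the client ignores unknown fields.
function computeEventId(ev: {
  pubkey: string;
  created_at: number;
  kind: number;
  tags: string[][];
  content: string;
}): string {
  const serialized = JSON.stringify([
    0,
    ev.pubkey,
    ev.created_at,
    ev.kind,
    ev.tags,
    ev.content,
  ]);
  return createHash("sha256").update(serialized).digest("hex");
}
```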
> My naive approach would be to have additional properties added onto the event JSON which can be ignored when doing verification. If the server already has the data in memory (or a fast data store), it doesn't take too much to simply inject the extra data points, which could have their own NIPs.
This is my preferred approach.
yes this is fine
> My naive approach would be to have additional properties added onto the event JSON which can be ignored when doing verification. If the server already has the data in memory (or a fast data store), it doesn't take too much to simply inject the extra data points, which could have their own NIPs.
> This is my preferred approach.
@fiatjaf Do we need a NIP to do this? Or can we agree on a field name for tallies? Also, should relays always include tallies or should there be a filter for this?
@Cameri Of course you can implement it without a NIP first. Clients might get confused by extra fields in events though. To then use that field, we definitely should nip it. Nips are how we agree on things. I would only add the extra field on events that have reactions and as it's not for free to add this, you probably want to control it somehow. Some way of either switching it on for a connection or a new command like REQ_WITH_REACTIONS ...
@lidstrom83 while tallying by non-relays would be nice, it's very costly. Such a service would have to send an event for every :+1:, and with NIP-33 the relay would save disk space but pay for it with CPU and disk I/O. Replacing events is relatively expensive.
> Some way of either switching it on for a connection or a new command like REQ_WITH_REACTIONS ...
```
{
  ...
  "tallies": {
    "❤️": 69,
    "🤙": 420
  },
  ...
}
```
these tallies, are they global? like, is the relay counting from all events it has? or is this derived from the request?
> these tallies, are they global? like, is the relay counting from all events it has? or is this derived from the request?
Not sure what you mean by global. This is an example of what tallies could look like on an event.
> Not sure what you mean by global.
those numbers are coming from all reactions the relay has seen or from a list of authors on the request?
> those numbers are coming from all reactions the relay has seen or from a list of authors on the request?
They came from me typing them. We don't have any of this implemented anywhere nor do I know the answer on how it's going to work.
new aggregation requirement: zaps with amounts counted from the bolt11. not sure if there's a general way to implement this, I will need custom magic regardless it seems like. especially since these are only counted if the zaps come from a zapper pubkey associated with a user's lnurl. for now I will continue pulling all of the zaps and count them client side.
What are zaps?
> new aggregation requirement: zaps with amounts counted from the bolt11. not sure if there's a general way to implement this, I will need custom magic regardless it seems like. especially since these are only counted if the zaps come from a zapper pubkey associated with a user's lnurl. for now I will continue pulling all of the zaps and count them client side.
What does a single zap event look like?
@Cameri how feasible are these tallies for you to implement? Would you keep a separate cache for them somewhere?
@scsibug @atdixon how do you see this?
> @Cameri how feasible are these tallies for you to implement? Would you keep a separate cache for them somewhere?
> @scsibug @atdixon how do you see this?
I can cache reactions as they come in using Redis and include them as events go out. But if clients don't request the event they won't see updates, so part of the deduplication logic would be to keep the highest tallies or something. Maybe we need to include a timestamp of the tally?
We could also just send unsigned events with the tallies.
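A sketch of how a timestamped tally and the client-side reconciliation could look; the field names are invented for illustration:

```typescript
// Invented shape: a relay's tally snapshot plus when it was computed,
// so clients can prefer fresher data over stale cached numbers.
interface TallySnapshot {
  tallied_at: number; // unix seconds, the "timestamp of the tally"
  tallies: Record<string, number>; // emoji -> count
}

// Keep only the newest snapshot per relay elsewhere; when displaying,
// take the per-emoji maximum across relays. Counts from different
// relays overlap, so the maximum is a safe lower bound, not an exact total.
function displayTallies(latestPerRelay: TallySnapshot[]): Record<string, number> {
  const out: Record<string, number> = {};
  for (const snap of latestPerRelay) {
    for (const [emoji, count] of Object.entries(snap.tallies)) {
      out[emoji] = Math.max(out[emoji] ?? 0, count);
    }
  }
  return out;
}
```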
> What are zaps?
the kind 9735 thing I posted in telegram a while back. still need to write it up but damus and snort already implement it.
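A rough sketch of the client-side zap counting described above: pull kind 9735 receipts, keep only those from the expected zapper pubkey, and sum the bolt11 amounts. The bolt11 parsing is deliberately simplified (amount and multiplier read from the human-readable prefix), and the receipt shape is pared down to the fields used; treat all of it as an assumption-laden illustration, not a spec:

```typescript
// Minimal receipt shape used here (assumption: only what the sketch needs).
interface ZapReceipt {
  kind: number; // 9735
  pubkey: string; // should match the zapper pubkey from the author's lnurl metadata
  tags: string[][];
}

// bolt11 amount multipliers are fractions of one BTC.
const MULTIPLIER: Record<string, number> = { m: 1e-3, u: 1e-6, n: 1e-9, p: 1e-12 };

// Rough parse of e.g. "lnbc2500u1p..." -> 250000 sats. Amountless or
// unparsable invoices count as zero. A real client should use a proper
// bolt11 decoder.
function bolt11AmountSats(invoice: string): number {
  const m = /^ln[a-z]+?(\d+)([munp])?1/.exec(invoice.toLowerCase());
  if (!m) return 0;
  const btc = Number(m[1]) * (m[2] ? MULTIPLIER[m[2]] : 1);
  return Math.round(btc * 1e8);
}

// Sum the zaps on a note, counting only receipts signed by the expected
// zapper pubkey, as described above.
function sumZapSats(receipts: ZapReceipt[], trustedZapper: string): number {
  return receipts
    .filter((ev) => ev.kind === 9735 && ev.pubkey === trustedZapper)
    .map((ev) => ev.tags.find((t) => t[0] === "bolt11")?.[1] ?? "")
    .reduce((sum, inv) => sum + bolt11AmountSats(inv), 0);
}
```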
Serving each individual like is definitely not great / ultimately untenable. So some aggregation solution is going to be warranted.
In any case, I'd make it a separate request (instead of tacking onto events) so that tallies can be refreshed without re-fetching events, and keeping it out of band keeps things simple.
But - controversial opinion maybe? - I don't think relays should take this on or have it in nips/spec. There are too many issues ... how does a client sensibly merge tallies across relays? what do "paranoid" clients do...the most skeptical tallies are going to be high-volume reactions...so does client go back and request them all? As far as implementation goes, it's not trivial...counters always suck and here you have to de-dupe extra likes from the same pubkeys? ...
Why not consider this a 3rd party provider thing? A simple focused tally service and given clients can integrate with their preferred service choice and forward on tips or what-have-you. (Think of tally service more like search service this way, not something in spec/nips ... or if these are, make them separate specs from relay specs and not websockets, which are not fitted to search-me-a-thing or refresh-me-some-reaction-tallies.)
> those numbers are coming from all reactions the relay has seen or from a list of authors on the request?
I think people want to see a count of all reactions on the relay, not filtered by the filter, except insofar as the filter selects the event the tallies are attached to. But that's just what I think.
> I think people want to see a count of all reactions on the relay, not filtered by the filter, except insofar as the filter selects the event the tallies are attached to.
in that case, that feature will be useless (for me at least) once people simply autobot those counts with multiple pubkey farms. I don't think it should be encouraged to return such counts without being given some filter, but that's just my opinion of course... I understand that it would alleviate much pain... maybe I need to rethink how reactions are meant to be used if that's the way it's supposed to work.
> how does a client sensibly merge tallies across relays? what do "paranoid" clients do...
It might be more data than simple counts, but something like this could be merged across relays and is still less data than pulling each reaction event. I'm not sure about it, I'm just throwing it out there.
```
{
  "❤️": ["pubkey1", "pubkey2", ..., "pubkeyN"],
  "🤙": ["more", "pub", "keys", "that", "are only", "valid if in", "just one", "reaction"]
}
```
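A sketch of how these per-emoji pubkey lists could be merged across relays: set union makes the merged tally exact, since the same pubkey seen on several relays is only counted once (names here are illustrative):

```typescript
type ReactionSets = Record<string, string[]>; // emoji -> reacting pubkeys

// Union per emoji: a pubkey seen on several relays still counts once,
// so the merged tally is exact rather than a lower bound.
function mergeReactionSets(perRelay: ReactionSets[]): Record<string, number> {
  const merged = new Map<string, Set<string>>();
  for (const relay of perRelay) {
    for (const [emoji, pubkeys] of Object.entries(relay)) {
      const set = merged.get(emoji) ?? new Set<string>();
      for (const pk of pubkeys) set.add(pk);
      merged.set(emoji, set);
    }
  }
  return Object.fromEntries(
    [...merged.entries()].map(([emoji, set]) => [emoji, set.size])
  );
}
```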
> It might be more data than simple counts, but something like this could be merged across relays and is still less data than pulling each reaction event.
this would return multiple arrays of millions of pubkeys at some point