activitypub
activitypub copied to clipboard
Undo activity without an ID on the object activity
Is it possible to process an Undo activity without an ID on the object activity? For example, to undo a Like
activity:
{
"@context": "https://w3c.org/ns/activitystreams",
"actor": "https:/example.com/actor/7",
"type": "Undo",
"id": "https://example.com/activity/undo/4",
"object": {
"type": "Like",
"object": {
"id": "https://example.org/note/17",
"type": "Note"
}
}
}
If the likes
collection of the object is unique by actor
, it is possible to execute the side effect of this activity without needing the id of the Like
.
Similarly, for undoing a Follow
activity:
{
"@context": "https://w3c.org/ns/activitystreams",
"actor": "https:/example.com/actor/19",
"type": "Undo",
"id": "https://example.com/activity/undo/5",
"object": {
"type": "Follow",
"object": {
"id": "https://example.org/actor/256",
"type": "Person"
}
}
}
If the followers
collection of the actor that is being unfollowed is unique by id
, then it should possible to execute the side effect of removing the Undo activity's actor from the followers
collection without needed the exact id of the Follow
activity.
I think the spec (or recommendations) should just require IDs, and not allow unfollows by actor/shape.
Accept
and Reject
may also have the same issue, since you can accept or reject an activity, and that activity may be unique by object id, or actor id, not necessarily by activity id.
The problem occurs when the object
is a Like
or a Follow
, which has been decided to be idempotent. it's not unique to Undo
, as it also happens with Accept/Reject Follow
and Undo Accept Follow
.
EDIT: it happens for the following "stateful" activities:
- Follow
- Block
- Like (per earlier discussion)
- Announce? (see conversation below)
@trwnh I think you're trying to suggest that there was a recent decision to make the elements of following
, followers
, liked
, likes
, shares
, etc. unique? That is not the case; it's implied in the original documents, and most implementations treat them this way. It's good for us to highlight these hidden assumptions and provide guidance on them, but I don't think it's fair to say there was a new decision on this topic.
I meant with the triage of #381 it was decided to make the implication explicit, at least within the primer. Not necessarily that a new decision was made, but a decision was made at some point, and that decision was clarified during the issue triage yesterday.
Although I have to raise another point of contention: when was it decided that Announce
should be idempotent with respect to shares
? This is starting to feel like implementation decisions are being raised into the spec itself. For example, if Tumblr decides to implement reblogs as Announce
then they will want to support multiple shares
most likely. And the original intent with #381 was to clarify cases where a "like" is a kind of post, not a binary signal. This is also why I brought up in #381 that there should be more clarity on the split between a "resource" and a "procedure".
Taking Follow
as an example, if we say that following is idempotent and id
doesn't matter (because implementers don't want to keep track of the ids of every single Follow and Accept Follow in order to Undo it), then we are basically saying the Follow activity as a resource doesn't matter. All that matters is the procedure. And this sort of makes sense considering how unwieldy it is to handle an activity such as Undo Accept Follow Person, which contains three nested objects and two of the ids don't matter. This led to Mastodon deciding to send a Reject Follow Person instead, where the id of the Follow still doesn't matter. But then you have cases like https://github.com/misskey-dev/misskey/issues/9250 / https://github.com/misskey-dev/misskey/issues/11015 where Misskey assumes that ids are required, so they send a local id because they don't have the original id... and Pleroma thinks that the id is then invalid, despite not really caring for the id of the resource. So either the spec should account for this, by explicitly saying the id of the Follow doesn't need to be stored, and/or that all implementations MUST treat all idempotent actions as null id... or some other activity needs to be defined for doing the same thing. This is again what I meant by describing the "shape" of the activity -- by inlining or embedding the Like or Follow, you don't need an id. Explicitly declaring this would prevent cases like the "remove follower" bug which remains an issue to this day -- Misskey would have no need to send a local id, and Pleroma would have no need to validate the id at all.
Although I have to raise another point of contention: when was it decided that Announce should be idempotent with respect to shares? This is starting to feel like implementation decisions are being raised into the spec itself. For example, if Tumblr decides to implement reblogs as Announce then they will want to support multiple shares most likely.
@trwnh Idempotency is generally a desirable property in a distributed system. The connection between servers could be flaky and the sender may see a request timeout and retry activity delivery not knowing that recipient already processed it.
I think for multiple shares it would be better to add a property specifying the total number of shares to Announce
activity and maybe send Update(Announce)
activity after each share.
@silverpill a primary characteristic of Tumblr style reblogs is that they continue to exist after reblogging again, each with their own timestamp and permalink. In IndieWeb this would be possible due to how u-repost-of
and h-cite
work. Multiple reposts can be tracked naturally because each repost is a resource with a URI. Update Announce
would not work because shares
continues to have only one entry.
Idempotency generally happens as a property of having some idempotency key, in this case the id
of the activity. The question is whether we should enforce additional spec-wide uniqueness constraints that may or may not have been implied initially -- and in doing so, we need to determine how to deal with the fact that id
no longer matters entirely. In the case of Follow
, this seems mostly uncontroversial, although it creates a bit of a tricky state machine; in the case of Like
, this seems a bit controversial but popular opinion tends to agree, especially due to the existence of the liked
collection and how it contains objects instead of activities, implying an intent of uniqueness; in the case of Announce
this is a bit more controversial. Already there are UX annoyances around boosting a status multiple times in Mastodon, for example, with having to unboost before boosting again, and the common API error "422 reblog of status already taken" that happens if you do this too quickly.
I think the best resolution for this problem is guidance within the Primer for referring to activities. We can apply the Postel principle and say that for best interoperability, the publisher should include the ID, and the consumer should accept activities without IDs if they can identify them uniquely another way.
I'll add this section to the Primer, and we can consider from there.
https://www.w3.org/wiki/ActivityPub/Primer/Referring_to_activities
We agreed during triage that it does make sense for consumers to make an effort to identify activities with no id if other attributes uniquely identify them. We had some disagreement about whether consumers should identify activities with an id property that hasn't been seen before, possibly because the publisher did not store the id it used originally. So, the guideline there is somewhat weaker.
@evanp From Primer:
It is helpful to have an id property for these referred activities, but not all systems will store the activity ids over time. This can make it difficult to refer to those activities later.
Any examples of such systems? In my experience, the majority of AP implementations have no problem with including referred activity ID in Undo, Accept and Reject. Yes, it is possible to identify referred activities by shape, but I think this should not be a recommendation.
In particular, this guideline should not be applied to Undo, Accept and Reject:
Consumers should make an effort to process activities without an id if those activities can be identified uniquely by other properties. See below for some ways to identify different types of activities.
Because that would shift the burden from a few buggy implementations to everyone else.
Postel principle is not universal. It should be used carefully as its application leads to protocol decay: https://datatracker.ietf.org/doc/html/draft-thomson-postel-was-wrong-00
I suggest the following changes to Primer:
- State that implementations MUST store IDs of Like, Follow and Announce activities.
- Change guideline to Consumers MAY make an effort to process activities without an id if those activities can be identified uniquely by other properties. See below for some ways to identify different types of activities.
@silverpill I think Misskey and Mastodon have come up as examples that don't keep persistent activity IDs for some activities.
In terms of the guidelines, they're helpful tips for what developers should do to maximize interoperability. They're not normative requirements; there's no MUST possible.
If you want to maximize interoperability, the Postel principle is a good one to follow.
I agree that publishers should include IDs for maximum interop, but also that consumers should try to handle ID-less activities. If both sides do some extra work, we can avoid interop failures.
Activity Streams 2.0 does not require an ID for pretty much anything; ActivityPub doesn't require it for referenced activities. In the absence of an ID, the guidelines hold.
@trwnh I'm moving the conversation about multiple Announce activities to another issue.
@silverpill another good example might be for C2S! A client that wants to let a user undo a follow needs to first find the object in following
, then scroll through the outbox
until they find a matching Follow
activity. That might require many, many pages! Given that the server has all the information locally, it seems easier for all parties to leave it up to the server to determine which activity to Undo
.
I think Misskey and Mastodon have come up as examples that don't keep persistent activity IDs for some activities.
@evanp I have data on Mastodon and it uses the same activity ID in Follow
and Undo(Follow)
. It also uses correct ID of remote Follow
activity in Accept(Follow)
. I don't have data on Misskey, but it probably does the right thing too (otherwise someone would have noticed).
Perhaps these implementations didn't store activity IDs in the past? I don't know.
In terms of the guidelines, they're helpful tips for what developers should do to maximize interoperability. They're not normative requirements; there's no MUST possible.
What about "should store IDs of Like, Follow and Announce activities"? The Primer is authoritative and developers will follow its recommendations. Maximizing interoperability at all costs would mean accounting for quirks of 50+ different implementations, this is not sustainable. Reduction of technical debt and interop improvements should go hand in hand.
another good example might be for C2S!
Quite possible. There might be cases where identification by shape makes the most sense. But for Undo/Accept/Reject in S2S context this is not necessary because using IDs is a de-facto standard (even if the spec doesn't require them).
This issue has been mentioned on SocialHub. There might be relevant details there:
https://socialhub.activitypub.rocks/t/follow-accept-object-identifiers/3851/9
This issue has been mentioned on SocialHub. There might be relevant details there:
https://socialhub.activitypub.rocks/t/an-fep-for-follow-accept-mechanics/4266/5
I suggest the following changes to Primer:
* State that implementations MUST store IDs of Like, Follow and Announce activities. * Change guideline to _Consumers **MAY** make an effort to process activities without an id if those activities can be identified uniquely by other properties. See below for some ways to identify different types of activities._
I think that's a fair resolutions. I changed it from "should" to "may". It probably also makes sense in general for us to avoid SHOULD, MUST, MAY, and other RFC 2119 terms in the Primer, since it is helpful guidance, not normative specification.
I'm going to leave this as pending closure until we hear back from @silverpill .
@evanp Thank you for making this change.
Also +1 for avoiding RFC-2119 key words.
I might suggest a couple of small revisions to more clearly and fully avoid RFC-2119 keywords...
State that it is mandatory that implementations store IDs of Like, Follow, and Announce activities.
Change guideline to Consumers CAN make an effort to process activities without an ID if those activities can be identified uniquely by other properties. See below for some ways to identify different types of activities.
(I know, lowercase may
wouldn't be strictly RFC-2119, but lowercase keywords are often read by humans as if they were uppercase, so I find it's best to be case-insensitive about avoiding those keywords.)
i'll assume that @silverpill is ok with even less normative language of "can", so since objections are resolved, ill close this issue.