wordpress-activitypub
Feature Request: don't send Update when nothing's changed?
What
I know this is subjective, like, someone could've set up their AP content "template" to include a custom field, for instance.
But: I've been messing with an older post of mine, but I haven't actually updated its contents or anything. Yet, every time I click "Update," all my followers' servers get sent an Update activity ...
Why
To avoid issues with remote servers, and to save on CPU and bandwidth.
How
Could we store a hash of the post content + title? Like on Create?
Or maybe even the entire post object? Either the WP object or the JSON version.
And compare against that before sending an update?
Just thinking out loud. Then if someone had custom fields and whatnot in their JSON, it'd show a difference and the Update would get sent. But if it was a "false alert," nothing would happen ...
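A minimal sketch of that compare-before-send idea, in Python for brevity rather than the plugin's PHP. The names `content_fingerprint` and `should_send_update` are hypothetical, and the `stored` dict merely stands in for wherever the hash would live (post meta, for instance); none of this is the plugin's actual API.

```python
import hashlib


def content_fingerprint(title: str, content: str) -> str:
    """Hash title and content; the NUL separator keeps the two fields distinct."""
    data = title + "\0" + content
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def should_send_update(post_id: int, title: str, content: str, stored: dict) -> bool:
    """Return True only if the fingerprint differs from the stored one.

    `stored` is a hypothetical stand-in for post meta; it is updated in place.
    """
    new_hash = content_fingerprint(title, content)
    if stored.get(post_id) == new_hash:
        return False  # nothing changed: skip the Update
    stored[post_id] = new_hash
    return True


stored = {}
first = should_send_update(1, "Hello", "Body", stored)      # no hash yet: send
second = should_send_update(1, "Hello", "Body", stored)     # unchanged: skip
third = should_send_update(1, "Hello", "Body v2", stored)   # content changed: send
```

Hashing only the title and content would ignore custom fields that end up in the federated JSON, which is the caveat raised above.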
E.g., if maybe somewhere around here you'd check against a hash of $wp_object minus $post_modified and $post_modified_gmt and maybe some other fields:
https://github.com/Automattic/wordpress-activitypub/blob/245cda8433b8319856cf3a656f5bd975ee383cfc/includes/class-activity-dispatcher.php#L66
Or extract the object from the final JSON, like somewhere around here: https://github.com/Automattic/wordpress-activitypub/blob/245cda8433b8319856cf3a656f5bd975ee383cfc/includes/class-activity-dispatcher.php#L160
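Hashing the whole object minus its volatile fields might look roughly like the following (a Python sketch, not the plugin's PHP). The dict is an assumed stand-in for the post object, `VOLATILE_FIELDS` and `stable_hash` are hypothetical names, and which fields count as volatile is an assumption.

```python
import hashlib
import json

# Fields assumed to change on every save without affecting what gets federated.
VOLATILE_FIELDS = {"post_modified", "post_modified_gmt"}


def stable_hash(post: dict) -> str:
    """Hash every field of the post except the volatile ones."""
    stable = {k: v for k, v in post.items() if k not in VOLATILE_FIELDS}
    # sort_keys makes the serialization independent of key order.
    payload = json.dumps(stable, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


before = {
    "ID": 7,
    "post_title": "Hi",
    "post_content": "Text",
    "post_modified": "2024-01-01 10:00:00",
    "post_modified_gmt": "2024-01-01 09:00:00",
}
# Same post re-saved later without any real change.
resaved = dict(before, post_modified="2024-02-01 10:00:00",
               post_modified_gmt="2024-02-01 09:00:00")
# Same post with edited content.
edited = dict(resaved, post_content="Text, revised")

unchanged = stable_hash(before) == stable_hash(resaved)
changed = stable_hash(before) != stable_hash(edited)
```

Because everything except the listed fields is hashed, any extra field that makes it into the object would register as a change.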
Except that makes it a bit harder to extract a post or comment ID (which you'd need if the hash were stored as post or comment meta). Though caching it using set_transient() or something would maybe even be better (and simpler); that way you could even use the object's AP ID (its URL, or a hash thereof to make sure it doesn't exceed max length) as a key.
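To illustrate the key idea: hashing the AP ID gives a key of fixed length however long the URL is. The Python sketch below uses a small in-memory class as a stand-in for a transient cache; `PREFIX`, `cache_key`, and `TransientStore` are hypothetical names and not WordPress functions.

```python
import hashlib
import time

PREFIX = "ap_hash_"  # hypothetical key prefix


def cache_key(ap_id: str) -> str:
    """Derive a fixed-length cache key from the object's AP ID (its URL)."""
    return PREFIX + hashlib.md5(ap_id.encode("utf-8")).hexdigest()


class TransientStore:
    """Tiny in-memory stand-in for a key/value cache with expiry."""

    def __init__(self):
        self._items = {}

    def set(self, key, value, ttl_seconds, now=None):
        now = time.time() if now is None else now
        self._items[key] = (value, now + ttl_seconds)

    def get(self, key, now=None):
        now = time.time() if now is None else now
        entry = self._items.get(key)
        if entry is None or entry[1] <= now:
            self._items.pop(key, None)  # drop expired entries
            return None
        return entry[0]


store = TransientStore()
key = cache_key("https://example.org/2024/01/" + "very-long-slug-" * 40)
store.set(key, "stored-content-hash", ttl_seconds=60, now=1000)
```

One trade-off of a cache with expiry: once the entry has expired there is nothing left to compare against, so the next save would presumably send an Update even if nothing changed.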
Or in the scheduler. E.g., don't schedule if nothing changed. But then it might still schedule before you undo a previous change, for instance.
Which is a more generic issue, I think. Right now you seem to only check the following to prevent a scheduled activity from being sent after all: https://github.com/Automattic/wordpress-activitypub/blob/245cda8433b8319856cf3a656f5bd975ee383cfc/includes/class-activity-dispatcher.php#L74
But what if the post got unpublished in the meantime? Or made private, or ...? How does that work? Especially when using a "real" cron job rather than inline "scheduling," there's a real possibility that posts change before they get sent out.
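One way to picture a guard against this is to re-read the post when the scheduled job actually runs, instead of trusting its state at scheduling time. This is a Python sketch under assumed names: `Post`, `still_federatable`, and `run_scheduled_send` are hypothetical, and the conditions checked are an assumption about what "still public" means.

```python
from dataclasses import dataclass


@dataclass
class Post:
    status: str          # e.g. "publish", "draft", "private", "trash"
    password: str = ""   # non-empty means password-protected


def still_federatable(post: Post) -> bool:
    """Assumed rule: only published, non-password-protected posts go out."""
    return post.status == "publish" and post.password == ""


def run_scheduled_send(post_id, load_post, send) -> bool:
    """Re-check the post's current state right before sending."""
    post = load_post(post_id)  # current state, not the state at scheduling time
    if post is None or not still_federatable(post):
        return False
    send(post)
    return True


database = {1: Post("publish"), 2: Post("publish")}
sent = []

# Post 2 is made private after scheduling but before the job runs.
database[2] = Post("private")

result_1 = run_scheduled_send(1, database.get, sent.append)
result_2 = run_scheduled_send(2, database.get, sent.append)
result_3 = run_scheduled_send(3, database.get, sent.append)  # deleted / unknown
```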
For the default WordPress post type, I can't even press the Update button (in the Gutenberg editor) unless I have changed something. Maybe it's something that could also be solved even earlier.
For the default WordPress post type, I can't even press the Update button (in the Gutenberg editor)[.]
Now I wonder what I did wrong, because I definitely can. *scratches head*
To be fair, what I was referring to ("messing with an older post") was trying to get a plugin to do something with a "meta box" upon save. May be a rather rare scenario.
Also, my current workaround (I'll share later) does also skip updates when only a post's tags are edited, for instance. That I can fix though. It also isn't very efficient, but it could be improved with an extra action hook.
Now I wonder what I did wrong, because I definitely can.
Update: when I'm editing an existing page, the Update button is indeed disabled ... but for posts it's not. But on another site it is, so this could be just me :-D
Update 2: It is because I still use (some) old-style meta boxes! Also, classic editor post types will see the same.
Same here for link maintenance; this doesn't warrant toots. But maybe this is not a plugin issue but an ActivityPub issue, where you should be able to indicate minor updates, which are visible*, but not tooted?
I think the issue here is that when a very old post, which was never federated in the first place, is updated, it does get sent to followers' servers (as an Update, perhaps, but they'll basically see it as a new post).
I currently use a custom callback function to prevent updates when nothing's changed; it could be adapted to prevent "actively" federating older posts. (The posts would still show up when searched for, which I think is perfectly acceptable.) But it's a bit "hacky." Still need to polish/adapt to the latest plugin version/master branch.
So maybe we need to introduce a flag that indicates whether a post was federated or not?!?
What happens if I temporarily disable the AP plugin? Can I then freely deal with link rot without posts being re-federated? Seems like a hassle?
If AP is disabled, nothing will get scheduled when saving (updating) posts, and nothing will be federated. There will not be a "reschedule" once the plugin is re-enabled.
Exactly the sought behaviour... 👍
I just updated a page, and only changed the page's author. Noticed that AP sent an update with the old author of the page, and the creation timestamp of the page (instead of the update timestamp). Should I raise a separate issue about this?
@wilenius The problem here: I do not think that Mastodon supports author changes, so this would not be possible at all.
So maybe we need to introduce a flag that indicates whether a post was federated or not?!?
I think such a flag now exists?
To prevent "redundant" updates, I currently store a hash in a custom field. Only "problem" is I'm using the activitypub_safe_remote_post_response hook, which runs once per inbox rather than once per activity. So I'm overwriting the hash a bunch of times the first time around (which is obviously not very efficient). And then for subsequent updates I check whether the post content / title / tags have actually changed by comparing a new hash with the stored one. What I'd really "need" is a hook that runs after all inboxes were posted to, but this works well enough for now.
Might be a good idea to not just store whether a post was federated but also the status per inbox. That could help with rescheduling failed requests (in case the receiving server is temporarily down, etc.). Another possibility could be to use a "proper" scheduling solution, if one exists.
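Tracking delivery status per inbox could be sketched as follows, in Python and with hypothetical names throughout (`deliver`, `pending`, the `"ok"`/`"failed"` labels). It only shows the bookkeeping; how and when retries are scheduled is left open.

```python
def deliver(inboxes, post_to_inbox, status=None):
    """Try each inbox not yet marked "ok" and return the updated status map."""
    status = dict(status or {})
    for inbox in inboxes:
        if status.get(inbox) == "ok":
            continue  # already delivered on an earlier attempt
        try:
            post_to_inbox(inbox)
            status[inbox] = "ok"
        except Exception:
            # Broad catch for the sketch: any failure marks the inbox for retry.
            status[inbox] = "failed"
    return status


def pending(status):
    """Inboxes that still need a (re)delivery attempt."""
    return sorted(inbox for inbox, state in status.items() if state != "ok")


inboxes = ["https://a.example/inbox", "https://b.example/inbox"]
down = {"https://b.example/inbox"}  # simulated temporarily unavailable server
calls = []


def fake_post(inbox):
    calls.append(inbox)
    if inbox in down:
        raise ConnectionError("server unavailable")


status = deliver(inboxes, fake_post)
retry_list = pending(status)              # only b.example is left

down.clear()                              # the remote server comes back
status = deliver(inboxes, fake_post, status)
```

With a status map like this, a stored content hash could also be written once, after the loop over all inboxes, rather than once per inbox.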
@janboddez how do you currently prevent "redundant" updates? I, too, sometimes go through old posts for maintenance and don't want to spam fediverse followers with updates. Currently the only way I know to avoid this is the workaround mentioned above, to disable AP.
Relatedly, riffing on an example @pfefferle mentioned over on the WP forum, where he said:
One example: Someone writes a post, that will be boosted a lot and he changes all the links to abusive or spammy sites afterwards, no one that boosted the site would realize, only if visiting the site directly.
Actually I would say if anything that's an argument against sending an update, which might push all those spammy links to the boosted post (!). Amazing spam vector right there.
[edited to break into two comments]
I think many of us would really appreciate an AP-level setting that controls whether and when updates are federated, as well as some sensible default. Here's a proposal for such a sensible default:
- if my changes to a post do not affect the material that is pushed to the fediverse (often, excerpt, post title, tags, image), then I don't think a post update on the WP side should trigger an AP update.
Here's why: if I do link rot maintenance or add a DOI to a custom field in my blog post (both common occurrences), nothing about the fediverse post changes, and so to push an update and classify the federated post as "revised" is counterintuitive and spammy. It will over time lead people to assign low value to fediverse updates from my blog and it will cost followers.
On the other hand, if I revise the whole excerpt or upload another featured image, the federated content has changed, and so this naturally counts as a revision.
@janboddez how do you currently prevent "redundant" updates? I, too, sometimes go through old posts for maintenance and don't want to spam fediverse followers with updates. Currently the only way I know to avoid this is the workaround mentioned above, to disable AP.
I have a (super hacky) custom mu-plugin in place that stores a "hash" of the post URL, title, author, content, type, and categories and tags in a custom field. It sends an update only when one of these has actually changed. Unfortunately, the initial hash is calculated quite a few times over, because the plugin lacks a more appropriate "hook." But other than that it works.
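The mu-plugin itself isn't shown in this thread, so the following is only a guess at the general shape of such a hash, sketched in Python rather than PHP. `post_fingerprint` is a hypothetical name, and sorting categories and tags (so that reordering them doesn't count as a change) is an assumption.

```python
import hashlib
import json


def post_fingerprint(url, title, author, content, post_type, categories, tags):
    """Hash an explicit list of fields; term order is ignored."""
    payload = json.dumps(
        {
            "url": url,
            "title": title,
            "author": author,
            "content": content,
            "type": post_type,
            "categories": sorted(categories),
            "tags": sorted(tags),
        },
        sort_keys=True,
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


base = post_fingerprint("https://example.org/p/1", "Title", "jan", "Body",
                        "post", ["notes"], ["php", "fediverse"])
reordered = post_fingerprint("https://example.org/p/1", "Title", "jan", "Body",
                             "post", ["notes"], ["fediverse", "php"])
retagged = post_fingerprint("https://example.org/p/1", "Title", "jan", "Body",
                            "post", ["notes"], ["php", "fediverse", "wordpress"])
```

Listing the fields explicitly means an edit to anything outside that list, such as a custom field, would not trigger an update.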
Fun fact: I had to explicitly add tags to the hash because I noticed that they weren't added initially. The plugin schedules a "Create" activity before tags are applied if you use the Gutenberg editor. (See also https://github.com/Automattic/wordpress-activitypub/issues/668.)
* if my changes to a post do not affect the material that is pushed to the fediverse (often, excerpt, post title, tags, image), then I don't think a post update on the WP side should trigger an AP update.
I think we should be aware that not federating updates under such circumstances might lead to a situation where abusive behaviour of bad actors is less prevented. Updates in the ActivityPub world also typically lead to notifications of the users that have boosted or replied to the original post.
I know that in the same way one might also argue that only federating an excerpt in the first place already opens that opportunity.
I think we should be aware that not federating updates under such circumstances might lead to a situation where abusive behaviour of bad actors is less prevented.
Can you work out the argument for me? Not sure I get it. As I note above in response to an example from @pfefferle, automatically pushing updates to federated posts is itself a considerable vector for spam or other abusive behaviour.
Say a WP post is federated and boosted a lot. If a bad actor wanted to abuse the AP plugin to, for instance, expose folks to malicious links by editing that WP post, thereby triggering an update and notification of said post, the current AP implementation is much riskier than the default I propose.
This issue is stale because it has been open 120 days with no activity. Remove stale label or comment or this will be closed in 5 days.