wordpress-webmention
Better Spam and Moderation Handling
There is a request for better handling in this area. A lot of people have commented that Akismet is marking webmentions as spam.
I'm opening this issue to brainstorm other ideas in this area. For example...
- Rearrange the position of, and pass parameters to, the webmention_comment_approve filter so that functions can auto-approve with more discretion: $comment_approved = apply_filters( 'webmention_comment_approve', WEBMENTION_COMMENT_APPROVE );
- Akismet has a pre_check_pingback function (https://plugins.svn.wordpress.org/akismet/trunk/class.akismet.php) that runs earlier than its other approvals. See if we can write something similar that is enabled when Akismet is running.
- Whitelist against the 'previously approved comment' setting in WordPress. There is a core suggestion about this for pingbacks: https://core.trac.wordpress.org/ticket/24241. It already has a suggested implementation we could adopt for this: if there was an approved webmention from that domain previously, approve subsequent ones.
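A minimal sketch of the third idea, assuming the source URLs of previously approved webmentions are already available as a plain array. The function name webmention_host_previously_approved() and its parameters are hypothetical, not part of the plugin or of WordPress core; a real implementation would have to look the approved sources up in the comments table.

```php
<?php
/**
 * Hypothetical helper: returns true if $source_url shares a host with any
 * URL in $approved_sources (the sources of previously approved webmentions).
 */
function webmention_host_previously_approved( string $source_url, array $approved_sources ): bool {
	$host = parse_url( $source_url, PHP_URL_HOST );
	if ( ! is_string( $host ) || '' === $host ) {
		return false; // Unparseable source: leave it for manual moderation.
	}
	$host = strtolower( $host );

	foreach ( $approved_sources as $approved ) {
		$approved_host = parse_url( $approved, PHP_URL_HOST );
		if ( is_string( $approved_host ) && strtolower( $approved_host ) === $host ) {
			return true;
		}
	}
	return false;
}
```

This compares hosts only, so every URL on a shared host is treated the same way.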
Re 3, what's the definition of "domain" here? I wouldn't want to pre-approve any e.g. Twitter (i.e. twitter.com) like or retweet webmention, only the ones for specific users (i.e. twitter.com/user123). Reason here is that occasionally I do see likes/retweets from accounts like "Buy f0ll0wers" and the like, which for obvious reasons I don't approve.
In this case, Bridgy would be the domain.
Which wouldn't be ideal, as that would cause the same problem with spam likes, retweets, etc.
Semantic Linkbacks gets the actual author URL from Twitter, for example, and stores it. That may be an option to discuss for that plugin, or to keep in mind when building this one.
The last set of commits added the IP but not the user agent.
Yeah sorry I just saw that wp_new_comment should handle the IP and user-agent by itself and removed my comment.
Yes, but the biggest issue is that Akismet runs before Semantic Linkbacks, therefore it doesn't have all possible data yet.
A bunch of the changes coming down the pike should address that order. I had the same issue with comment notification emails.
@dshanske but it should still set them based on proper defaults. $_SERVER['HTTP_USER_AGENT'] and $_SERVER['REMOTE_ADDR'] (and its various forwarded friends) will be available at the time wp_new_comment fires in class-webmention-receiver.php?
Perhaps Akismet is still learning how to process "webmention" types (some types get handled differently) and needs our help. It does learn: every time you approve a comment or mark it as spam, it should post to the /submit-ham or /submit-spam endpoints in their API.
The standard "comment" the plugin is generating is kind of spammy content too:
'This %s was mentioned on <a href="%s">%s</a>' (obviously this is filtered and adjusted by other plugins, but it's too late by then)
Maybe setting rel="nofollow" on that link would work in its favour? The spec even suggests this: if a link is displayed back to the source, the receiver SHOULD link to the source with rel="nofollow" to prevent spam.
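As a sketch of that suggestion, the default text could be built with rel="nofollow" on the source link. The function name and its arguments are hypothetical; inside WordPress one would normally use esc_url() and esc_html() instead of htmlspecialchars(), which is used here only to keep the example self-contained.

```php
<?php
/**
 * Hypothetical builder for the default webmention text, with a
 * rel="nofollow" link back to the source.
 */
function webmention_default_content( string $post_type, string $source_url, string $source_host ): string {
	return sprintf(
		'This %s was mentioned on <a href="%s" rel="nofollow">%s</a>',
		htmlspecialchars( $post_type, ENT_QUOTES ),
		htmlspecialchars( $source_url, ENT_QUOTES ),
		htmlspecialchars( $source_host, ENT_QUOTES )
	);
}
```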
I emailed Akismet. They are willing to take a look if we give them the API keys so they can check the logs.
It isn't just Akismet though...we need better options overall to control this.
I've recently implemented @dshanske's option 3 on a friend's site, detailed here: https://gregorlove.com/2017/12/one-of-the-known-issues/ and in the subsequent note. I hadn't seen this issue, but he directed me to comment here with my thoughts.
I'm not a daily WordPress user myself [so take with a grain of salt :)], but I think this is a good baseline functionality to include in the plugin. I would expect if I had checked “Comment author must have a previously approved comment” that it would apply to webmentions as well. I think it's a better user experience for indieweb site webmentions. Tweet spam is a concern to be addressed, but personally I prioritize the indieweb mentions as the baseline.
David mentioned he was working on a PR to whitelist comment domains. I haven't seen the code so don't know the details, but perhaps it could work as an additional layer on top of this.
[Hypothesizing]
- Enter a domain name that requires additional validation against the whitelist (https://twitter.com)
- In another field enter individual URLs under that domain that are whitelisted (https://twitter.com/gRegorLove)
- When a comment has a semantic_linkbacks_canonical starting with a domain in step 1
  - Override my snippet above to check if the semantic_linkbacks_canonical starts with one of the URLs in step 2
Twitter is the immediate example here, but it applies to any domain that has multiple profiles. This would work the same way to whitelist mentions from example.com/janedoe but not from example.com/johndoe.
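The two-field idea could look roughly like this. It is only a sketch: the function names, the three-way return value, and the case-insensitive comparison are all assumptions made for illustration, and $canonical stands in for whatever is stored in semantic_linkbacks_canonical.

```php
<?php
/**
 * Hypothetical prefix test: true if $url equals $prefix or continues it
 * with a "/", so https://twitter.com/user does not match .../userSpam.
 * Comparison is case-insensitive (an assumption that suits Twitter handles).
 */
function webmention_url_has_prefix( string $url, string $prefix ): bool {
	$url    = rtrim( strtolower( $url ), '/' );
	$prefix = rtrim( strtolower( $prefix ), '/' );
	return $url === $prefix || 0 === strpos( $url, $prefix . '/' );
}

/**
 * Hypothetical check for the two-field whitelist.
 *
 * @param string   $canonical      Canonical URL of the mention.
 * @param string[] $restricted     Step 1: domains needing per-URL validation.
 * @param string[] $approved_urls  Step 2: whitelisted URLs under those domains.
 * @return bool|null True or false for restricted domains, null when the
 *                   domain is not restricted and the normal rules apply.
 */
function webmention_check_restricted( string $canonical, array $restricted, array $approved_urls ): ?bool {
	$covered = false;
	foreach ( $restricted as $domain ) {
		if ( webmention_url_has_prefix( $canonical, $domain ) ) {
			$covered = true;
			break;
		}
	}
	if ( ! $covered ) {
		return null; // Not a restricted domain: fall back to the domain-level check.
	}
	foreach ( $approved_urls as $approved ) {
		if ( webmention_url_has_prefix( $canonical, $approved ) ) {
			return true;
		}
	}
	return false;
}
```

Returning null for unrestricted domains lets the simpler "previously approved" rule keep handling ordinary indieweb sites, while multi-profile domains get the stricter per-URL check.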
Related: https://github.com/pfefferle/wordpress-semantic-linkbacks/issues/38
Bumping this old issue, I'm noticing it a lot more now. I have this in my functions.php to prevent webmentions from ever getting marked as spam, but evidently it no longer works with 5.x. I'd vote to make this an option or even just do it for all users, but that's just me. 😁
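// Approve webmentions outright so they are never held or marked as spam.
function unspam_webmentions( $approved, $commentdata ) {
	$is_webmention = isset( $commentdata['comment_type'] ) && 'webmention' === $commentdata['comment_type'];
	// pre_comment_approved runs before the comment is saved, so only look up
	// comment meta when a comment_ID is actually present.
	$has_type = ! empty( $commentdata['comment_ID'] )
		&& get_comment_meta( $commentdata['comment_ID'], 'semantic_linkbacks_type', true );
	return ( $is_webmention || $has_type ) ? 1 : $approved;
}
add_filter( 'pre_comment_approved', 'unspam_webmentions', 99, 2 );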
The old thinking here needs an update, I think, but I agree with the overriding theme: the moderation handling should be updated.