wordpress-webmention

Better Spam and Moderation Handling

Open dshanske opened this issue 9 years ago • 14 comments

There is a request for better handling in this area. A lot of people have commented that Akismet is marking webmentions as spam.

I'm opening this issue to brainstorm other ideas in this area. For example...

  1. Rearrange the position of, and add parameters to, the webmention_comment_approve filter so that functions can auto-approve with more discretion. $comment_approved = apply_filters( 'webmention_comment_approve', WEBMENTION_COMMENT_APPROVE );
  2. Akismet has a pre_check_pingback function (https://plugins.svn.wordpress.org/akismet/trunk/class.akismet.php) that runs earlier than its other approval checks. See if we can write something similar that is enabled when Akismet is running.
  3. Whitelist against the 'previously approved comment' setting in WordPress. There is a core suggestion about this for pingbacks: https://core.trac.wordpress.org/ticket/24241 . It already has a suggested implementation we could adopt for this...if there was an approved webmention from that domain previously, approve subsequent ones.
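
Option 3 could be sketched roughly as follows. This is a minimal illustration, not code from the plugin or from the Trac ticket: the function name webmention_host_previously_approved is hypothetical, and the list of previously approved source URLs is passed in directly, whereas a real implementation would presumably look them up from approved comments.

```php
<?php
/**
 * Hypothetical helper: returns true when $source_url shares a host with
 * any URL in $approved_sources (sources of previously approved webmentions).
 */
function webmention_host_previously_approved( string $source_url, array $approved_sources ): bool {
	$host = parse_url( $source_url, PHP_URL_HOST );
	if ( ! is_string( $host ) || '' === $host ) {
		return false; // Unparseable source URL: leave it for moderation.
	}
	$host = strtolower( $host );

	foreach ( $approved_sources as $approved ) {
		$approved_host = parse_url( $approved, PHP_URL_HOST );
		if ( is_string( $approved_host ) && strtolower( $approved_host ) === $host ) {
			return true; // Same host as an earlier approved webmention.
		}
	}
	return false;
}
```

Matching on the host alone is the simplest reading of "domain"; it treats every page on a host as equally trustworthy, which may be too coarse for hosts that serve many different authors.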

dshanske avatar Jun 12 '16 17:06 dshanske

Re 3, what's the definition of "domain" here? I wouldn't want to pre-approve any e.g. Twitter (i.e. twitter.com) like or retweet webmention, only the ones for specific users (i.e. twitter.com/user123). Reason here is that occasionally I do see likes/retweets from accounts like "Buy f0ll0wers" and the like, which for obvious reasons I don't approve.

armingrewe avatar Jun 13 '16 19:06 armingrewe

In this case, Bridgy would be the domain.

dshanske avatar Jun 13 '16 19:06 dshanske

Which wouldn't be ideal, as that would cause the same problem with spam likes, retweets, etc.

armingrewe avatar Jun 13 '16 19:06 armingrewe

Semantic Linkbacks gets the actual author URL from Twitter, for example, and stores it. That may be an option to discuss for that plugin, or to keep in mind when building this one.

dshanske avatar Jun 15 '16 12:06 dshanske

The last set of commits added the IP but not the user agent.

dshanske avatar Jun 15 '16 13:06 dshanske

Yeah sorry I just saw that wp_new_comment should handle the IP and user-agent by itself and removed my comment.

Ruxton avatar Jun 15 '16 13:06 Ruxton

Yes, but the biggest issue is that Akismet runs before Semantic Linkbacks, therefore it doesn't have all possible data yet.

dshanske avatar Jun 15 '16 13:06 dshanske

A bunch of the changes coming down the pike should address that order. I had the same issue with comment notification emails.

dshanske avatar Jun 15 '16 13:06 dshanske

@dshanske but it should still be set from proper defaults. $_SERVER['HTTP_USER_AGENT'] and $_SERVER['REMOTE_ADDR'] (and its various forwarded-for friends) will be available at the time wp_new_comment fires in class-webmention-receiver.php?

Perhaps Akismet is still learning how to process "webmention" types (some types get handled differently) and needs our help; it does learn. Every time you approve a comment or mark one as spam, it should post to the /submit-ham or /submit-spam endpoint in their API.

The standard "comment" the plugin is generating is kind of spammy content too: 'This %s was mentioned on <a href="%s">%s</a>' (obviously this is filtered and adjusted by other plugins, but it's too late by then)

Maybe setting rel="nofollow" on that link would work in its favour? The spec even suggests this: if a link is displayed back to the source, it SHOULD link to the source with rel="nofollow" to prevent spam.
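
As a rough sketch of that suggestion, the default content could be built with rel="nofollow" on the link. The helper name webmention_default_content is hypothetical and not part of the plugin; plain htmlspecialchars() is used so the snippet runs outside WordPress, where esc_url() and esc_html() would be the usual choices. Whether this changes how Akismet scores the comment is untested.

```php
<?php
/**
 * Hypothetical helper: builds the default webmention comment text,
 * marking the link back to the source as rel="nofollow".
 */
function webmention_default_content( string $post_type, string $source_url, string $link_text ): string {
	return sprintf(
		'This %s was mentioned on <a rel="nofollow" href="%s">%s</a>',
		htmlspecialchars( $post_type, ENT_QUOTES, 'UTF-8' ),
		htmlspecialchars( $source_url, ENT_QUOTES, 'UTF-8' ),
		htmlspecialchars( $link_text, ENT_QUOTES, 'UTF-8' )
	);
}
```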

Ruxton avatar Jun 15 '16 13:06 Ruxton

I emailed Akismet. They are willing to take a look if we give them the API keys so they can check the logs.

dshanske avatar Jun 15 '16 13:06 dshanske

It isn't just Akismet though...we need better options overall to control this.

dshanske avatar Jun 15 '16 13:06 dshanske

I've recently implemented @dshanske's option 3 on a friend's site, detailed here: https://gregorlove.com/2017/12/one-of-the-known-issues/ and in the subsequent note. I hadn't seen this issue, but he directed me to comment here with my thoughts.

I'm not a daily WordPress user myself [so take with a grain of salt :)], but I think this is a good baseline functionality to include in the plugin. I would expect if I had checked “Comment author must have a previously approved comment” that it would apply to webmentions as well. I think it's a better user experience for indieweb site webmentions. Tweet spam is a concern to be addressed, but personally I prioritize the indieweb mentions as the baseline.

David mentioned he was working on a PR to whitelist comment domains. I haven't seen the code so don't know the details, but perhaps it could work as an additional layer on top of this.

[Hypothesizing]

  1. Enter a domain name that requires additional validation against the whitelist (https://twitter.com)
  2. In another field enter individual URLs under that domain that are whitelisted (https://twitter.com/gRegorLove)
  3. When a comment has a semantic_linkbacks_canonical starting with a domain in step 1
    • Override my snippet above to check if the semantic_linkbacks_canonical starts with one of the URLs in step 2

Twitter is the immediate example here, but it applies to any domain that has multiple profiles. This would work the same way to whitelist mentions from example.com/janedoe but not from example.com/johndoe.
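
The steps above might look something like this in code. It is only a sketch of the hypothesis: the function name canonical_passes_whitelist and both list parameters are made up for illustration, and the canonical URL is passed in as a string rather than read from the semantic_linkbacks_canonical meta. URLs are compared case-insensitively, which suits Twitter handles but is an assumption for other sites.

```php
<?php
/**
 * Hypothetical check for the steps above.
 *
 * $restricted_domains: URL prefixes that need the extra check (step 1).
 * $allowed_urls:       individually whitelisted URLs under them (step 2).
 * Returns false only when the canonical URL is under a restricted domain
 * and does not start with one of the whitelisted URLs.
 */
function canonical_passes_whitelist( string $canonical, array $restricted_domains, array $allowed_urls ): bool {
	// Prefix match on a path boundary, so that .../gRegorLove
	// does not also match .../gRegorLoveFake.
	$starts_with = static function ( string $url, string $prefix ): bool {
		$url    = strtolower( rtrim( $url, '/' ) );
		$prefix = strtolower( rtrim( $prefix, '/' ) );
		return $url === $prefix || 0 === strpos( $url, $prefix . '/' );
	};

	foreach ( $restricted_domains as $domain ) {
		if ( $starts_with( $canonical, $domain ) ) {
			// Step 3: restricted domain, so require a whitelisted URL.
			foreach ( $allowed_urls as $allowed ) {
				if ( $starts_with( $canonical, $allowed ) ) {
					return true;
				}
			}
			return false;
		}
	}
	// Not under a restricted domain: this layer does not object.
	return true;
}
```

A true result here only means the extra layer has no objection; the baseline "previously approved" check would still decide whether the mention is approved.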

Related: https://github.com/pfefferle/wordpress-semantic-linkbacks/issues/38

gRegorLove avatar Dec 31 '17 21:12 gRegorLove

Bumping this old issue, I'm noticing it a lot more now. I have this in my functions.php to prevent webmentions from ever getting marked as spam, but evidently it no longer works with 5.x. I'd vote to make this an option or even just do it for all users, but that's just me. 😁

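// Approve webmentions outright so they are never held as spam.
function unspam_webmentions( $approved, $commentdata ) {
	$is_webmention = isset( $commentdata['comment_type'] ) && 'webmention' === $commentdata['comment_type'];
	// comment_ID may not be set yet when pre_comment_approved runs, so guard the meta lookup.
	$has_linkback_type = ! empty( $commentdata['comment_ID'] )
		&& get_comment_meta( $commentdata['comment_ID'], 'semantic_linkbacks_type', true );
	return ( $is_webmention || $has_linkback_type ) ? 1 : $approved;
}
add_filter( 'pre_comment_approved', 'unspam_webmentions', 99, 2 );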

snarfed avatar Oct 31 '23 13:10 snarfed

The old ideas here need an update, I think, but I agree with the overriding theme: the moderation handling needs updating.

dshanske avatar Oct 31 '23 13:10 dshanske