Investigate spam bot accounts & add deterrent
Over the last few weeks we have seen a large spike in users registering and posting content to their user profile pages. These pages are public and appear to be used to plant deep links for SEO traffic via the Hypothes.is website.
Investigations are already under way and Cloudflare's automatic CAPTCHA-like deterrent has been enabled. We will wait 3-6 days to see whether this has had any effect, then make a decision on the following three steps:
- Put measures in place that are able to prevent the majority of spam accounts being created
- Find a way to delete existing spam accounts
- Put monitoring in place to help us detect future flare-ups
Tasks
- [ ] @robertknight - Add Cloudflare anti-bot protection to the email confirmation page
- [ ] @indigobravo - Add Slack notifications for unexpected surges of sign-ups (could be interesting, good or bad)
- [ ] Create a list of candidate email domains to remove
- [ ] Verify that the accounts on the list have no annotations
- [ ] Back up existing records
- [ ] Delete the accounts for the affected domains
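The candidate-list and vetting tasks above could be sketched roughly as follows. This is only an illustration: the `email` and `annotation_count` keys and the `min_accounts` threshold are assumptions, not the real schema or an agreed cut-off.

```python
from collections import Counter


def email_domain(email: str) -> str:
    """Return the lower-cased domain part of an email address."""
    return email.rsplit("@", 1)[-1].strip().lower()


def candidate_domains(users, min_accounts=3):
    """List domains that look like removal candidates.

    `users` is an iterable of dicts with the hypothetical keys
    'email' and 'annotation_count'. A domain qualifies only if it has
    at least `min_accounts` accounts and none of them has annotations.
    """
    totals = Counter()
    domains_with_annotations = set()
    for user in users:
        domain = email_domain(user["email"])
        totals[domain] += 1
        if user["annotation_count"] > 0:
            domains_with_annotations.add(domain)
    return sorted(
        domain
        for domain, count in totals.items()
        if count >= min_accounts and domain not in domains_with_annotations
    )
```

Excluding a whole domain as soon as one of its accounts has annotations is deliberately conservative; a per-account check would be an alternative.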
It was suggested:
- We could place a CAPTCHA on the registration page as well as the email confirmation page
- We think around 90% of the accounts could be accounted for and deleted. However, we should export this data first and then delete it, so we keep a record for future cases
- We could keep a list of the top email domains among sign-ups and monitor for sudden sign-up volumes well above the norm.
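As a rough sketch of the monitoring suggestion, the check below compares today's sign-up count with the median of recent daily counts and lists the most common email domains. The `factor` and `min_signups` thresholds are placeholder assumptions, and sending the actual Slack notification is left out.

```python
from collections import Counter
from statistics import median


def is_signup_surge(history, today, factor=3.0, min_signups=20):
    """Return True if `today` is well above the recent norm.

    `history` is a list of recent daily sign-up counts. The thresholds
    are illustrative assumptions, not tuned values.
    """
    if not history:
        return False  # no baseline to compare against
    baseline = median(history)
    return today >= min_signups and today > factor * baseline


def top_domains(emails, n=5):
    """Return the `n` most common email domains as (domain, count) pairs."""
    counts = Counter(email.rsplit("@", 1)[-1].lower() for email in emails)
    return counts.most_common(n)
```

Using the median rather than the mean keeps one earlier spike from inflating the baseline.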
> Put measures in place that are able to prevent the majority of spam accounts being created
We might also want to maintain a list of banned email providers and block them outright at the source?
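A minimal sketch of such a check at registration time, assuming the banned list is a plain set of domains. The domains shown are placeholders, and `is_banned_email` is a hypothetical helper, not an existing function in the codebase.

```python
# Placeholder entries; the real list would come from the investigation.
BANNED_EMAIL_DOMAINS = frozenset({"spam.example", "throwaway.example"})


def is_banned_email(email, banned=BANNED_EMAIL_DOMAINS):
    """Return True if the email's domain, or a parent domain, is banned."""
    domain = email.rsplit("@", 1)[-1].strip().lower()
    return any(domain == b or domain.endswith("." + b) for b in banned)
```

Matching subdomains as well as exact domains is a design choice here; it stops `mail.spam.example` from slipping through without blocking unrelated domains such as `notspam.example`.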
> We think around 90% of the accounts could be accounted for and deleted. However, we should export this data first then delete so we can keep a record for future cases
If we want to keep this data, then the database is probably the easiest place to store it using our current mechanisms. We could:
- Create new tables "archived_accounts" and "archived_groups"
- Use a DB migration to move things into these tables
- This would be a controlled process using our existing pipelines
- It's also reversible if we change our minds (with the caveat that people could grab the names etc. in the meantime)
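To illustrate the move-then-delete idea, here is a small sketch using Python's built-in `sqlite3` as a stand-in for the real database and migration tooling. The `accounts` table and its columns are assumptions; only the `archived_accounts` name comes from the proposal above.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory database with hypothetical tables and rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE accounts (id INTEGER PRIMARY KEY, username TEXT, email TEXT);
        CREATE TABLE archived_accounts (id INTEGER PRIMARY KEY, username TEXT, email TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?, ?)",
        [
            (1, "bot1", "bot1@spam.example"),
            (2, "bot2", "bot2@spam.example"),
            (3, "alice", "alice@university.example"),
        ],
    )
    conn.commit()
    return conn


def _move(conn, source, target, domain):
    """Copy rows for `domain` from `source` to `target`, then delete them."""
    pattern = "%@" + domain
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute(
            f"INSERT INTO {target} (id, username, email) "
            f"SELECT id, username, email FROM {source} WHERE email LIKE ?",
            (pattern,),
        )
        cur = conn.execute(f"DELETE FROM {source} WHERE email LIKE ?", (pattern,))
    return cur.rowcount


def archive_accounts(conn, domain):
    """Move accounts for `domain` into the archive; return rows moved."""
    return _move(conn, "accounts", "archived_accounts", domain)


def restore_accounts(conn, domain):
    """Reverse the archive step. In a real system this could fail if the
    usernames have been claimed by someone else in the meantime."""
    return _move(conn, "archived_accounts", "accounts", domain)
```

Doing the copy and the delete in one transaction means a failure part-way through leaves both tables unchanged.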