Add honeypot for email capture in footer on mozilla.org
Description
After discussion, we have decided to try a honeypot on our email capture form, before exploring captchas or additional email address validation.
The type of honeypot is TBD, and could be determined by the developer, or discussed here.
Note that we already have a HoneyPotWidget class that might prove useful: https://github.com/mozilla/bedrock/blob/db91287ece1b2dca016a0c2c8bf51da4e86953bf/bedrock/mozorg/forms.py#L40
Is there JS that handles that field being populated?
The field is simply hidden from the front-end, as far as I understand, so in theory bots would enter dummy information into it, and we could then reject the submission and throw an error.
We'd most likely need to make an addition to the JS in protocol to check that the honeypot field has an empty value: https://github.com/mozilla/protocol/blob/main/assets/js/protocol/newsletter.js
@slightlyoffbeat can you give an example of the main URLs where this seems to be a problem / coming from?
Another angle on this could be that, in addition to a traditional honeypot field (which we could police on the Basket end), we add a dynamically-generated-by-Python signature (e.g. an HMAC over a timestamp + nonce, keyed with a shared secret) to the newsletter form in a hidden field. That field should always get sent to Basket, which can then verify the signature is a) present and b) legit.
Because we cache pages in the CDN for 10 mins, we'd have to allow the verification to accept timestamps that are up to 10-and-a-bit mins old, but requiring that signature, combined with a honeypot, still significantly reduces our risk window for bot spam.
This may be over-thinking - @pmac @robhudson ?
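For illustration, here is a minimal sketch of the signature idea described above, using only the Python standard library. The names sign_form and verify_form, the token layout, the example secret, and the 11-minute tolerance are all assumptions for the sake of the example; none of this is existing bedrock or Basket code.

```python
import hashlib
import hmac
import secrets
import time

# Hypothetical values: a real secret and tolerance would come from configuration.
SHARED_SECRET = b"example-shared-secret"
MAX_AGE_SECONDS = 11 * 60  # 10-minute CDN cache plus a little slack


def _signature(secret, timestamp, nonce):
    """HMAC-SHA256 over 'timestamp:nonce', hex encoded."""
    message = f"{timestamp}:{nonce}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def sign_form(secret=SHARED_SECRET, now=None):
    """Rendering side: build a 'timestamp:nonce:signature' token for a hidden field."""
    timestamp = str(int(time.time() if now is None else now))
    nonce = secrets.token_hex(16)
    return f"{timestamp}:{nonce}:{_signature(secret, timestamp, nonce)}"


def verify_form(token, secret=SHARED_SECRET, now=None, max_age=MAX_AGE_SECONDS):
    """Receiving side: True only if the token is present, authentic, and fresh."""
    if not token:
        return False
    parts = token.split(":")
    if len(parts) != 3:
        return False
    timestamp, nonce, signature = parts
    try:
        issued_at = int(timestamp)
    except ValueError:
        return False
    expected = _signature(secret, timestamp, nonce)
    # Constant-time comparison avoids leaking how much of the signature matched.
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        return False
    current = time.time() if now is None else now
    return 0 <= current - issued_at <= max_age
```

One caveat with this sketch: because the rendered page is cached, every visitor within a cache window would receive the same token, so the nonce cannot act as a one-time value here. The signature only proves that the form was rendered by us recently.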
One thing that I'm not 100% on - if these submissions are coming from bots, then presumably those bots are executing JS, since our newsletters are now reliant on JS to post directly to Basket. In that case, are they automating hitting the website directly? If true, is a honeypot likely to work?
I don't know yet, either - @slightlyoffbeat may know more
If they're triggering the JS, then the honeypot may work unless they're good enough to spot a honeypot.
If they're hitting Basket directly with HTTP calls, then the signature/HMAC approach could help a lot.
Some newsletter management is JS directly triggering Basket API endpoints.
Some forms are still handled on bedrock, but most likely not the homepage footer newsletter anymore.
Some are still original form-action submissions, progressively enhanced to ajaxify the response, and may point e.g. to Basket subscription pages right away.
I believe the submission spam mentioned in this thread, coming from the redesigned homepage footer newsletter component, points directly to Basket. So any honeypot value added on bedrock form inputs needs to be evaluated — with any value control and rejection logic — subsequently on the receiving Basket side, for this specific form mentioned in the OP here.
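As a rough sketch of what that receiving-side check could look like, assuming the honeypot value arrives as an ordinary form field: the field name office_fax and the helper is_probable_bot are hypothetical, chosen only for illustration, and are not taken from Basket.

```python
# Hypothetical field name; the real one would have to match the bedrock form input.
HONEYPOT_FIELD = "office_fax"


def is_probable_bot(form_data, honeypot_field=HONEYPOT_FIELD):
    """Return True if the hidden honeypot field carries any non-blank value.

    Human visitors never see the field, so it should arrive empty or absent.
    """
    value = form_data.get(honeypot_field, "")
    return bool(str(value).strip())
```

A view handling the subscription could call is_probable_bot(request_data) before doing any other work and reject the submission when it returns True. Note that this only catches bots that fill in every field; a client posting directly without the field would pass this check.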
Alternatively, if we're thinking (or better, seeing proof) that bots are hitting basket directly, it might be simpler to enable/tune a WAF in front of basket
@localjo - I'm happy to explore that side with SRE, including confirming what we've currently got set up as a WAF, if this approach sounds like it'll help
@stevejalim Yes, I think it would be helpful to check with SRE for more details to help inform the best solution here.
I'm still waiting on SRE - will chase