faker
faker copied to clipboard
Faker is unsafe to use in automated tests - real urls and email addresses are generated
Describe the bug
Faker::Internet.url, and Faker::Internet.email generate potentially real domain names and email addresses. Using potentially real urls and email addresses in automated testing environments is a security vulnerability. Faker is extensively used in automated test environments to generate test data.
To Reproduce
Call Faker::Internet.url
and see that the result is a url that could resolve to a real domain name, with a tld of .com
.
Expected behavior
A TLD of .test
or .example
should be used instead. See RFC2606
Additional context
I understand that some users would like to see .com
urls in their Faker output. I'd like to suggest that this should be easy to do by opting in to this behaviour instead of it being the default. Perhaps a parameter for specifying the tld when generating an email address or url? The default behaviour however should be safe.
I can tackle this issue.
@patrick-beep please check https://github.com/faker-ruby/faker/pull/2506
Faker::Internet.email # => "[email protected]"
Faker::Config.internet_safe_mode = true
Faker::Internet.email # => "[email protected]"
Faker::Internet.url # => "http://marks.example/vern"
Faker::Internet.domain_name # => "brown.example"
We currently have Faker::Internet.safe_email
:
https://github.com/faker-ruby/faker/blob/2333f9ee4a4e1d8329e19f9e4b07e927648add7c/doc/default/internet.md?plain=1#L20
I assume going with this approach of configuring internet_safe_mode
could include adding a deprecation to Faker::Internet.safe_email
. Otherwise, I think having both would lead to confusion.
Thoughts?
I concur that #safe_email
could then be deprecated after internet_safe mode is enabled
I am wondering why we need to add config - if we already have a safe_email
method?
And can we not also add a safe_url method.
From my perspective it seems like a smaller change, also more inline with the existing convention perhaps.
@fbuys to be better internet citizens, we want to update Faker to create safe emails by default. Especially since Faker is often used in automated test environments that may accidently spam a real person or system.
Oh I got it @Zeragamba, what if we did it the other way around though. Instead we remove safe_email
and return safe email addresses with email
and then real_world_email
would generate "unsafe" email addresses.
I'd still prefer to leave it as #email
so that we can keep the api surface clean. plus we can also allow the user to override the safe mode if needed by allowing them to specify a domain to use instead of example
or perhaps allow internet_safe: false
as needed
Oh yes, I like the idea of passing an override in. So I'd vote for: internet_safe: false
to turn safe emails off.
We would like this but/and...
We use a "dud domain" checker that would break on these emails. ie, our app prevents users from using a [email protected]
or even a [email protected]
address.
We do have an internal domain we do allow that is "real", but won't "spam anyone".
Can this feature have a "set what qualifies as a safe domain" config?
eg:
internet_safe: false
internet_safe_domains: ["myinternaldomain.com"]
These are the changes that will happen when my PR is merged
(https://docs.google.com/spreadsheets/d/1Sec5OcigkhJnlyYd8GoE8kCbs4OyoFOR1oI9fP9c6zY/edit?usp=sharing)
it introduces a few configs too
internet_safe_mode= true / false
internet_safe_mode?
internet_safe_domains= ["zzz", "xxx", "ccc"]
internet_safe_domains
what do you think? https://github.com/faker-ruby/faker/pull/2506
- edit, sorry this is a my another GitHub account, same as @saiqulhaq
I want to revisit this issue at its core before jumping to implementation ideas.
The real issue here is Faker generating real URL domain names and email addresses.
RFC2606 recommends using Reserved Top Level DNS Names as the best current practices for the internet community:
"To reduce the likelihood of conflict and confusion, a few top level domain names are reserved for use in private testing, as examples in documentation, and the like. Depending on the nature of the test or example, it might be best for it to be referencing a TLD permanently reserved for such purposes.
To safely satisfy these needs, four domain names are reserved as listed and described below.
.test
.example
.invalid
.localhost
".test" is recommended for use in testing of current or new DNS related code.
".example" is recommended for use in documentation or as examples."
See RFC2606 for more details.
So the top-level domain recommended is .test
for the generators and .example
for documentation.
The option of disabling a "safe" configuration feels odd to me. We are either safe or not. We choose to follow the security recommendations.
And although Faker::Internet.safe_email
says it generates an RFC 2606 compliant fake email, it is not entirely true because example.
is an actual address that is being used and could be delivered. .test
is the only reserved TLD for testing.
I realize this is not a popular choice, and the decision is not only mine to make, but I'd vote for only using .test
as the TLD for all URLs and email addresses on Faker. In the docs, we guide the users not to use Faker URLs or email addresses since we would breach the security recommendations.
If users need to test with real users, I recommend using Faker to generate the fake usernames to append to their domain of choice.
We will leave that security choice to them, but as a library many people use, we don't want to ignore the security best practices.
We have "fake" in the name, after all. We shouldn't be generating real URLs and email addresses.
yea agreed if we do that, then we should release a new major version for breaking change indication to faker's users
@stefannibrasil thanks for the clear explanation. I agree with what you said. And I think the solution will make faker easier to maintain too.
We just released v3 so maybe this one could be released in the beginning of next year :thinking:
We also need to add deprecation warnings to use in preparation for the breaking change.
I'd like to hear more from the other maintainers too, @vbrazo @thdaraujo @Zeragamba
I think the next steps could be adding a deprecation warning in preparation for the release of Faker 4.0.
Quick update: I'm working on adding the deprecation messages to go out on the next release.
I've opened up the first PR here, constructive feedback is welcome:
https://github.com/faker-ruby/faker/pull/2709
After it's merged, the Changelog will announce why we are moving in this direction, and hopefully it will be a smooth transition for users.
Once we give enough time for users to switch to use the safe_email and safe_url only, faker 4 will be released with this breaking change.
We came up with a new approach here: https://github.com/faker-ruby/faker/pull/2733