Properly handle domains ending with "`.`"
This arises from...
- #2455
We should
- [ ] Add domains that end in "`.`" to tests
- [ ] Ignore domains that end in "`.`"
Could we strip the `.` off the end?
`example.com.` and `example.com` are the same, right?
Thanks
@iam-py-test my feeling is that if a domain ends in `.`, as in any of the following, we should reject the domain.
Otherwise we risk Type I errors (false positives) with sentence text and other potential garbage in source hosts files.
# reject the domain
example.com.
# reject the domain
127.0.0.1 example.com.
# reject the domain, allowing the other 3
127.0.0.1 example.com. a.com b.com c.com
127.0.0.1 a.com b.com example.com. c.com
127.0.0.1 a.com b.com c.com example.com.
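For illustration, here is a minimal Python sketch of the rule shown in the examples above. The function name `filter_hosts_line` and the list of recognized target IPs are assumptions made for this sketch, not the project's actual code.

```python
def filter_hosts_line(line):
    """Return (kept, rejected) hostnames from a single hosts-file line."""
    # Drop any trailing comment and surrounding whitespace.
    content = line.split("#", 1)[0].strip()
    if not content:
        return [], []

    tokens = content.split()
    # Assumption: a leading 0.0.0.0 or 127.0.0.1 is the target IP, not a host.
    if tokens[0] in ("0.0.0.0", "127.0.0.1"):
        tokens = tokens[1:]

    # Reject only the tokens that end in "."; keep the others on the line.
    kept = [t for t in tokens if not t.endswith(".")]
    rejected = [t for t in tokens if t.endswith(".")]
    return kept, rejected
```

With this approach a line such as `127.0.0.1 a.com b.com example.com. c.com` keeps `a.com`, `b.com` and `c.com` and rejects only `example.com.`.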
Ah, I didn't think of that. Thanks
I used to wonder about this myself: https://github.com/FiltersHeroes/KADhosts/issues/28#issuecomment-607294633
It is best to check the RFCs, because such a trailing dot can cause the page not to be blocked. For example, AdBlock and Adblock Plus have not known how to handle this for years: you have to duplicate entries (for cosmetic rules that hide sections of a page) and write separate regexes (as the French list does, for example, to block ads at the network level).
Handling multiple domains on one line should not be difficult: you would look for a dot followed by a whitespace character and then validate the public suffix (to minimize the chance that something has the wrong "TLD").
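A rough Python sketch of that idea follows. Both helper names are hypothetical, and `KNOWN_SUFFIXES` is a tiny stand-in used only for illustration; a real check would presumably consult the full Public Suffix List.

```python
import re

# Assumption: a tiny stand-in for the Public Suffix List, for illustration only.
KNOWN_SUFFIXES = {"com", "org", "net", "co.uk"}

# A dot immediately followed by whitespace or end of line.
TRAILING_DOT = re.compile(r"\.(?=\s|$)")


def has_trailing_dot(line):
    """True if any token on the line ends with a dot."""
    return TRAILING_DOT.search(line) is not None


def has_known_suffix(domain):
    """True if the domain has at least one label in front of a known suffix."""
    labels = domain.lower().split(".")
    # Try each candidate suffix, longest first, e.g. "a.b.co.uk" tries
    # "b.co.uk", then "co.uk", then "uk".
    for i in range(1, len(labels)):
        if ".".join(labels[i:]) in KNOWN_SUFFIXES:
            return True
    return False
```

Here a name like `example.invalid` fails the suffix check, and a sentence fragment ending in a period is flagged by `has_trailing_dot`.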