adblock-rust
adblock-rust copied to clipboard
When source_url is missing, positive plus negative domain constraints would make the rule to always match
Calling check_network_urls
with an empty source_url
causes domain constrained rules, that has both negative plus positive, to match no matter what.
The problem is in check_options
function in network.rs
. The following code:
if let Some(included_domains) = filter.opt_domains.as_ref() {
if let Some(source_hashes) = request.source_hostname_hashes.as_ref() {
// If the union of included domains is recorded
if let Some(included_domains_union) = filter.opt_domains_union {
// If there isn't any source hash that matches the union, there's no match at all
if source_hashes.iter().all(|h| h & included_domains_union != *h) {
return false
}
}
if source_hashes.iter().all(|h| !utils::bin_lookup(&included_domains, *h)) {
return false
}
}
}
if let Some(excluded_domains) = filter.opt_not_domains.as_ref() {
if let Some(source_hashes) = request.source_hostname_hashes.as_ref() {
// If the union of excluded domains is recorded
if let Some(excluded_domains_union) = filter.opt_not_domains_union {
// If there's any source hash that matches the union, check the actual values
if source_hashes.iter().any(|h| (h & excluded_domains_union == *h) && utils::bin_lookup(&excluded_domains, *h)) {
return false
}
} else if source_hashes.iter().any(|h| utils::bin_lookup(&excluded_domains, *h)) {
return false
}
}
}
Imagine the following use case
|http://$image,domain=domain.com|~sub.domain.com
This rule should match on images, coming from the domain domain.com
with an exclusion for sub.domain.com
.
The following check should pass, but it will block:
adblocker.check_network_urls(
url="http://some-website.com/image.jpg",
source_url="",
request_type="image",
)
The solution should be and exclusion in the if statement. If there are domains inside filter.opt_domains
and there is no request.source_hostname_hashes
, it should never match.
BTW, after fixing this bug I found another bug. Fixing this bug made check_ws_vs_http_matching
failing.
The test looks like this:
let network_filter = NetworkFilter::parse("|ws://$domain=4shared.com", true).unwrap();
assert!(network_filter.matches(&request::Request::from_urls("ws://example.com", "4shared.com", "websocket").unwrap()));
Changing "4shard.com"
to http://4shared.com
fixed the problem. Now there is a question here. Does a WebSocket send an http referrer in a URL format or it is just a domain. If it will always send a URL, we should fix the tests to a URL rather than a domain. If there is a possibility they send domains, there is a problem with parsing the source_url from the domain (which is reasonable), and we need to fix that. Either way, I think we should add negative tests there to show that there is a bug right now.
As a piece of evidence, the following code passes.
let network_filter = NetworkFilter::parse("|ws://$domain=4shared.com", true).unwrap();
assert!(network_filter.matches(&request::Request::from_urls("ws://example.com", "3shared.com", "websocket").unwrap()));
It raises another question. Now that I think about it, why do we even compare it against the source_url
and not against the URL of the request? When the request_url
is an absolute URL, the comparison should be against the URL and not against the source URL.
Good catches, That check_ws_vs_http_matching
test is definitely wrong, those need to be full URLs.
If I understand your last question correctly, we need source_url
because there are some rules which are applied differently depending on the page the request is made from. For example, ||a.com$domain=b.com
will only block requests to a.com
made from b.com
. It's also taken into account for $first-party
and $third-party
options.
Since check_network_urls
really depends on source_url
to make a correct decision, there should realistically be an error returned if it's not supplied, or doesn't parse correctly.
I'm not actually that familiar with WebSocket referrers, but it shouldn't matter - source_url
should just be the top-level URL of the page from which the WebSocket was initiated.
If you need any help with fixing the bug, I'd be happy to help.