Privacy: ClearURLs
Prerequisites
- [X] I checked the documentation and understood it;
- [X] I checked to make sure that this issue has not already been filed;
Problem description
The ClearURLs database might be transformed into a powerful privacy-enhancing filterlist &/or userscript.
Proposed solution
The specs @ https://docs.clearurls.xyz/latest/specs/rules/ would be essential for transforming this into something end-usable.
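For context, the rule shape that spec defines pairs each provider's `urlPattern` regex with `rules`, which are regexes over tracking-parameter names. A minimal illustration of that shape (the pattern and parameter names below are invented, not taken from the actual database):

```python
# Illustrative provider entry in the shape the ClearURLs spec describes.
# The pattern and parameter names are made up for this example.
provider = {
    "urlPattern": r"^https?://(?:[a-z0-9-]+\.)*example\.com",
    "rules": ["utm_source", "utm_medium", "fbclid"],
}

# A converter could emit one AdGuard/uBO filter per rule, e.g.
# ||example.com^$removeparam=utm_source
```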
Additional information
Originally found via https://github.com/svenjacobs/leon/discussions/315#discussioncomment-9809441, where several interrelated projects are discussing how to incorporate this database themselves.
Alpha/Beta:
- https://github.com/DandelionSprout/adfilt/blob/master/ClearURLs%20for%20uBo/clear_urls_uboified.txt
- https://raw.githubusercontent.com/DandelionSprout/adfilt/master/ClearURLs%20for%20uBo/clear_urls_uboified.txt
Definitely also see https://github.com/DandelionSprout/adfilt/discussions/163
Added years ago: https://github.com/AdguardTeam/FiltersRegistry/tree/master/filters/ThirdParty/filter_251_LegitimateURLShortener - https://github.com/AdguardTeam/FiltersRegistry/commit/65694c61d5fc8ea98782285edc14291c80d8c73a (https://github.com/AdguardTeam/FiltersRegistry/issues/401)
I do use LUS, but am hoping to improve coverage for these trackers.
Not identical, ~~but now I think LUS is a derivative of ClearURLs (& probably other sources), so maybe this is a duplicate in some sense?~~ If you'd comment on the relationship between the 2, @DandelionSprout, it'd help.
Conflict of interest disclaimer: I am the assistant maintainer of the Actually Legitimate URL Shortener Tool, and current maintainer of the ClearURLs for uBo list (I did not create the original ClearURLs for uBo list; credit for that goes to rustysnake)
> DandelionSprout's LUS is a derivative of ClearURLs (& probably other sources)
It is not. While a few filters have been copied from elsewhere (with credit), most have been manually added based either on user reports or on tracking parameters Imre (and I) found. Thank you
@iam-py-test Thanks very much for answering. 🙇🏾‍♂️ Could you comment on how different the contents of the 2 lists are from each other?
The Actually Legitimate URL Shortener, as described, is a variety of rules manually added by Imre (DandelionSprout) and me. ClearURLs for uBo uses a Python script to convert the ClearURLs rules into a filterlist for uBlock Origin and AdGuard (basically what you requested here). There are a few modifications to remove problematic rules, but largely it's just the ClearURLs rules. Thanks
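For anyone curious how such a conversion can work in principle, here is a minimal sketch. To be clear, this is not the actual ClearURLs-for-uBo script: the rules URL and JSON layout are assumptions based on the published ClearURLs data, and it skips `urlPattern` scoping as well as the removal of problematic rules described above.

```python
import json
import re
import urllib.request

# Assumed location of the published ClearURLs rules database.
RULES_URL = "https://rules2.clearurls.xyz/data.minify.json"

with urllib.request.urlopen(RULES_URL) as resp:
    data = json.load(resp)

filters = set()
for provider in data.get("providers", {}).values():
    for rule in provider.get("rules", []):
        # ClearURLs rules are regexes over parameter names. Plain names
        # map to removeparam directly; anything else needs the /regex/
        # form. Without urlPattern scoping, every filter here is global.
        if re.fullmatch(r"[A-Za-z0-9_-]+", rule):
            filters.add(f"$removeparam={rule}")
        else:
            filters.add(f"$removeparam=/{rule}/")

print("\n".join(sorted(filters)))
```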
In theory, I could potentially have attempted to merge relevant entries from ClearURLs into LUS, which I can only presume would be a win-win for most parties.
@DandelionSprout 🙇🏾‍♂️ Actually, if the contents are that different, it'd make sense to keep them separate, & offer each as AG options to supplement each other & AG's other Privacy filterlists. OTOH, if the included rules overlap significantly, then it would make sense to use 1 as another source for the other, to keep down duplication.
So, I ran a comparison this morning of whether ClearURLs had any coverage that LUS didn't. I decided to test with Amazon, a high-coverage site in both lists.
LUS had well above 80 entries for Amazon (70 of them being specific entries). Only 2 entries that made sense (i.e. not ones like `keywords` or `_encoding`) were in ClearURLs but not in LUS.
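For what it's worth, a comparison along these lines can be reproduced with a few lines of Python. The filenames below are hypothetical local copies of the two lists, and matching on a substring only catches entries that mention the domain, so global rules are missed:

```python
import re

def removeparams(path: str, needle: str) -> set[str]:
    """Collect removeparam values from filter lines mentioning needle."""
    params = set()
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            # Skip adblock comments; grab the removeparam value if present.
            if needle in line and not line.startswith("!"):
                m = re.search(r"removeparam=([^,\s]+)", line)
                if m:
                    params.add(m.group(1))
    return params

lus = removeparams("LegitimateURLShortener.txt", "amazon")
clearurls = removeparams("clear_urls_uboified.txt", "amazon")
print("In ClearURLs but not in LUS:", sorted(clearurls - lus))
```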
Although I do have conflicts of interest in the matter, I'd say that at this point ClearURLs has been obliterated in comparison. I give iam-py-test full 100% rights to make the calls on the following, with no interference from me, but I personally am getting unsure if a ClearURLs list conversion would be considered necessary nowadays. 😓
That's a reasonable methodology. Possible to be more comprehensive over domain variety, like this is for TLD variety? I have a hunch that far-less-well-known sites than Amazon may have wider coverage in ClearURLs.
> Possible to be more comprehensive over domain variety, like https://github.com/StevenBlack/hosts/issues/1181#issuecomment-608229213?
Given that both lists have many global rules (applying to all websites), measuring such coverage would be difficult.
It is definitely worth testing which exception rules deactivate the global removeparam (AdGuard only):
> `$removeparam` rules can also be disabled by `$document` and `$urlblock` exception rules. But basic exception rules without modifiers do not do that. For example, `@@||example.com^` will not disable `$removeparam=p` for requests to example.com, but `@@||example.com^$urlblock` will.
In that case, a userscript ("user.js") with an API to edit parameters would probably work better for such locked-down ranges.
https://adguard.com/kb/general/ad-filtering/create-own-filters/#urlblock-modifier
Hi! According to our rules, it should be a filter oriented towards browser content blockers, as the Legitimate URL Shortener mentioned here is.
Since one list currently pulls in 100% of the other's rules, the situation is already a bit like that (I have not checked how it is done, for example, whether duplicates are reduced on the script side before the list update is published).
The only thing that worries me is something like the mode that deactivates cosmetic filters on HTTPS-sensitive sites: with rules, we can likewise deactivate parameter removal completely here.