My New Filterlist

Open thedoggybrad opened this issue 2 years ago • 10 comments

Raw: https://raw.githubusercontent.com/thedoggybrad/supersecurityfilterlist/main/list.txt

Github: https://github.com/thedoggybrad/supersecurityfilterlist

thedoggybrad avatar Apr 21 '23 08:04 thedoggybrad

I have a few questions (not affiliated with this project, just curious). Where do you get the entries for this list? The README mentions Phishing Domain Database and The Big List of Hacked Malware Web Sites. Also, there are a lot of duplicated entries: [image] Thanks!

iam-py-test avatar Apr 21 '23 10:04 iam-py-test

@iam-py-test, what tool did you use to find those duplicates? Just curious.

collinbarrett avatar Apr 21 '23 12:04 collinbarrett

Visible in uBO

[image]

gwarser avatar Apr 21 '23 16:04 gwarser

iam-py-test, what tool did you use to find those duplicates? Just curious.

I used https://abpvn.com/ruleChecker/redundantRuleChecker.html (DandelionSprout recommends it in the adfilt README, that's how I found it), but @gwarser's method works too (though this shows the specific redundant rules). I am working on a PR to remove some of the redundant rules, but there are too many to do by hand and my Python script keeps wanting to change the line endings from CRLF to LF, which makes the diff show I changed every single line.
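For the line-ending problem described above, one option is to do the removal with a tool that never rewrites the surviving lines. The following is a minimal sketch in Bash and awk, not the Python script mentioned in this comment. The function name `remove_redundant` and both file arguments are hypothetical. It assumes the redundant rules are stored one per line and that the script runs on a Unix-like system, where awk treats the carriage return as an ordinary character.

```shell
#!/usr/bin/env bash
# Sketch: drop the rules listed in a "redundant" file from a filter list
# without rewriting the line endings of the rules that remain.
# remove_redundant and its two file arguments are hypothetical names.
remove_redundant() {
    local list="$1" redundant="$2"
    awk '
        # First file: remember every redundant rule, ignoring a trailing CR.
        FILENAME == ARGV[1] {
            key = $0
            sub(/\r$/, "", key)
            if (key != "") drop[key] = 1
            next
        }
        # Second file: compare without the CR, but print the original line,
        # so a CRLF ending stays CRLF and an LF ending stays LF.
        {
            key = $0
            sub(/\r$/, "", key)
            if (!(key in drop)) print
        }
    ' "$redundant" "$list"
}
```

A call such as `remove_redundant list.txt redundant_entries.txt > list.cleaned.txt` should then produce a diff that contains only the removed rules. In Python itself, opening the file with `newline=''` for both reading and writing disables newline translation, which is the usual cause of this kind of whole-file diff.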

iam-py-test avatar Apr 21 '23 20:04 iam-py-test

I recently had to deal with this issue on my own blocklist. Here is a snippet of code in Bash to find redundant entries:

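# Start from an empty output file so reruns do not accumulate results
: > redundant_entries.txt

while read -r entry; do
    domain="${entry#||}"    # strip the leading ||
    # Escape the dots so they match literally, then look for subdomain entries
    grep "\.${domain//./\\.}\$" adblock.txt >> redundant_entries.txt
done < adblock.txt

# The output has a high chance of having duplicates
sort -u redundant_entries.txt -o redundant_entries.txt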

This assumes your list only has entries in the form of ||example.com^. The code loops through each entry and converts it into a pattern to be matched by grep. grep then looks for other entries that are subdomains (of any level) of the current entry. Because grep rescans the whole file once per entry, the process is slow: about 45 seconds for my 2300-rule ABP list.

I'm going to feed the redundant entries file into my list building script so it ignores the entries in the file.
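The same check can also be done in a single pass, which avoids running grep once per entry. The following is a sketch under the same assumption as above (every rule has the form ||example.com^), not the script actually used for the list. The function name `find_redundant` is hypothetical. It loads all domains into an awk array, then strips labels from the left of each domain and looks up each resulting parent.

```shell
#!/usr/bin/env bash
# Sketch: print entries that are covered by a parent-domain entry in the
# same file. Assumes rules of the form ||example.com^ (a trailing CR is
# tolerated). find_redundant is a hypothetical name.
find_redundant() {
    local list="$1"
    awk '
        {
            sub(/\r$/, "")                 # tolerate CRLF line endings
            if ($0 !~ /^\|\|.*\^$/) next   # skip anything that is not ||domain^
            domain = substr($0, 3, length($0) - 3)
            if (!(domain in seen)) { seen[domain] = 1; order[++n] = domain }
        }
        END {
            for (i = 1; i <= n; i++) {
                parent = order[i]
                # Drop the leftmost label repeatedly and look up each parent.
                while ((dot = index(parent, ".")) > 0) {
                    parent = substr(parent, dot + 1)
                    if (parent in seen) { print "||" order[i] "^"; break }
                }
            }
        }
    ' "$list"
}
```

Running `find_redundant adblock.txt > redundant_entries.txt` writes each redundant entry once, in file order, so no `sort -u` step is needed. Exact duplicate lines are not reported, since only subdomain relationships are checked.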

jarelllama avatar Apr 26 '23 11:04 jarelllama

I had not checked for duplicates. I will fix them.

thedoggybrad avatar May 21 '23 22:05 thedoggybrad

I have a few questions (not affiliated with this project, just curious). Where do you get the entries for this list? The README mentions Phishing Domain Database and The Big List of Hacked Malware Web Sites. Also, there are a lot of duplicated entries: [image] Thanks!

What you said is right. I just compiled them.

thedoggybrad avatar May 21 '23 22:05 thedoggybrad

Also, one small comment on the README. IMO "uBlock" is garbage and shouldn't be recommended as an option to use this list with; it was unmaintained for years, then recently removed its code from GitHub and started pushing updates again. The developer(s) have done shady stuff in the past (tracking users, stealing code), and it doesn't even have a functional options page, so it's not even possible to install any non-default lists in it: [image] It's also blocked as malicious by several blocklists, including uBO's default Badware risks list.

iam-py-test avatar May 21 '23 23:05 iam-py-test

@iam-py-test Thanks for that, I am removing it ASAP from the READMEs of all my filterlists. (Update: Successfully removed from the READMEs of all my filterlists.)

By the way, the duplicated filters are fixed.

thedoggybrad avatar May 22 '23 01:05 thedoggybrad

@iam-py-test Thanks for making me aware of what is happening with uBlock now. It used to look almost the same as uBlock Origin. As far as I know, uBlock is the original one, but due to conflicts between the two repository owners, the original owner created uBlock Origin. I had previously read recommendations in uBlock Origin's own filterlist repository (in its issues) suggesting not to use uBlock. I was surprised to learn that the GitHub code for uBlock has been removed, and I immediately went to check for myself. I am not actually a fan of uBlock either.

By the way, I am using uBlock Origin on my web browsers, so I am definitely not testing my filterlists on other adblockers.

thedoggybrad avatar May 22 '23 01:05 thedoggybrad