ABP style adlists

Open dowden20 opened this issue 1 year ago • 4 comments

The report currently does not show the numbers for ABP-style adlists. Please consider including a calculation for ABP-style adlists.

Thank you

  [i]  Adlist coverage

id   enabled  total_domains  domains_covered  hits_covered  unique_domains_covered  address
---  -------  -------------  ---------------  ------------  ----------------------  ---------------------------------------------------------------------------------
413  1        0                                                                     https://big.oisd.nl/
469  1        0                                                                     https://raw.githubusercontent.com/hagezi/dns-blocklists/main/adblock/multi.txt
470  1        0                                                                     https://raw.githubusercontent.com/hagezi/dns-blocklists/main/adblock/tif.txt

dowden20, Jun 12 '23 15:06

You discovered one issue (total_domains being empty): https://github.com/pi-hole/FTL/issues/1573

The rest is a feature request, and I'm not sure there is a feasible way to solve it: ABP-style domains are handled as a special kind of RegEx within FTL, and I'm not sure there is a good way to handle them within bash. Even if I find a way to treat them as bash RegEx, checking every queried domain against all adlist entries will be painfully slow on lists like https://big.oisd.nl/. (This is the reason why RegEx checking is not enabled by default.)
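
To make the cost concrete, here is a minimal sketch of the naive check described above, written in POSIX shell. It is not how FTL or the adlist tool works. It assumes that an entry of the form `||domain^` is meant to cover the domain itself and its subdomains, and the helper name `abp_matches` and the sample entries and queries are hypothetical.

```shell
# Hypothetical helper: succeed if queried domain $1 is covered by ABP entry $2.
# Assumption: "||domain^" covers the domain itself and any subdomain.
abp_matches() {
  _base=${2#"||"}      # strip the leading "||"
  _base=${_base%"^"}   # strip the trailing "^"
  case "$1" in
    "$_base" | *".$_base") return 0 ;;
    *) return 1 ;;
  esac
}

# Made-up sample data; a real adlist can hold hundreds of thousands of entries.
entries='||example.com^ ||ads.example.net^'
queries='example.com cdn.example.com notexample.com example.org'

# Naive coverage count: every query is compared against every entry,
# so the work grows with (number of queries) x (number of entries).
covered=0
for q in $queries; do
  for e in $entries; do
    if abp_matches "$q" "$e"; then
      covered=$((covered + 1))
      break
    fi
  done
done
echo "covered queries: $covered"
```

With the sample data, only `example.com` and `cdn.example.com` are counted. The nested loop is the part that becomes slow on a large list, which is the concern raised above.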

yubiuser, Jun 12 '23 18:06

Hi! I would also find it very helpful to use ABP-style lists with Pi-hole. The list I'm looking at is effectively just a list of domains, but in ABP format: https://v.firebog.net/hosts/Admiral.txt

Examples:

||2znp09oa.com^
||2znp09oa.com^
||35.186.219.42^
||35.186.249.84^
||35.190.48.184^
||35.190.58.50^
||35.190.62.199^

I was able to get the domains recognized by adding:

    -e 's/^\|\|(.*)\^$/\1/' \

To this sed command (currently around line 658 in /opt/pihole/gravity.sh):

  # 2) Remove carriage returns
  # 3) Remove lines starting with ! (ABP Comments)
  # 4) Remove lines starting with [ (ABP Header)
  # 5) Remove lines containing ABP extended CSS selectors ("##", "#!#", "#@#", "#?#") preceded by a letter
  # 6) Remove comments (text starting with "#", include possible spaces before the hash sign)
  # 7) Remove leading tabs, spaces, etc. (Also removes leading IP addresses)
  # 8) Convert from ABP format: ||some.domain.here^ --> some.domain.here
  # 9) Remove empty lines

    sed -i -r \
    -e 's/\r$//' \
    -e 's/\s*!.*//g' \
    -e 's/\s*\[.*//g' \
    -e '/[a-z]\#[$?@]{0,1}\#/d' \
    -e 's/\s*#.*//g' \
    -e 's/^.*\s+//g' \
    -e 's/^\|\|(.*)\^$/\1/' \
    -e '/^$/d' "${destination}"

(Note the above snippet has the new expression and a matching comment.)
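
For anyone who wants to try the expression in isolation, here is a small sketch that runs only the new substitution against a few sample lines. The helper name `strip_abp` and the last two sample lines are made up for illustration, and `-E` is used in place of `-r` on the assumption that the installed sed accepts it (GNU sed and BSD sed both do).

```shell
# Hypothetical helper: apply only the proposed ABP-to-domain expression to stdin.
strip_abp() {
  sed -E 's/^\|\|(.*)\^$/\1/'
}

# Two entries from the list above, one plain domain, and one ABP rule
# with an option suffix that the expression is not meant to handle.
sample='||2znp09oa.com^
||35.186.219.42^
example.org
||ads.example.com^$third-party'

result="$(printf '%s\n' "$sample" | strip_abp)"
printf '%s\n' "$result"
# Prints:
#   2znp09oa.com
#   35.186.219.42
#   example.org
#   ||ads.example.com^$third-party
```

Lines that are exactly `||something^` are unwrapped, plain domains pass through untouched, and anything with a trailing option is left as-is, which matches the limitation noted below.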

This doesn't solve the full problem of handling fancy ABP patterns, but it might be worth adding to take advantage of the many hosts- / domains-only lists out there.

Happy to open a PR for this, but honestly, it took me long enough to even find your GitHub org, and I still haven't figured out exactly where gravity.sh lives in your various repos...

Current versions:

  • Pi-hole v5.17.1
  • FTL v5.23
  • Web Interface v5.20.1

Thanks again!

tkil, Jun 20 '23 04:06

@tkil

I'm not sure what you are trying to achieve with this RegEx, or whether it is meant to improve gravity.sh within Pi-hole or my adlist tool. You can find gravity.sh here: https://github.com/pi-hole/pi-hole/blob/master/gravity.sh

yubiuser, Jun 20 '23 07:06

@yubiuser Ah, I might have misfired -- Sorry for the noise. I'll make this suggestion over in the pi-hole repo.

Thanks for the redirect!

tkil, Jun 20 '23 18:06