FiltersCompiler

Improve sorting and validation script

Open ameshkov opened this issue 8 years ago • 0 comments

@Alex-302 commented on Wed Jun 21 2017

Feature requests

  • rule validation. For example: ||example.com###advert is invalid, because || is unnecessary in element hiding rules; ||example.com^$$domain=domain.org contains two option separators; ||example.com^$domain=domain.org,domain2.com uses an invalid domain separator (it must be |). If possible, also add validation of CSS rules;
  • a black list for the files that must be processed (the list of the files that should be sorted);
  • committing changes to the repository;
  • show a request before/after sorting;
  • print all processed directories;
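
The three validation examples above could be checked roughly as follows. This is only a sketch in Python, not the script's actual code: the function name `validate_rule` and the heuristics are assumptions. In particular, `$$` also introduces content rules (such as example.org$$script[data-src="banner"]), so a doubled separator is only reported when it is followed by something that looks like a `name=` modifier. Regular-expression rules that contain `$` inside the pattern are not handled.

```python
import re


def validate_rule(rule: str) -> list[str]:
    """Return a list of problems found in a single filter rule (hypothetical helper)."""
    problems: list[str] = []
    rule = rule.strip()
    if not rule or rule.startswith("!"):
        return problems  # empty lines, comments and hints are not validated

    if "##" in rule:
        # Element hiding rule: the part before "##" is a plain domain list.
        domains = rule.split("##", 1)[0]
        if domains.startswith("||"):
            problems.append("'||' is unnecessary in element hiding rules")
        return problems

    _pattern, separator, options = rule.partition("$")
    if not separator:
        return problems  # URL rule without modifiers

    if options.startswith("$"):
        # "$$" is legitimate in content rules, so only flag it when a
        # "name=" modifier follows (heuristic, an assumption of this sketch).
        if re.match(r"\$~?[a-z-]+=", options):
            problems.append("two option separators")
        return problems

    seen_domain = False
    for option in options.split(","):
        if option.startswith("domain="):
            seen_domain = True
        elif seen_domain and "=" not in option and "." in option:
            # A bare "domain2.com" after $domain= suggests "," was used instead of "|".
            problems.append("invalid domain separator, must be '|'")
    return problems
```

For instance, `validate_rule("||example.com###advert")` reports the unnecessary `||`, while a rule whose domains are separated with `|` produces no problems.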

Currently our script has a bug in the sorting order. Sometimes the expected order is violated and the strings are simply sorted alphabetically. Expected sorting process:

  • sorting of element hiding rules in descending order;
    • sorting by CSS selector in descending order;
    • merging the rules with the same CSS selector;
    • removing duplicated domains;
  • sorting of URL rules
    • sorting the rules in descending order;
    • sorting of the rule modifiers (domain=, app=, protobuf=, and replace= must be at the end of the modifiers list);
    • sorting and merging the domains list if the rules contain only the $domain= modifier;
    • removing duplicates;
  • types of rules that must be ignored (they may be sorted as strings, but must not be changed): JS (#%#), CSS (#$#), content replacing rules ($replace), protobuf ($protobuf=), Content Security Policy ($csp), application modifier for Android and Windows ($app=)
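
The modifier ordering from the list above (domain=, app=, protobuf= and replace= at the end) might be sketched like this in Python. The name `sort_modifiers` is hypothetical, the relative order among the four trailing modifiers is an assumption (the order in which they are listed above), and the naive split on "," does not cope with modifier values that contain commas themselves, which can happen with replace=.

```python
# Modifiers that should end up at the end of the list; their relative
# order here is an assumption, taken from the order used in the request.
TRAILING_PREFIXES = ("domain=", "app=", "protobuf=", "replace=")


def sort_modifiers(options: str) -> str:
    """Sort a comma-separated modifier list, keeping the trailing modifiers last."""

    def sort_key(modifier: str) -> tuple[int, int, str]:
        for position, prefix in enumerate(TRAILING_PREFIXES):
            if modifier.startswith(prefix):
                return (1, position, modifier)
        # Every other modifier is sorted as a plain string and goes first.
        return (0, 0, modifier)

    return ",".join(sorted(options.split(","), key=sort_key))
```

With this key, `sort_modifiers("domain=otzovik.com,script")` gives `script,domain=otzovik.com`.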

Comments ! and hints !+ are separators. Rules that are between the comments should be sorted separately from the rest.

A note about hints: the rule that a hint applies to (the hint line !+ plus the rule that follows it) should not be sorted. For example:

!+ HINT
some.rule_zzz
some.rule_bbb
some.rule_aaa

The result of sorting is:

!+ HINT
some.rule_zzz
some.rule_aaa
some.rule_bbb
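
The separator behaviour shown in this example can be sketched in Python as follows. The function name `sort_sections` is an assumption, and plain string sorting stands in for the real rule ordering. The direction is left as a parameter, because the request speaks of descending order while the example output above is ascending.

```python
def sort_sections(lines: list[str], reverse: bool = False) -> list[str]:
    """Sort rules between comment lines; a rule right after a hint stays in place."""
    result: list[str] = []
    block: list[str] = []
    pinned_next = False  # True right after a "!+" hint line

    def flush() -> None:
        # Emit the rules collected since the last separator, sorted as strings.
        result.extend(sorted(block, reverse=reverse))
        block.clear()

    for line in lines:
        if line.startswith("!"):
            # Comments ("!") and hints ("!+") both end the current block.
            flush()
            result.append(line)
            pinned_next = line.startswith("!+")
        elif pinned_next:
            # The rule a hint applies to is kept directly below the hint.
            result.append(line)
            pinned_next = False
        else:
            block.append(line)
    flush()
    return result
```

Called with the four lines of the example, it keeps `some.rule_zzz` under the hint and sorts only the two remaining rules.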

@vozersky commented on Mon Aug 14 2017

We might also need to add a duplicate check to the script: https://github.com/AdguardTeam/AdguardForWindows/issues/1850#issuecomment-322088426


@Alex-302 commented on Wed Aug 09 2017

@vozersky

image


@ameshkov commented on Mon Aug 14 2017

@Alex-302 we need a better example.

For instance, give us a list with 50 rules and show how it should be sorted.


@Alex-302 commented on Mon Aug 14 2017

Before

!
||graph.facebook.com^$app=jp.gocro.smartnews.android
||graph.facebook.com^$app=com.zynga.crosswordswithfriends
||graph.facebook.com^$app=com.urbandroid.sleep
! razlozhi.ru|yandex.by|yandex.com.tr|yandex.kz|yandex.ru|yandex.ua|syl.ru
/:\/\/(otzovik.com)\/[a-zA-Z0-9-_]*==\//$script,domain=otzovik.com
/:\/\/(otzovik.com)\/[a-zA-Z0-9-_]*=\//$script,domain=otzovik.com
!
!+ PLATFORM(iOS, ext_safari)
||oops.rustorka.com^
!
://*.rumedia.ws^$empty,domain=~rumedia.ws
rustorka.com##a[href^="http://www.gearbest.com/"]
rustorka.com##div[align="center"] > a > img
/(rustorka.com\/forum\/misc\/js\/(?!ifix)(?!ajax)(?!main)(?!fancybox)(?!scrolltopcontrol)(?!jquery)(?!ct)(?!bbcode))/$domain=rustorka.com
@@||rustorka.com/forum/shoutbox_view.php
!+ PLATFORM(ext_ff, ext_opera, ext_ublock)
rustorka.com##.post_body > div[align="center"] > a > img
!
myfin1.by##.menu_level_1 > li.left_bt
myfin2.by##.menu_level_1 > li.left_bt
myfin3.by##.menu_level_1 > li.left_bt
myfin4.by##.menu_level_1 > li.left_bt
bgoperator.ru##.l-index__side_a > canvas[width="188"][height="312"]
razlozhi.ru##._23 > div._68._36
tourister.ru##body > div[onclick="TourWindowCClose();"]
torrent-games.net###ubm_div
inosmi.ru##.banner
||r.mradx.net/img/D2/C79060.mp4
||r.mradx.net/img/20/9D02B4.ogg
esports.mail.ru##.rb-video-widget
esports.mail.ru##a.b-bg[target="_blank"]
seedoff.cc##div[id*="Composite"]
||r.mradx.net/img/D2/C79060.mp4
||r.mradx.net/img/20/9D02B4.ogg
esports.mail.ru##.rb-video-widget
esports.mail.ru##a.b-bg[target="_blank"]
seedoff.cc##div[id*="Composite"]
||animespirit.ru/banners/
fotokto.ru###pageContainer > div[class="m2"]
||simpsonsua.com.ua/photos/baner-reklama/
||cy-pr.com/images/stat/tr.png
miytvideo.ru##body > noindex
goldenshara.net##center > a[href="/goldenshara.net.php?url"] > img
zaycev.online##div[data-cookie="mainBanner"]
||bazr.ru/vast?$xmlhttprequest
||image.winudf.com/*/upload/promopure/$domain=apkpure.com
24auto.ru##.brrr_place
mp3cc.com###headerBanner
mp3cc.com##.playlist-ad
mp3cc.com###footerBanner
anistar.me##body > div[class^="heade-"]
chatovod.ru##.chatlist > tbody > tr[class="bold"]:not([class^="chatitem"])
||chatovod.ru/i/promo2/workle.png

After

!
||graph.facebook.com^$app=jp.gocro.smartnews.android|com.zynga.crosswordswithfriends|com.urbandroid.sleep
! razlozhi.ru|yandex.by|yandex.com.tr|yandex.kz|yandex.ru|yandex.ua|syl.ru
/:\/\/(otzovik.com)\/[a-zA-Z0-9-_]*==\//$script,domain=otzovik.com
/:\/\/(otzovik.com)\/[a-zA-Z0-9-_]*=\//$script,domain=otzovik.com
!
!+ PLATFORM(iOS, ext_safari)
||oops.rustorka.com^
!
/(rustorka.com\/forum\/misc\/js\/(?!ifix)(?!ajax)(?!main)(?!fancybox)(?!scrolltopcontrol)(?!jquery)(?!ct)(?!bbcode))/$domain=rustorka.com
://*.rumedia.ws^$empty,domain=~rumedia.ws
@@||rustorka.com/forum/shoutbox_view.php
rustorka.com##a[href^="http://www.gearbest.com/"]
rustorka.com##div[align="center"] > a > img
!+ PLATFORM(ext_ff, ext_opera, ext_ublock)
rustorka.com##.post_body > div[align="center"] > a > img
!
mp3cc.com###footerBanner
mp3cc.com###headerBanner
fotokto.ru###pageContainer > div[class="m2"]
torrent-games.net###ubm_div
razlozhi.ru##._23 > div._68._36
inosmi.ru##.banner
24auto.ru##.brrr_place
chatovod.ru##.chatlist > tbody > tr[class="bold"]:not([class^="chatitem"])
bgoperator.ru##.l-index__side_a > canvas[width="188"][height="312"]
myfin1.by,myfin2.by,myfin3.by,myfin4.by##.menu_level_1 > li.left_bt
mp3cc.com##.playlist-ad
esports.mail.ru##.rb-video-widget
esports.mail.ru##a.b-bg[target="_blank"]
anistar.me##body > div[class^="heade-"]
tourister.ru##body > div[onclick="TourWindowCClose();"]
miytvideo.ru##body > noindex
goldenshara.net##center > a[href="/goldenshara.net.php?url"] > img
zaycev.online##div[data-cookie="mainBanner"]
seedoff.cc##div[id*="Composite"]
||animespirit.ru/banners/
||bazr.ru/vast?$xmlhttprequest
||chatovod.ru/i/promo2/workle.png
||cy-pr.com/images/stat/tr.png
||image.winudf.com/*/upload/promopure/$domain=apkpure.com
||r.mradx.net/img/20/9D02B4.ogg
||r.mradx.net/img/D2/C79060.mp4
||simpsonsua.com.ua/photos/baner-reklama/
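
The merge of the four myfin*.by rules and the removal of the duplicated esports.mail.ru rules in the example above could be done along these lines. This is a rough Python sketch: `merge_by_selector` is a hypothetical name, only rules with the ## marker and a non-empty domain list are merged, and everything else is passed through unchanged. Ordering the result is a separate step and is not shown here.

```python
def merge_by_selector(rules: list[str]) -> list[str]:
    """Merge element hiding rules that share a CSS selector and drop duplicate domains."""
    merged: dict[str, list[str]] = {}  # selector -> domains in first-seen order
    untouched: list[str] = []          # rules this sketch does not try to merge

    for rule in rules:
        domains, marker, selector = rule.partition("##")
        if not marker or not domains:
            # URL rules, JS/CSS rules and generic "##selector" rules are left alone.
            untouched.append(rule)
            continue
        bucket = merged.setdefault(selector, [])
        for domain in domains.split(","):
            if domain not in bucket:
                bucket.append(domain)

    return [",".join(domains) + "##" + selector
            for selector, domains in merged.items()] + untouched
```

Because a dictionary keeps insertion order, the merged rules come out in the order in which each selector was first seen.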

@Alex-302 commented on Tue Aug 22 2017

Feature request:

add support for a blacklist file containing a list of dead domains. These domains must be removed from the rules, and rules that contain only one such domain must be removed as well.
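
One possible reading of this request, sketched in Python for element hiding rules only. The name `strip_dead_domains` and the handling of negated (~) domains are assumptions: negated entries are kept as they are, and a rule is dropped when its last non-negated domain is dead, so that it does not silently turn into a more generic rule.

```python
from typing import Optional


def strip_dead_domains(rule: str, dead: set[str]) -> Optional[str]:
    """Remove dead domains from an element hiding rule; return None to drop the rule."""
    domains, marker, selector = rule.partition("##")
    if not marker or not domains:
        return rule  # not a domain-specific element hiding rule, leave it alone

    entries = domains.split(",")
    kept = [d for d in entries if d.startswith("~") or d not in dead]

    had_positive = any(not d.startswith("~") for d in entries)
    has_positive = any(not d.startswith("~") for d in kept)
    if had_positive and not has_positive:
        # Every domain the rule was written for is dead: drop the whole rule.
        return None
    return ",".join(kept) + "##" + selector
```

A rule such as `a.com,dead.com##.ad` would become `a.com##.ad`, while `dead.com##.ad` would be removed entirely.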


@Mizzick commented on Mon Oct 30 2017

@Alex-302 pls give us more details about content rules? example.org$$script[data-src="banner"]


@Alex-302 commented on Mon Oct 30 2017

@Mizzick What exactly? Please take a look at https://kb.adguard.com/en/general/how-to-create-your-own-ad-filters#html-filtering-rules-1

ameshkov avatar Feb 14 '18 17:02 ameshkov