Add Custom Rules

Open KibaNoOu opened this issue 2 years ago • 14 comments

Hi, would it be possible to add custom rules? Sometimes the app doesn't clean residual URL parameters, like the ones after a "?".

KibaNoOu avatar Sep 20 '22 18:09 KibaNoOu

Hi @KibaNoOu,

do you have an example of a URL which is not properly cleaned, please?

Also, what do you expect from custom rules? Is it okay to just enter exact parameter names to be removed, or do you require something more sophisticated like regular expressions?

svenjacobs avatar Sep 21 '22 05:09 svenjacobs

Hi Sven, Here's an example URL: https://www.ebay.it/itm/175311733713?mkcid=16&mkevt=1&mkrid=711-127632-2357-0&ssspo=k-vapwf5rtw&sssrc=2349624&ssuid=n5H50APCTfe&var=&widget_ver=artemis&media=MORE

And it would be great to have both exact parameter names and regular expressions.

Keep up the good work!

KibaNoOu avatar Sep 23 '22 10:09 KibaNoOu

In this case I could add a specific sanitizer for eBay links if you tell me which parameters can be safely removed.

Regarding custom rules and regular expressions: regular expressions are very powerful. Allowing users to specify regex rules could potentially break the functionality of the application if there is an error in the expression. I need to think about how to deal with this possibility.
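
One way to picture the two rule types and the concern about faulty expressions is the sketch below. It is only an illustration in Python, not how Léon implements sanitizers: the names `compile_rule` and `remove_params` are hypothetical, and it assumes a rule matches against the parameter name only. An invalid pattern is rejected when it is compiled, so it never reaches the URL-cleaning step.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def compile_rule(pattern):
    """Return a compiled regex, or None if the user-supplied pattern is invalid."""
    try:
        return re.compile(pattern)
    except re.error:
        return None


def remove_params(url, exact=(), patterns=()):
    """Drop query parameters whose name is in `exact` or fully matches a pattern.

    Hypothetical helper: invalid patterns are skipped instead of raising.
    """
    compiled = [c for c in map(compile_rule, patterns) if c is not None]
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in exact and not any(c.fullmatch(name) for c in compiled)
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

With `exact={"media"}` and `patterns=[r"mk\w+", "("]`, the broken pattern `(` is ignored, while `media` and every parameter starting with `mk` are removed.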

svenjacobs avatar Sep 23 '22 12:09 svenjacobs

For the eBay sanitizer, everything after the first ? should be discarded.

So from this: https://www.ebay.it/itm/175311733713?mkcid=16&mkevt=1&mkrid=711-127632-2357-0&ssspo=k-vapwf5rtw&sssrc=2349624&ssuid=n5H50APCTfe&var=&widget_ver=artemis&media=MORE

To This: https://www.ebay.it/itm/175311733713
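
The transformation above could be sketched roughly like this in Python. This is just an illustration of "discard everything after the first ?", not the app's actual code; `strip_query` is a made-up name, and this version also drops any `#fragment`, which is an assumption.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url):
    """Keep scheme, host and path; discard the query string and fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
```

Calling `strip_query` on the long eBay link above returns the short `https://www.ebay.it/itm/175311733713` form.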

Regarding the regex, you could hide it behind an advanced option so that only those who really want the feature enable it, with all the warnings of course.

KibaNoOu avatar Sep 23 '22 13:09 KibaNoOu

The eBay sanitizer is available in version 1.2.0.

svenjacobs avatar Oct 03 '22 14:10 svenjacobs

I'm interested in a "custom rules" feature too :)

Details:

  • I use Android + Rethink DNS
  • sometimes we come across URLs like https://example.tracker.com/?arg1=example1&url=www.google.com&arg3=example2
  • in this context:
    • Rethink DNS will block example.tracker.com
    • I have to manually extract the url www.google.com to be able to reach my website
  • if we could create custom rules, we could say:
    • I want to extract the value after url= and before &arg3

In this example the URL is clean (www.google.com), but in real life the extracted URL probably needs to be sanitized to be usable :)
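
To make the idea concrete, here is a rough Python sketch of such an extraction rule. It is only an assumption of how it might look: `extract_target` and its `param` argument are hypothetical names, and prepending `https://` when the value has no scheme is a guess at what "usable" means here.

```python
from urllib.parse import parse_qs, urlsplit


def extract_target(url, param="url"):
    """Return the value of the `param` query parameter, or None if it is absent.

    parse_qs percent-decodes the value. A scheme is added when missing
    (assumption: https) so the result can be opened directly.
    """
    values = parse_qs(urlsplit(url).query).get(param)
    if not values:
        return None
    target = values[0]
    if "://" not in target:
        target = "https://" + target
    return target
```

For the tracker link above this yields `https://www.google.com`. If the embedded URL is not percent-encoded and contains its own `&`, the value would be cut short, so a real rule would need more care.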

brsysadmin avatar Jan 24 '23 11:01 brsysadmin

@KibaNoOu @brsysadmin I've been thinking about the custom rules feature. Please provide your feedback in the discussion item.

svenjacobs avatar Mar 12 '23 19:03 svenjacobs

@gpsnomad's suggestion in https://github.com/svenjacobs/leon/discussions/162#discussioncomment-5846301 might be a good stopgap until this is finalized:

Any chance you could build an option that just strips all parameters? Ie parses the link and stops at the first question mark that it finds? That would be good enough for me, rather than a custom sanitizer for each domain.

TPS avatar May 28 '23 13:05 TPS

@gpsnomad's suggestion in #162 (reply in thread) might be a good stopgap until this is finalized:

Any chance you could build an option that just strips all parameters? Ie parses the link and stops at the first question mark that it finds? That would be good enough for me, rather than a custom sanitizer for each domain.

Could be a really simple but effective solution!

KibaNoOu avatar May 28 '23 17:05 KibaNoOu

@TPS @KibaNoOu The thing is, we don't know for sure which parameters can be removed without breaking a URL. Of course we could remove all query parameters from a URL, but some URLs, like the Amazon product link from a shopping cart, encode some optional parameters in path arguments (see /ref=…). But usually path arguments are required, not optional.

svenjacobs avatar May 30 '23 06:05 svenjacobs

The thing is, we don't know for sure which parameters can be removed without breaking a URL.

But usually path arguments are required, not optional.

That's why, as a stopgap, it'd be worth offering a function that strips everything unknown (like Decode URL; wait, is this what Extract only URL is supposed to do? I never did figure that one out).

E.g., for Amazon, it's increasingly evident that almost any of their product URLs can be rewritten into one format keeping just the ASIN parameter, but for the non-product Amazon URLs, one can just strip everything from ? and ref onwards, and it's mostly good. It certainly doesn't hurt to try this everywhere.
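
As a rough Python sketch of "strip everything from ? and ref onwards" (the function name `strip_ref_and_query` is made up, and cutting at the first `/ref=` path segment is an assumption based on the `/ref=…` form mentioned earlier in the thread):

```python
from urllib.parse import urlsplit, urlunsplit


def strip_ref_and_query(url):
    """Drop the query string, the fragment, and the path from '/ref=' onwards."""
    parts = urlsplit(url)
    path = parts.path
    cut = path.find("/ref=")  # assumption: the optional part starts at '/ref='
    if cut != -1:
        path = path[:cut]
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))
```

Whether the shortened link still resolves depends on the site, which is exactly the uncertainty discussed above.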

TPS avatar May 30 '23 11:05 TPS

There's this app which allows custom rules, open source and also written in Kotlin. Maybe it could be used for reference?

NikunjKhangwal avatar Mar 30 '24 08:03 NikunjKhangwal

@NikunjKhangwal Oh wow, that is a nice one too. Looks like it supports custom rules via JavaScript code, which is not so intuitive, but I guess it makes it quite flexible :)

farOverNinethousand avatar Mar 30 '24 09:03 farOverNinethousand

Yeah, I'm not a developer so I don't know exactly how these things work, but I shared it so maybe the dev can get some help 😅

NikunjKhangwal avatar Mar 30 '24 09:03 NikunjKhangwal