tsurlfilter
ad filtering syntax: `removeparam` modifier doesn't help to clean junk query params that are preceded by a hash (`#`)
Prerequisites
- [X] I checked the documentation and found no answer;
- [X] I checked to make sure that this issue has not already been filed;
- [X] This is not an ad/bug report.
Problem description
- Be a URL with a few rather classic junk query params (`at_medium`, `at_campaign`, etc.), but preceded by a `#`:
https://www.france.tv/films/5606040-une-affaire-privee.html#at_medium=5&at_campaign_group=2&at_campaign=integrale&at_offre=1&at_send_date=20240106&at_recipient_id=459386-1664366309-2d5f2440
- Be a custom filter rule (which works all right as long as the query params are preceded by a `?` or an `&`):
||*$removeparam=/^(at|ul|utm)_/
Expected behavior
With the aforementioned rule being set, the URL should be rendered as
https://www.france.tv/films/5606040-une-affaire-privee.html
Actual behavior
The URL remains unchanged. On the other hand, the filter works all right as long as the query params are preceded by a `?` or an `&`.
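To make the difference concrete, here is a minimal sketch of how a URL divides into query and fragment under RFC 3986 (the fragment starts at the first `#`, the query at the first `?` before it). The helper `splitUrl` is hypothetical and only illustrates the parsing; it is not how tsurlfilter is implemented, and the URL is shortened from the one reported above.

```typescript
// Hypothetical helper: split a URL into base, query and fragment.
// RFC 3986: the fragment begins at the first "#", and the query at the
// first "?" that appears before the fragment.
function splitUrl(url: string): { base: string; query: string; fragment: string } {
    const hashIndex = url.indexOf("#");
    const beforeHash = hashIndex === -1 ? url : url.slice(0, hashIndex);
    const fragment = hashIndex === -1 ? "" : url.slice(hashIndex + 1);
    const queryIndex = beforeHash.indexOf("?");
    const base = queryIndex === -1 ? beforeHash : beforeHash.slice(0, queryIndex);
    const query = queryIndex === -1 ? "" : beforeHash.slice(queryIndex + 1);
    return { base, query, fragment };
}

const reportedParts = splitUrl(
    "https://www.france.tv/films/5606040-une-affaire-privee.html#at_medium=5&at_campaign=integrale"
);
// reportedParts.query is empty, so a filter that only inspects the query
// string has nothing to match; the params all sit in reportedParts.fragment.
console.log(reportedParts);
```

With `?` in place of `#`, the same params would land in `query` instead, which is consistent with the rule working in that case.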
Proposed solution
Not sure, but probably the `^` set of separator characters (separator marks) should also include the hash (`#`).
Excerpt from the KB:
Special characters
(...)
`^`: a separator character mark. A separator character is any character except a letter, a digit, or one of the following: `_` `-` `.` `%`. In this example separator characters are shown in bold: http:**//**example.com**/?**t=1**&**t2=t3. The end of the address is also accepted as a separator.
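The quoted definition can be written down as a small predicate. This is only a sketch of the wording in the excerpt, assuming "letter" and "digit" mean ASCII letters and digits; `isSeparatorChar` is a hypothetical name, not a tsurlfilter function.

```typescript
// Hypothetical helper: true when `ch` is a separator under the quoted
// definition, i.e. anything except a letter, a digit, or one of _ - . %
// Assumption: "letter" and "digit" are read as ASCII only.
function isSeparatorChar(ch: string): boolean {
    return ch.length === 1 && !/[A-Za-z0-9_\-.%]/.test(ch);
}

// Collect the separator characters found in the KB example address.
const kbExample = "http://example.com/?t=1&t2=t3";
const kbSeparators = kbExample.split("").filter(isSeparatorChar).join("");
console.log(kbSeparators);

// Check the character this issue is about.
console.log(isSeparatorChar("#"));
```

Read literally, the definition already treats `#` as a separator, so the separator set may not be what prevents the rule from matching here. That would need confirming against tsurlfilter itself.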
Additional information
No response
Parameters after `#` are not sent.
Where did you get this link?
I think the reporter means that a website can still send the hash using `location.hash` and XHR/fetch.
> Where did you get this link?
From their newsletter.
For instance, see its last issue:
https://t.nl.francetv.fr/r/?id=hc9785a1,6c509b4c,5fd1bede&p1=%40UYYMNRAcUXhUaReRIlHftNHzxSzNS0B3t5dfPCFeDjM%3D&p2=20240110&p3=459386-1664366309-2d5f2440
While redirecting to the target URL (the issue layout), you'll see the aforementioned hash-preceded params somehow appear appended at the end of that URL.
You'll also see them appended to each content URL featured in the issue.
> I think the reporter means that a website can still send the hash using `location.hash` and XHR/fetch.
Thank you for the concern. There's another reason why I want my URLs to be clean of any garbage query params: so that I don't have to clean them manually when I save or share them.
> For instance, see its last issue:
I need the address of the page that adds the parameters to the links. `$removeparam` can't remove them, because the `#...` part is either added by JS or already present in the links in the HTML.
> `$removeparam` can't remove them, because the `#...` part is either added by JS or already present in the links in the HTML.
Is there any other modifier, other than `removeparam`, to remove hash params? If not, shouldn't it be created? Is it possible at all?
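For illustration, the behaviour being asked for could look roughly like the sketch below. `stripHashParams` is a hypothetical helper, not an existing tsurlfilter or AdGuard API; it assumes the fragment is formatted like a query string (`name=value` pairs joined by `&`) and that the pattern is a non-global regex tested against each parameter name.

```typescript
// Hypothetical helper, not a tsurlfilter API: drop fragment parameters whose
// name matches `pattern`, leaving the rest of the URL untouched.
// Assumption: `pattern` has no "g" flag, so test() keeps no state between calls.
function stripHashParams(url: string, pattern: RegExp): string {
    const hashIndex = url.indexOf("#");
    if (hashIndex === -1) {
        return url; // no fragment, nothing to clean
    }
    const base = url.slice(0, hashIndex);
    const kept = url
        .slice(hashIndex + 1)
        .split("&")
        .filter((pair) => {
            const eq = pair.indexOf("=");
            const name = eq === -1 ? pair : pair.slice(0, eq);
            return pair !== "" && !pattern.test(name);
        });
    // Drop the "#" entirely when no fragment parameters survive.
    return kept.length > 0 ? base + "#" + kept.join("&") : base;
}

const cleanedUrl = stripHashParams(
    "https://www.france.tv/films/5606040-une-affaire-privee.html#at_medium=5&at_campaign_group=2&at_campaign=integrale&at_offre=1&at_send_date=20240106&at_recipient_id=459386-1664366309-2d5f2440",
    /^(at|ul|utm)_/
);
console.log(cleanedUrl);
```

Fragments that do not match the pattern, such as a plain in-page anchor, are kept. A real modifier would also have to account for the point raised above: the fragment is not part of the request, so it can only be cleaned where the full URL is visible, for example on navigation or in the page's links.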