Rewrite Rule Replace issues
I'm trying to use the replace function to eliminate sentences (mostly ads). I'm having trouble with umlaut characters (ä, ö, ü) and with punctuation such as , . : ; Sometimes quoting a , works, but mostly it fails. What can I do to improve it?
Miniflux just uses ReplaceAllString from the standard Go regexp package under the hood in rewrite_functions.go, which should support Unicode. The search term is a regular expression, so if you want to match a literal period, for example, you'll have to escape it.
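To illustrate the point about escaping, here is a minimal sketch that calls Go's regexp package directly. It is not Miniflux's own code, and the helper name stripPattern and the sample text are made up for the example. It shows that umlauts match literally and that a period only matches a literal period when it is escaped.

```go
package main

import (
	"fmt"
	"regexp"
)

// stripPattern is a hypothetical helper (not part of Miniflux): it removes
// every match of pattern from input using regexp.ReplaceAllString.
func stripPattern(pattern, input string) string {
	re := regexp.MustCompile(pattern)
	return re.ReplaceAllString(input, "")
}

func main() {
	text := "Schöne Grüße. Jetzt für Geräte herunterladen. Ende"

	// Umlauts need no escaping; `\.` matches only a literal period.
	fmt.Println(stripPattern(`Jetzt für Geräte herunterladen\. `, text))

	// An unescaped `.` matches any character, so both "abc" and "a.c" are removed.
	fmt.Println(stripPattern(`a.c`, "abc a.c"))

	// An escaped `\.` leaves "abc" alone and removes only "a.c".
	fmt.Println(stripPattern(`a\.c`, "abc a.c"))
}
```

Note that this only exercises the regex engine. How Miniflux parses the rule string before it reaches the engine is a separate question.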
Ok. Thank you for your quick response. How do I quote ü ö ä and sentences with ","?
Can you link an example feed/entry and the rewrite rule where this issue happens? It might be that the website uses HTML escape codes instead of plain Unicode characters; I'm not sure how Miniflux handles that.
Yes, sure:
Here is the link to the RSS feed source: https://www.tagesspiegel.de/contentexport/feed/home
For example, some news items:
https://m.tagesspiegel.de/gesellschaft/panorama/zusammenstoss-von-zwei-kleinbahnen-mehr-als-210-verletzte-bei-zugunglueck-in-kuala-lumpur/27218686.html
https://m.tagesspiegel.de/sport/wie-realistisch-ist-der-direkte-wiederaufstieg-werder-bremen-und-das-schlechte-timing/27218310.html
I want to replace this sentence: [Wenn Sie aktuelle Nachrichten aus Berlin, Deutschland und der Welt live auf Ihr Handy haben wollen, empfehlen wir Ihnen unsere App, die Sie hier für Apple- und Android-Geräte herunterladen können.]
This is another feed source: http://feeds2.feedburner.com/stadt-bremerhaven/dqXM
Example news item: https://stadt-bremerhaven.de/apple-tv-4k-2021-und-sky-q-derzeit-viele-beschwerden/
And I want to replace the sentence: In diesem Artikel sind Amazon-Links enthalten. Durch einen Klick darauf gelangt ihr direkt zum Anbieter. Solltet ihr euch dort für einen Kauf entscheiden, erhalten wir eine kleine Provision. Für euch ändert sich am Preis nichts.
Hope that helps you and me. Thank you so much for your help!
I made your tagesspiegel example work with this: replace("\[Wenn Sie alle aktuellen Nachrichten live auf Ihr Handy haben wollen. empfehlen wir Ihnen unsere App. die Sie.*herunterladen können\.\]"|"") (I had a slightly different ad text). You'll have to search the entry html for the part you want to replace and then you can use something like https://regex101.com set to Golang mode to test your regex.
Commas are indeed a bit weird: there is currently no way to have a comma in your search or replacement string, because Miniflux treats it as the delimiter for a new rewrite rule (source). For now, you can work around this by putting a regex period wherever the text has a comma. This should probably be improved in the future.
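As a rough sketch of the comma workaround, the following runs a comma-free pattern against the ad sentence quoted earlier in this thread, using Go's regexp package directly. The helper removeAd and the surrounding HTML are assumptions made for the example, and this does not go through Miniflux's rule parser.

```go
package main

import (
	"fmt"
	"regexp"
)

// removeAd is a hypothetical helper for illustration, not part of Miniflux.
// The pattern contains no commas: each `.` stands in for a comma in the text,
// and `.*` skips over the middle of the sentence.
func removeAd(html string) string {
	re := regexp.MustCompile(`\[Wenn Sie aktuelle Nachrichten aus Berlin. Deutschland.*herunterladen können\.\]`)
	return re.ReplaceAllString(html, "")
}

func main() {
	// Made-up entry HTML wrapping the ad sentence from the feed.
	entry := "<p>Artikeltext.</p><p>[Wenn Sie aktuelle Nachrichten aus Berlin, Deutschland und der Welt live auf Ihr Handy haben wollen, empfehlen wir Ihnen unsere App, die Sie hier für Apple- und Android-Geräte herunterladen können.]</p>"
	fmt.Println(removeAd(entry))
}
```

The trade-off is that `.` matches any character, not just a comma, so the pattern is slightly looser than the literal sentence. For a long, distinctive ad text this rarely matters.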
I think you should be able to do the second example yourself as an exercise :smile:
Thank you so much!
Exercise done:
replace("In diesem Artikel sind.*Preis nichts\."|"")
With one of the recent updates, my rule for catching YouTube URLs in iframes from https://www.angrymetalguy.com/feed/ broke. My rule was as follows:
replace("<iframe.*(https:\/\/.*youtub.*?)".*?<\/iframe><\/p>"|"<a href=$1>$1</a>")
Is it possible to escape the " character?
@Tokariew, I have the same problem since a couple of releases.
The tagesspiegel rewrite rule replace("\[.*\]"|"") is not working at all.
Does anybody know how to fix it?
@derflasher No idea. I tried to escape the " with \ but with no effect… In the logs I only get <input>:1:9: invalid char escape
Unless someone finds a solution or it gets fixed, I will keep miniflux pinned to version 2.0.32.
@derflasher I was able to rewrite my rules using . to match " in the text,
but it seems that replace no longer supports the full regex syntax it did in 2.0.32.
Using \ to escape special characters, or to match something like \s, is not possible.
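For anyone wanting to try the same workaround, here is a small sketch of using . in place of " when matching an iframe, again with Go's regexp package directly rather than through Miniflux. The helper linkifyIframe, the simplified pattern and the sample iframe are assumptions for the example, not the exact rule from the feed above.

```go
package main

import (
	"fmt"
	"regexp"
)

// linkifyIframe is a hypothetical helper (not part of Miniflux). The pattern
// contains neither a double quote nor a backslash: each `.` around the capture
// group stands in for the `"` that surrounds the src attribute value.
func linkifyIframe(html string) string {
	re := regexp.MustCompile(`<iframe src=.(https://.*?youtub.*?).></iframe>`)
	return re.ReplaceAllString(html, "<a href=$1>$1</a>")
}

func main() {
	// Made-up sample markup for illustration.
	entry := `<iframe src="https://www.youtube.com/embed/abc123"></iframe>`
	fmt.Println(linkifyIframe(entry))
}
```

Since . matches any character, this is looser than matching a real quote, but it avoids both characters that the rule parser reportedly rejects.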
Yes, @Tokariew, it still does not support full regex 😒 I always stick to the latest version, so I'll let you know one day 😂