
Rewrite Rule Replace issues

Open derflasher opened this issue 4 years ago • 11 comments

I'm trying to use the replace function to remove sentences (mostly ads) from entries. I'm having trouble with umlauts (ä, ö, ü) and with punctuation such as , . : ; Sometimes quoting a comma works, but mostly it fails. What can I do to improve this?

derflasher avatar Apr 27 '21 21:04 derflasher

Miniflux just uses ReplaceAllString from Go's standard regexp package under the hood in rewrite_functions.go, which should support Unicode. The search term is a regular expression, so if you want to match a literal period, for example, you'll have to escape it.
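To make the escaping point concrete, here is a minimal standalone Go sketch. It only exercises the standard regexp package; `replaceAll` is a hypothetical helper written for this illustration and is not Miniflux's code. Returning the content unchanged on an invalid pattern is just a choice made for the sketch.

```go
package main

import (
	"fmt"
	"regexp"
)

// replaceAll is a hypothetical helper for illustration only.
// It compiles pattern as a Go regular expression and replaces every match.
// If the pattern does not compile, the content is returned unchanged.
func replaceAll(content, pattern, replacement string) string {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return content
	}
	return re.ReplaceAllString(content, replacement)
}

func main() {
	// Umlauts are ordinary characters in a pattern and need no escaping.
	fmt.Printf("%q\n", replaceAll("Grüße: ändert", "ändert", ""))

	// An unescaped '.' matches any character, so "abc" is replaced as well.
	fmt.Printf("%q\n", replaceAll("abc a.c", `a.c`, "X"))

	// An escaped '\.' matches only a literal period.
	fmt.Printf("%q\n", replaceAll("abc a.c", `a\.c`, "X"))
}
```

The takeaway is that ä, ö and ü can be written as-is, while regex metacharacters such as the period need a backslash when they are meant literally.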

ghost avatar Apr 28 '21 13:04 ghost

Ok. Thank you for your quick response. How do I quote ü ö ä and sentences with ","?

derflasher avatar May 13 '21 17:05 derflasher

Can you link an example feed/entry and the rewrite rule where this issue happens? It might be that the website uses HTML escape codes instead of plain Unicode characters. I'm not sure how Miniflux handles that.

ghost avatar May 21 '21 19:05 ghost

Yes, sure:

Here the link to the RSS feed source: https://www.tagesspiegel.de/contentexport/feed/home For example some news: https://m.tagesspiegel.de/gesellschaft/panorama/zusammenstoss-von-zwei-kleinbahnen-mehr-als-210-verletzte-bei-zugunglueck-in-kuala-lumpur/27218686.html https://m.tagesspiegel.de/sport/wie-realistisch-ist-der-direkte-wiederaufstieg-werder-bremen-und-das-schlechte-timing/27218310.html

I want to replace this sentence: [Wenn Sie aktuelle Nachrichten aus Berlin, Deutschland und der Welt live auf Ihr Handy haben wollen, empfehlen wir Ihnen unsere App, die Sie hier für Apple- und Android-Geräte herunterladen können.]

This is another feed source: http://feeds2.feedburner.com/stadt-bremerhaven/dqXM Example news: https://stadt-bremerhaven.de/apple-tv-4k-2021-und-sky-q-derzeit-viele-beschwerden/

And I want to replace the sentence: In diesem Artikel sind Amazon-Links enthalten. Durch einen Klick darauf ge­lan­gt ihr direkt zum Anbieter. Solltet ihr euch dort für einen Kauf entscheiden, erhalten wir ei­ne kleine Provision. Für euch ändert sich am Preis nichts.

Hope that helps you and me. Thank you so much for your help!

derflasher avatar May 25 '21 22:05 derflasher

I made your tagesspiegel example work with this: replace("\[Wenn Sie alle aktuellen Nachrichten live auf Ihr Handy haben wollen. empfehlen wir Ihnen unsere App. die Sie.*herunterladen können\.\]"|"") (I had a slightly different ad text). You'll have to search the entry HTML for the part you want to replace, and then you can use something like https://regex101.com set to Golang mode to test your regex. Commas are indeed a bit weird: there is currently no way to have a comma in your search or replacement string, because Miniflux treats it as the delimiter that starts a new rewrite rule (source). For now, you can put a regex period (.) wherever a comma would go, as in the rule above. This should probably be improved in the future. I think you should be able to do the second example yourself as an exercise :smile:
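The comma workaround can be checked outside Miniflux with a small Go sketch. `commaFree` and `stripAd` are hypothetical helpers made up for this illustration; the ad sentence is the one quoted earlier in the thread, and the surrounding text is invented.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// commaFree is a hypothetical helper: it swaps every comma in a pattern for
// '.', which matches any single character, including a comma.
// Caveat: it would also rewrite commas inside quantifiers such as {2,3}.
func commaFree(pattern string) string {
	return strings.ReplaceAll(pattern, ",", ".")
}

// stripAd removes every match of pattern from content.
// It panics if pattern is not a valid regular expression.
func stripAd(content, pattern string) string {
	return regexp.MustCompile(pattern).ReplaceAllString(content, "")
}

func main() {
	content := "Text. [Wenn Sie aktuelle Nachrichten aus Berlin, Deutschland und der Welt " +
		"live auf Ihr Handy haben wollen, empfehlen wir Ihnen unsere App, die Sie hier für " +
		"Apple- und Android-Geräte herunterladen können.] Mehr Text."

	pattern := commaFree(`\[Wenn Sie aktuelle Nachrichten aus Berlin, Deutschland.*herunterladen können\.\]`)
	fmt.Println(pattern)                   // the comma-free pattern
	fmt.Println(stripAd(content, pattern)) // the content with the ad removed
}
```

The trade-off is that a period is looser than a literal comma, so the pattern could in principle match text that has some other character in that position.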

ghost avatar May 25 '21 23:05 ghost

Thank you so much!

Exercise done: replace("In diesem Artikel sind.*Preis nichts\."|"")
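One thing to watch with a `.*` rule like this: in Go's regexp syntax, `.` does not match a newline unless the `(?s)` flag is set. The sketch below assumes, purely for illustration, that the ad text is split over two lines in the entry HTML; the sample content is invented and `strip` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// strip removes every match of pattern from content.
// It panics if pattern is not a valid regular expression.
func strip(content, pattern string) string {
	return regexp.MustCompile(pattern).ReplaceAllString(content, "")
}

func main() {
	// Invented sample in which the ad text spans two lines.
	content := "<p>News.</p>\n<p>In diesem Artikel sind Amazon-Links enthalten.\n" +
		"Für euch ändert sich am Preis nichts.</p>"
	rule := `In diesem Artikel sind.*Preis nichts\.`

	// Without (?s), '.' stops at '\n', so the rule does not match here.
	fmt.Printf("%q\n", strip(content, rule))

	// With (?s), '.' also matches '\n' and the whole ad is removed.
	fmt.Printf("%q\n", strip(content, `(?s)`+rule))
}
```

Also note that `.*` is greedy: if "Preis nichts." occurred a second time later in the entry, everything up to the last occurrence would be removed.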

derflasher avatar May 27 '21 00:05 derflasher

After one of the recent updates, my rule for catching YouTube URLs in iframes from https://www.angrymetalguy.com/feed/ broke. My rule was as follows:

replace("<iframe.*(https:\/\/.*youtub.*?)".*?<\/iframe><\/p>"|"<a href=$1>$1</a>")

Is it possible to escape "?
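For reference, the pattern itself can be checked in plain Go, where a raw string literal (backticks) lets the `"` appear without escaping. This sketch only tests the regex and the `$1` expansion; it says nothing about how Miniflux parses the quoted rule arguments. The sample HTML and the `linkify` helper are invented for the illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// Pattern copied from the rule above. Inside a Go raw string the double
// quote needs no escaping, and \/ is simply a literal slash.
var iframeRe = regexp.MustCompile(`<iframe.*(https:\/\/.*youtub.*?)".*?<\/iframe><\/p>`)

// linkify replaces a matching iframe with a link to the captured URL.
// $1 expands to the text matched by the first capture group.
func linkify(html string) string {
	return iframeRe.ReplaceAllString(html, `<a href=$1>$1</a>`)
}

func main() {
	// Invented sample markup.
	sample := `<p><iframe src="https://www.youtube.com/embed/abc" width="560"></iframe></p>`
	fmt.Println(linkify(sample))
}
```

If the regex behaves as expected here but the rule fails in Miniflux, the problem is more likely in how the rule string is parsed than in the pattern.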

Tokariew avatar Oct 03 '21 14:10 Tokariew

@Tokariew, I have had the same problem for a couple of releases. The tagesspiegel rewrite rule replace("\[.*\]"|"") is not working at all. Does anybody know how to fix it?

derflasher avatar Jan 08 '22 23:01 derflasher

@derflasher No idea. I tried to escape the " with \, but with no effect… In the logs I only get <input>:1:9: invalid char escape

Unless someone finds a solution or it gets fixed, I will keep Miniflux pinned to version 2.0.32.
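The message format looks like what Go's text/scanner package reports when a double-quoted string contains a backslash escape that is not valid in a Go string literal (for example `\[`, `\/` or `\s`). Whether Miniflux actually tokenizes rules with text/scanner is an assumption on my part and is not confirmed anywhere in this thread. The sketch below only shows that the scanner by itself rejects such a rule; `scanErrors` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
	"text/scanner"
)

// scanErrors tokenizes src with text/scanner and returns every error the
// scanner reports, prefixed with the position of the token being scanned.
func scanErrors(src string) []string {
	var errs []string
	var s scanner.Scanner
	s.Init(strings.NewReader(src))
	// Init resets the Error field, so the handler is installed afterwards.
	s.Error = func(s *scanner.Scanner, msg string) {
		errs = append(errs, fmt.Sprintf("%s: %s", s.Position, msg))
	}
	for tok := s.Scan(); tok != scanner.EOF; tok = s.Scan() {
		// Tokens are discarded; only the reported errors are of interest.
	}
	return errs
}

func main() {
	// \[ and \] are valid regex escapes but not valid Go string escapes.
	for _, e := range scanErrors(`replace("\[.*\]"|"")`) {
		fmt.Println(e)
	}
	// A rule without backslashes scans cleanly.
	fmt.Println(len(scanErrors(`replace("abc"|"")`)))
}
```

If this guess is right, only escapes that Go string literals accept (such as `\\`, `\"`, `\n` or `\x22`) would get past the tokenizer, which would explain why `\[` and `\/` stopped working.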

Tokariew avatar Jan 12 '22 06:01 Tokariew

@derflasher I was able to rewrite my rules using . to match " in the text, but it seems that, since 2.0.32, replace no longer supports full regex. Using \ to escape special characters, or to write classes such as \s, is not possible.
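A caveat when swapping `"` for `.`: next to a lazy `.*?`, the period can match earlier than intended and shorten the captured group. The sketch below is plain Go, not a tested Miniflux rule. Both patterns avoid backslashes, double quotes and commas; the second one is only one possible way to anchor the URL, and the sample HTML is invented.

```go
package main

import (
	"fmt"
	"regexp"
)

// Invented sample markup.
const sample = `<p><iframe src="https://www.youtube.com/embed/abc" width="560"></iframe></p>`

// naivePattern is the original rule with the double quote replaced by '.'.
const naivePattern = `<iframe.*(https://.*youtub.*?)..*?</iframe></p>`

// anchoredPattern keeps the URL free of spaces and '>' and requires the
// character after the closing quote to be a space or '>'.
const anchoredPattern = `<iframe.*src=.(https://[^ >]*youtub[^ >]*?).[ >].*?</iframe></p>`

// firstGroup returns the first capture group of the first match, or ""
// if the pattern does not match.
func firstGroup(pattern, html string) string {
	m := regexp.MustCompile(pattern).FindStringSubmatch(html)
	if m == nil {
		return ""
	}
	return m[1]
}

func main() {
	// The lazy .*? matches nothing and '.' consumes the next letter,
	// so the captured URL is cut off right after "youtub".
	fmt.Println(firstGroup(naivePattern, sample))

	// Here the group has to run up to the closing quote.
	fmt.Println(firstGroup(anchoredPattern, sample))
}
```

So a rule rewritten this way can still match the iframe while producing a truncated link, which is worth checking on a real entry.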

Tokariew avatar Mar 30 '22 12:03 Tokariew

Yes, @Tokariew, it still does not replace full regex 😒 I always stick to the latest version, so I'll text you one day 😂

derflasher avatar Jun 19 '22 20:06 derflasher