ttrss_plugin-feediron
Fix Recursive fetch after new reformat option
After the integration of feediron/ttrss_plugin-feediron#199, which added the option to reformat the links found for subsequent article pages, the recursive multipage handling mode fails.
This commit fixes the recursive loop.
Fixes feediron/ttrss_plugin-feediron#201
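For reviewers, the intended interaction between `recursive` and `reformat` can be sketched roughly as follows. This is a minimal Python illustration, not the plugin's PHP code: `collect_pages`, `reformat_url` and the `find_links` callback are hypothetical names, and the regex patterns are assumed to be already stripped of their PHP delimiters.

```python
import re
from typing import Callable, Dict, Iterable, List, Optional

Rule = Dict[str, str]


def reformat_url(url: str, rules: Iterable[Rule]) -> str:
    """Apply each regex rule to the URL in order (hypothetical helper)."""
    for rule in rules:
        if rule.get("type") == "regex":
            url = re.sub(rule["pattern"], rule["replace"], url)
    return url


def collect_pages(
    start: str,
    find_links: Callable[[str], List[str]],
    rules: Optional[Iterable[Rule]] = None,
    recursive: bool = False,
) -> List[str]:
    """Return the page URLs to fetch, in discovery order.

    find_links stands in for the multipage XPath lookup on a fetched page.
    """
    rules = list(rules or [])
    # Remember the start page in both its raw and reformatted form, so a
    # link back to page one is not queued again.
    seen = {start, reformat_url(start, rules)}
    pages = [start]
    queue = [start]
    while queue:
        current = queue.pop(0)
        for link in find_links(current):
            link = reformat_url(link, rules)  # reformat before comparing
            if link in seen:
                continue  # already known: this is what ends the loop
            seen.add(link)
            pages.append(link)
            if recursive:
                queue.append(link)  # only recursive mode follows new pages
    return pages
```

In non-recursive mode the queue is empty after the first page, so only links found on the start page are collected. In recursive mode the `seen` set guarantees termination even when every page lists all other pages again.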
Please answer the following questions for yourself before submitting a pull request. YOU MAY DELETE UNUSED SECTIONS.
NOTICE!!!
All rule submissions should be done in the https://github.com/feediron/feediron-recipes repository.
Bugfix/Enhancement
- [x] Have you added an explanation of what your changes do and why you'd like us to include them?
- [x] Have you successfully run tests with your changes locally?
Tests executed
Test 1
Configuration:
{
  "type": "xpath",
  "xpath": "div[contains(@class, 'article-content')]",
  "multipage": {
    "xpath": "nav[contains(@class, 'page-numbers')]\/span\/a[last()]",
    "append": true,
    "recursive": true
  },
  "modify": [
    {
      "type": "regex",
      "pattern": "\/<li.*? data-src=\"(.*?)\".*?>\\s*<figure.*?>.*?(?:<figcaption.*?<div class=\"caption\">(.*?)<\\\/div>.*?<\\\/figcaption>)?\\s*<\\\/figure>\\s*<\\\/li>\/s",
      "replace": "<figure><img src=\"$1\"\/><figcaption>$2<\/figcaption><\/figure>"
    }
  ],
  "cleanup": [
    "aside",
    "div[contains(@class, 'sidebar')]"
  ]
}
Testurl: https://arstechnica.com/gadgets/2024/05/all-the-ways-streaming-services-are-aggravating-their-subscribers-this-week/
Purpose: Ensure that recursive multipage handling works when reformat is disabled.
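To make the `modify` rule of this configuration easier to review, here is a rough Python translation of its regex applied to made-up gallery markup. The sample HTML and the name `simplify_gallery` are assumptions for illustration only. The PHP `/…/s` modifier is expressed as `re.DOTALL`, and `$1`/`$2` become `\1`/`\2`.

```python
import re

# Python translation of the PHP pattern from the "modify" rule above.
GALLERY_ITEM = re.compile(
    r'<li.*? data-src="(.*?)".*?>\s*<figure.*?>.*?'
    r'(?:<figcaption.*?<div class="caption">(.*?)</div>.*?</figcaption>)?'
    r'\s*</figure>\s*</li>',
    re.DOTALL,
)
REPLACEMENT = r'<figure><img src="\1"/><figcaption>\2</figcaption></figure>'


def simplify_gallery(html: str) -> str:
    """Rewrite each gallery <li> into a plain <figure> with image and caption."""
    return GALLERY_ITEM.sub(REPLACEMENT, html)


# Hypothetical sample markup, not taken from the test URL.
with_caption = (
    '<li data-src="a.jpg"><figure><img src="thumb-a.jpg">'
    '<figcaption><div class="caption">Hello</div></figcaption></figure></li>'
)
without_caption = '<li data-src="b.jpg"><figure><img src="thumb-b.jpg"></figure></li>'

print(simplify_gallery(with_caption))
print(simplify_gallery(without_caption))
```

The caption group is optional, so an item without a `figcaption` still matches and yields an empty `<figcaption>` element (Python 3.5+ substitutes an empty string for an unmatched group).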
Test 2
Configuration:
{
  "type": "xpath",
  "xpath": "article",
  "tags": {
    "type": "xpath",
    "xpath": "meta[@name='keywords']",
    "split": ",",
    "modify": [
      {
        "type": "replace",
        "search": "\"\/>",
        "replace": ""
      }
    ]
  },
  "cleanup": [
    "amp-analytics",
    "amp-consent",
    "amp-pixel",
    "amp-ad",
    "header",
    "amp-font",
    "a[@class='link-to-top']",
    "div[contains(@class ,'amp-ad-container')]",
    "div[contains(@class ,'social-sticky')]",
    "footer",
    "aside[@id='job-market']",
    "aside[@class='aside__meta']",
    "ul[contains(@class, 'social-tools')]",
    "ol[@class='list-pages']",
    "div[@amp-access='NOT subscriber' and text() = 'Anzeige']"
  ],
  "multipage": {
    "xpath": "ol[@class='list-pages' and not(@id='atoc_line')]\/li\/a[text() != '\u203a']",
    "append": true,
    "reformat": true
  },
  "reformat": [
    {
      "type": "regex",
      "pattern": "\/\\.html$\/",
      "replace": ".amp.html"
    }
  ]
}
Testurl: https://www.golem.de/news/sony-ult-wear-im-vergleichstest-ein-erschwinglicher-kopfhoerer-der-begeistert-2405-184690.html
Purpose: Ensure that reformat works in non-recursive mode (all links are found and reformatted).
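As a quick sanity check of the `reformat` rule itself, the following Python sketch parses the rule from its JSON form and applies it to the test URL. `compile_delimited` is a hypothetical helper that emulates PHP's delimited pattern syntax. It handles only the simple case of a single-character delimiter and the `i`, `s`, `m` and `x` modifiers.

```python
import json
import re

# Subset of PCRE modifiers mapped to Python flags (assumption: enough for this rule).
FLAG_MAP = {"i": re.IGNORECASE, "s": re.DOTALL, "m": re.MULTILINE, "x": re.VERBOSE}


def compile_delimited(pattern: str) -> "re.Pattern[str]":
    """Compile a PHP-style '/body/modifiers' pattern into a Python regex."""
    delimiter = pattern[0]
    end = pattern.rindex(delimiter)
    if end == 0:
        raise ValueError("missing closing delimiter in %r" % pattern)
    body, modifiers = pattern[1:end], pattern[end + 1:]
    flags = 0
    for modifier in modifiers:
        flags |= FLAG_MAP[modifier]
    return re.compile(body, flags)


# The rule exactly as written in the configuration above.
rule = json.loads(r'{"type": "regex", "pattern": "\/\\.html$\/", "replace": ".amp.html"}')

url = (
    "https://www.golem.de/news/sony-ult-wear-im-vergleichstest-"
    "ein-erschwinglicher-kopfhoerer-der-begeistert-2405-184690.html"
)
reformatted = compile_delimited(rule["pattern"]).sub(rule["replace"], url)
print(reformatted)
```

Only the trailing `.html` is rewritten, so the host and path stay untouched and the link ends in `.amp.html`.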
Test 3
Configuration:
{
  "type": "xpath",
  "xpath": "article",
  "tags": {
    "type": "xpath",
    "xpath": "meta[@name='keywords']",
    "split": ",",
    "modify": [
      {
        "type": "replace",
        "search": "\"\/>",
        "replace": ""
      }
    ]
  },
  "cleanup": [
    "amp-analytics",
    "amp-consent",
    "amp-pixel",
    "amp-ad",
    "header",
    "amp-font",
    "a[@class='link-to-top']",
    "div[contains(@class ,'amp-ad-container')]",
    "div[contains(@class ,'social-sticky')]",
    "footer",
    "aside[@id='job-market']",
    "aside[@class='aside__meta']",
    "ul[contains(@class, 'social-tools')]",
    "ol[@class='list-pages']",
    "div[@amp-access='NOT subscriber' and text() = 'Anzeige']"
  ],
  "multipage": {
    "xpath": "ol[@class='list-pages' and not(@id='atoc_line')]\/li\/a[text() != '\u203a']",
    "append": true,
    "recursive": true,
    "reformat": true
  },
  "reformat": [
    {
      "type": "regex",
      "pattern": "\/\\.html$\/",
      "replace": ".amp.html"
    }
  ]
}
Testurl: https://www.golem.de/news/sony-ult-wear-im-vergleichstest-ein-erschwinglicher-kopfhoerer-der-begeistert-2405-184690.html
Purpose: Ensure that reformat works in recursive mode (where the same links are found again on every page).
Fixes #201