benoit74
Fix #352 Changes in logic around finding the ZIM illustration: - build a sorted list of potential icons to use (instead of just a "random" one) - prefer to use...
We need a fuzzy rule to properly rewrite https://iranwire.com/questions/detail/1723?&_=1721804954220 (dynamic URL built by JS) to https://iranwire.com/questions/detail/1723 (real asset URL present in the WARC/ZIM, i.e. we need to drop the weird...
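As a rough illustration of the rewrite described above, here is a minimal Python sketch of a regex-based rule that maps the dynamic URL to the stored asset URL. The pattern, the `CACHE_BUSTER` name, and the `apply_fuzzy_rule` helper are hypothetical and are not taken from warc2zim's actual rule set; the sketch assumes the part to drop is the trailing `?&_=<digits>` query string seen in the example URL.

```python
import re

# Hypothetical pattern: capture the asset URL and match a trailing
# "?_=<digits>" or "?&_=<digits>" query string appended by JS.
CACHE_BUSTER = re.compile(
    r"^(?P<base>https?://iranwire\.com/questions/detail/\d+)\?&?_=\d+$"
)


def apply_fuzzy_rule(url: str) -> str:
    """Return the URL without the query string when the pattern matches.

    URLs that do not match the pattern are returned unchanged.
    """
    match = CACHE_BUSTER.match(url)
    return match.group("base") if match else url


dynamic_url = "https://iranwire.com/questions/detail/1723?&_=1721804954220"
print(apply_fuzzy_rule(dynamic_url))
```

URLs that do not match the pattern pass through untouched, so a rule like this should only affect the specific dynamic form it targets.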
In dynamic URL rewriting (in JS with wombat), all URLs are rewritten. This is a fair assumption because in most cases the associated resources have also been automatically fetched at...
This scenario was encountered on https://ir.voanews.com, see https://github.com/openzim/zim-requests/issues/833#issuecomment-2203635680 The scenario is as follows: - we want to rewrite image URLs with fuzzy rules so that they are able to adapt...
This issue is a placeholder for what looks like potential enhancements warc2zim might need to implement at some point in the future (typically in a 3.x version). It is...
This issue serves as a checklist for the release event. - [ ] Ensure the CI is green on the git `main` branch - [ ] Create a ZIM of the...
Currently, fuzzy rules are configured in a YAML (/JSON) file and transformed into code. The mid-term goal is to share these rules with the WebRecorder team and other contributors. This probably means...
See https://github.com/webrecorder/wabac.js/pull/182#issuecomment-2185726884 for the start of a discussion about the possibility that we might not need DS rewriting; it looks like this is already done in the crawler at crawl time.
Fix #293 Fix #239 Changes: - `isSW` is set to `false`, as recommended by Ilya in https://github.com/webrecorder/wombat/issues/155#issuecomment-2183191941 The main concern with merging this is that it breaks the YouTube player, so...
See https://github.com/openzim/zim-requests/issues/6 and https://dev.library.kiwix.org/viewer#hackteria.org_2024-06