warc2zim

Add support for redirection in `meta http-equiv`

Open benoit74 opened this issue 10 months ago • 2 comments

It is possible to redirect a page with a meta http-equiv:

<meta http-equiv="refresh" content="3;url=https://www.mozilla.org" />

See https://developer.mozilla.org/en-US/docs/Web/HTML/Element/meta#http-equiv

So far, Zimit2 does not rewrite these links, so they do not lead to content inside the ZIM.

These links should be considered for rewrite as well.

The result will of course depend on whether the redirect target is present inside the WARC/ZIM or not.

This in turn depends on how the WARC was built (i.e. did the system that built the WARC wait long enough for the redirect target to be captured as well?), but that is beyond the scope of this issue / warc2zim. warc2zim just has to make sure these URLs are considered for rewriting.
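To illustrate what "considering these URLs for rewrite" could look like, here is a minimal Python sketch. It is not warc2zim's actual code: `rewrite_meta_refresh` and `to_zim_path` are hypothetical names, the regular expression only covers the common `delay;url=target` form, and the path mapping is a placeholder for whatever URL rewriting rules the scraper really applies.

```python
import re
from typing import Callable
from urllib.parse import urlsplit

# Common form of the refresh value: "<delay>;url=<target>", target optionally quoted.
# Assumption: this is a simplification, not the full HTML parsing algorithm.
_REFRESH_RE = re.compile(
    r"""^\s*(?P<delay>[\d.]+)\s*[;,]\s*url\s*=\s*(?P<quote>['"]?)(?P<url>.+?)(?P=quote)\s*$""",
    re.IGNORECASE,
)


def rewrite_meta_refresh(content: str, rewrite_url: Callable[[str], str]) -> str:
    """Return the `content` attribute value with its URL part rewritten.

    Values without a URL part (plain refresh) or values that do not match
    the simplified pattern are returned unchanged.
    """
    match = _REFRESH_RE.match(content)
    if match is None:
        return content
    return f"{match.group('delay')};url={rewrite_url(match.group('url'))}"


def to_zim_path(url: str) -> str:
    """Hypothetical rewriter: map an absolute URL to a relative in-ZIM path."""
    parts = urlsplit(url)
    return f"./{parts.netloc}{parts.path or '/'}"


if __name__ == "__main__":
    print(rewrite_meta_refresh("3;url=https://www.mozilla.org", to_zim_path))
```

Passing the rewriter in as a callable keeps the parsing of the `content` value separate from the decision of where the target lives inside the ZIM, so the same helper would work regardless of how the path mapping is done.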

benoit74, Apr 18 '24 14:04

I do not consider this mandatory at all for the Zimit2 release.

benoit74, Apr 18 '24 14:04

> I do not consider this mandatory at all for the Zimit2 release

Agreed. It was an issue with the www.ready.gov ZIMs, but it did not affect systems reading the ZIM with the Replay ServiceWorker, because AFAIK the redirect was caught by the Service Worker (I may be wrong about that).

For zimit2, it's fine to deal with this empirically.

Jaifroid, Apr 18 '24 14:04

> I do not consider this mandatory at all for the Zimit2 release

@benoit74 Is this not a regression compared to Zimit1? It's also a pretty common scenario.

kelson42, May 25 '24 05:05

Nope, not a regression AFAIK: it wasn't working in Zimit1 either (hence the issue mentioned by Jaifroid on www.ready.gov).

benoit74, May 25 '24 06:05