warc2zim

First access to warc2zim file doesn't correctly catch external links.

Open mgautierfr opened this issue 2 years ago • 8 comments

Behavior:

  • Open the link: library.kiwix.org/viewer#courses.lumenlearning.com_en_all_2021-03/A/courses.lumenlearning.com/catalog/boundlesscourses.
  • Tiles (such as Boundless Accounting) link to an external website (such as https://courses.lumenlearning.com/boundless-accounting).
  • Reloading (F5) doesn't help.
  • Clicking on the Boundless Courses button in the top bar takes us to the same page.
  • Links are now relative links pointing to the viewer.
  • From now on, opening the link given in the first step works correctly.
  • To reset the behavior, you can launch dev tools, unregister the service workers, and reload. The links are then external again.
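
For anyone reproducing this, the reset step above can also be scripted instead of clicking through dev tools. The following is a minimal sketch, not part of the original report: the function name `unregisterAllServiceWorkers` and the two small interfaces are hypothetical, introduced so the snippet type-checks outside a browser. It relies only on the standard `getRegistrations()` and `unregister()` service worker calls.

```typescript
// Hypothetical minimal types mirroring the parts of the browser API used here.
interface RegistrationLike {
  scope: string;
  unregister(): Promise<boolean>;
}
interface RegistrationSourceLike {
  getRegistrations(): Promise<readonly RegistrationLike[]>;
}

// Unregister every service worker known to the container and
// return the scopes of the registrations that were actually removed.
async function unregisterAllServiceWorkers(
  container: RegistrationSourceLike,
): Promise<string[]> {
  const registrations = await container.getRegistrations();
  const removed: string[] = [];
  for (const registration of registrations) {
    // unregister() resolves to true when the registration was found and removed.
    if (await registration.unregister()) {
      removed.push(registration.scope);
    }
  }
  return removed;
}
```

In a browser console, this would be called as `unregisterAllServiceWorkers(navigator.serviceWorker)` followed by a reload of the page.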

Expected behavior:

  • Links should be relative (caught by the service worker?) on the first visit.

mgautierfr avatar May 05 '23 16:05 mgautierfr

This looks like a kiwix-serve bug as the behavior that you are expecting is the one we've implemented. The template HTML that is included in all HTML entries makes a redirect to the home page (to install the service worker) which then redirects back.

You can confirm this by using the content endpoint at https://library.kiwix.org/viewer#courses.lumenlearning.com_en_all_2021-03/A/courses.lumenlearning.com/catalog/boundlesscourses

rgaudin avatar May 05 '23 16:05 rgaudin

@mgautierfr Just to be sure, it would be good to confirm locally with a kiwix-serve compiled against the dev version of libkiwix.

@rgaudin The ZIM file is two years old; maybe something to consider.

kelson42 avatar May 05 '23 16:05 kelson42

@mgautierfr Just to be sure, it would be good to confirm locally with a kiwix-serve compiled against the dev version of libkiwix.

It seems to work as expected on a locally compiled kiwix-serve. What is strange is that it works with both master and libkiwix 12.0.0.

Can you confirm you have the issue on your side with library.kiwix.org?

mgautierfr avatar May 09 '23 08:05 mgautierfr

Can you confirm you have the issue on your side with library.kiwix.org?

I have the issue on library.kiwix.org (which now uses 3.5.0). Are you sure you tested on a fresh browser (no SW)?

rgaudin avatar May 09 '23 09:05 rgaudin

Yes, I have unregistered the SW and tried several times.

(Tested from localhost; I don't know whether that changes the behavior.)

mgautierfr avatar May 09 '23 10:05 mgautierfr

Downloaded the ZIM to check; still not working for me:

  • running docker run -it -v $PWD:/data -it -p 80:80 --rm ghcr.io/kiwix/kiwix-tools:3.5.0 /bin/sh -c 'kiwix-serve /data/*.zim'
  • using a fresh (History -> Clear recent history -> Everything -> Clear now) Firefox
  • accessing http://localhost/viewer#courses.lumenlearning.com_en_all_2021-03/A/courses.lumenlearning.com/catalog/boundlesscourses

Doesn't work; the SW is not registered.

rgaudin avatar May 09 '23 11:05 rgaudin

Same behavior confirmed on another ZIM at https://github.com/openzim/zimit/issues/258

I'm not 100% sure but I wonder if there may also be a kind of conflict when the same ZIM filename is opened on the same browser from two different hosts with different content.

benoit74 avatar Dec 12 '23 16:12 benoit74

If external links are not being rewritten in time, it implies that wombat.js is not being injected in time. Wombat is loaded by topFrame.html, and this should happen before the replay iframe is populated. Wombat adds overrides to the replay_iframe that detect when the src changes, so that it can intercept and rewrite the HTML coming from the ZIM. If there is a missing await/async somewhere, we are in trouble and are likely to have a race condition. The typical symptom (as you all know) is that it sometimes works and sometimes doesn't, or works on some hardware / browser versions and not on others.
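
To illustrate the ordering concern above, here is a hypothetical sketch. It is not the actual topFrame.html or wombat.js code, and the names `loadReplayWhenReady`, `swUrl`, and `replayUrl` are assumptions made for the example. It only shows the general pattern of awaiting service worker readiness before populating an iframe, using the standard `register()` call and `ready` promise.

```typescript
// Hypothetical minimal types mirroring the parts of the browser API used here.
interface WorkerContainerLike {
  register(scriptUrl: string): Promise<unknown>;
  readonly ready: Promise<unknown>;
}
interface FrameLike {
  src: string;
}

// Populate the frame only after the service worker is registered and active.
// Dropping either `await` reintroduces the race: the frame may start loading
// before anything is in place to intercept its requests.
async function loadReplayWhenReady(
  container: WorkerContainerLike,
  frame: FrameLike,
  swUrl: string, // assumed path of the service worker script
  replayUrl: string, // assumed URL of the page to replay
): Promise<void> {
  await container.register(swUrl);
  // `ready` resolves once a registration for this page has an active worker.
  await container.ready;
  frame.src = replayUrl;
}
```

In a browser, the arguments would be `navigator.serviceWorker` and the replay iframe element. Whether the real code path is missing such an await is not established in this thread; the sketch only shows what the correct ordering looks like.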

Jaifroid avatar Dec 14 '23 12:12 Jaifroid

This is a known zimit1 issue: the SW is properly registered only when opening the root URL, not a sub-URL. It will be fixed by the removal of the SW in zimit2 anyway, so it is not relevant anymore.

benoit74 avatar May 14 '24 13:05 benoit74