zimit
Unable to load the zim file, needs https to load the file
I recently created a zim file of a website and after copying it to my server, it says the page must be loaded via an HTTPS URL.
The message I got on the screen "This page must be loaded via an HTTPS URL to support service workers.
Try Loading HTTPS URL? "
zimit-created ZIM files use ServiceWorker, a technology that requires being served via HTTPS or on localhost. That's the first thing in the README.
If you want to serve it on your server, you have to proxy your kiwix-serve behind an HTTPS-enabled server. You should be able to find guidance for this online.
I have the same issue. I run Ubuntu 21.10 with the latest Firefox. @rickgurmet, did you ever solve this?
Issue: Zim files I made do not open. I want to open the 6 zim files I have made so far to ensure I don't need to be doing something else before I go crawling more sites. The Wikimed zim file opens fine. This suggested to me that I/my computer are making non-working zim files. However, that seems disproven by:
a) the icon, site name, article count and media count for the zim files that won't open display fine on the list of available zim files located on http://192.168.0.120:8181/
b) the zim files created are both non-0 in size and different file sizes.
This is why I tried (iv) below.
I've tried:
i) The Kiwix desktop app without the "Kiwix Server" running
ii) The Kiwix desktop app with the "Kiwix Server" running
iii) The Firefox browser extension with jQuery (known issue - won't fix) -> PWA app link. This doesn't work offline, so I am abandoning this option in favor of the Kiwix app.
iv) making a new zim file with this repo and docker image while the "Kiwix Server" is on. I tried this option given a) and b) above and some ambiguity in @rgaudin's response about whether ServiceWorkers are needed for zim creation, zim reading, or both.
v) asking for help in the web-based chat. No one responded. I assume it's that no one present overlaps with my timezone.
vi) changing http://192.168.0.120:8181 into https://192.168.0.120:8181 or http://192.168.0.120
vii) changing http://192.168.0.120:8181 into localhost
viii) changing the zim file created to be owned by my linux user account
ix) changing the zim file created to be owned by my linux user account and chmod 775
None of those options allowed me to open the ZIM files I created.
@rgaudin:
- There's no kiwix-serve repository attached to the openzim github account. Are 'kiwix-serve', mentioned at the top of the README, and the Kiwix Server, that is started/stopped in the Kiwix desktop app, the same thing? That is, are they synonyms of each other? If not, where do I find 'kiwix-serve'?
- Since Kiwix Server is a server, how come it isn't already serving behind an HTTPS-enabled server or proxying for us?
- Is there a Docker image available to run ServiceWorkers?
- How do you do local development if both IP address and localhost don't work?
- Is this going to involve doing something like creating a self-signed SSL certificate for my local machine?
- Given all the other facts presented and questions asked in this comment, what are the missing steps to this process:
- have a zim file, have Kiwix desktop open
- start Kiwix server. click to open the IP address in your browser ....
- zim file is opened, visible, and usable
asking for help in the web-based chat. No one responded. I assume it's that no one present overlaps with my timezone.
@alison985 Timezones might be an issue, but what do you mean by "web-based chat"? We don't have a chatbot or anything like it running anywhere.
As for testing zimit-generated files you do need to go through Kiwix-serve, the other options you list do not have service workers.
@alison985, thanks for raising that up. I'll try to describe what we got and what the limits are. Hopefully, you'll be able to help us draft a better wording for the README so that it's clear for any newcomer.
Regarding the chat issue, I believe most of us are on the Kiwix Slack channel, so you'd have better luck there.
- zimit-created ZIM files (actually warc2zim that zimit uses) are different from all other ZIM files. Assuming it will work because a non-zimit ZIM worked (like a wikipedia) is thus incorrect.
- identifying a zimit-created ZIM file is difficult.
- If it has been downloaded from /zim/zimit/, you know it is.
- Otherwise, you need to check the ZIM's internal Tags. zimit-created ZIMs have a _sw:yes tag on them. I am not sure how various readers expose tags, so here's an alternative method:
docker run -v $(pwd):/data openzim/zim-tools zimdump show --url M/Tags /data/musictheory.net_en_all_2021-12.zim
_ftindex:yes;_category:other;_sw:yes
- zimit-created ZIMs have different runtime requirements: the HTML stored in the ZIM and displayed when you access it relies heavily on JS code, and this code relies on a browser technology called Service Workers (SW).
- The SW spec imposes restrictions on web browsers when using them. The most important one is that the connection to the page using it must be secure. This means that it should be served over HTTPS or over HTTP on localhost.
- kiwix-serve is two things. It's the HTTP server feature built into libkiwix and it is a cli program called kiwix-serve hosted on kiwix-tools, itself mostly wrapping the server from libkiwix.
- Readers (kiwix-desktop, kiwix macos, kiwix iOS, kiwix android, kiwix-js) don't rely on kiwix-serve. Readers render ZIM files without going through HTTP, which means supporting SW is additional work to implement. ATM, only kiwix-android supports it.
- Some readers (kiwix-desktop, kiwix-android) bundle a server mode that starts an HTTP server (kiwix-serve) for use either on the same device or from the LAN.
- Accessing a zimit ZIM thus requires either kiwix-android or kiwix-serve (either via cli, kiwix-desktop or kiwix-android).
- When accessing it using kiwix-serve (using a web browser), it must be either using an http://localhost address (limited to same device) or using HTTPS.
- kiwix-serve is HTTP-only so it is not capable of serving HTTPS by itself.
- You can use a reverse-proxy (caddy or nginx) to create an HTTPS server that internally forwards requests to the HTTP-only kiwix-serve. That's what kiwix-hotspot does.
- Using a reverse-proxy, you need to either configure issued certificates for the browser to trust the server or use self-signed certificates that raise a warning in the client's browser.
- Clients using recent (sorry I don't recall the version number) Google Chrome (and most likely any chrome-based browser like Edge) will see the SW disabled if the server is using a self-signed certificate. AFAIK this is a unilateral move by Google that's not part of the spec.
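To make the Tags check mentioned above a bit more concrete, here is a minimal shell sketch that tests whether a Tags string (as printed by the zimdump command earlier) contains the _sw:yes tag. The function name has_sw_tag is hypothetical and only for illustration; the sample value is the one shown above, and the sketch assumes the tags are always separated by semicolons as in that output.

```shell
#!/bin/sh
# has_sw_tag (hypothetical helper): succeed if a ZIM Tags string
# contains the _sw:yes tag. Assumes a semicolon-separated list,
# as in the zimdump output shown above.
has_sw_tag() {
    # Wrap the value in semicolons so the tag matches only as a whole entry.
    case ";$1;" in
        *";_sw:yes;"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Sample value copied from the zimdump output above.
tags='_ftindex:yes;_category:other;_sw:yes'

if has_sw_tag "$tags"; then
    echo "zimit-style ZIM: needs a reader or server with service worker support"
else
    echo "regular ZIM"
fi
```

In practice the string would come from the zimdump invocation rather than being hard-coded.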
tl;dr What works:
- Using kiwix-android
- Using a web browser on Android after starting the offspot mode and using the http://localhost/ address.
- Using a web browser on Windows/Linux after starting the kiwix-serve mode of kiwix-desktop and using http://localhost/ from the same computer.
- Using Firefox/Safari accessing a reverse-proxied kiwix-serve via HTTPS with a self-signed certificate (using an IP-based or local domain address) – raises a warning in the browser
- Using a reverse-proxied kiwix-serve via HTTPS with an issued certificate on a real domain (requires maintenance on the server to update certificate when it expires)
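As a rough illustration of the address rule in this list (HTTPS, or plain HTTP on localhost), the following shell sketch classifies a few URLs. The function name is_sw_capable_url is hypothetical, and the check only looks at the URL itself: it does not know whether a certificate is trusted, so the Chrome caveat about self-signed certificates still applies.

```shell
#!/bin/sh
# is_sw_capable_url (hypothetical helper): succeed if the URL matches the
# rule described above: HTTPS anywhere, or plain HTTP on localhost.
# It says nothing about certificate trust.
is_sw_capable_url() {
    case "$1" in
        https://*) return 0 ;;
        http://localhost|http://localhost/*|http://localhost:*) return 0 ;;
        *) return 1 ;;
    esac
}

# Addresses taken from this thread.
for url in http://192.168.0.120:8181 http://localhost:8181 https://localhost:8081; do
    if is_sw_capable_url "$url"; then
        echo "service workers possible: $url"
    else
        echo "service workers blocked:  $url"
    fi
done
```

This is why opening the LAN IP address over plain HTTP shows the HTTPS prompt, while the same kiwix-serve instance reached via localhost on the same machine is expected to work.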
This is not a satisfying situation from our point of view and we are looking at ways to make it simpler and more supported but haven't decided on a strategy nor committed the resources yet.
I wrote an easier solution with wget for the meantime:
https://github.com/ballerburg9005/wget-2-zim
Thanks @ballerburg9005! Obviously, zimit's target is not wget-able websites but those that require a JS VM to load content, which is not possible when working off source only.
That said, I realize there's a missing link in our toolbox for those websites that are source-friendly. Surely zimit works for them too but zimit's replay constraints are unreasonable in those cases (and scraping is slower).
I like how you pushed it further with your options and post-processing. I'm afraid you've opened a Pandora's box of work though 😅
Let me know if there's anything I can do to help. @kelson42 please take a look.
@kelson42 @rgaudin
I have tested it on a couple of sites, but there are still some major kinks to work out. So better wait another 2-3 days with the testing. I agree that it is just a makeshift solution and probably a bottomless pit. But still I believe it can work acceptably and it will work in any Kiwix program in the meantime.
You should be able to find guidance for this online
So I've used the following for this:
mitmproxy -p 8081 -m reverse:http://127.0.0.1:8080
I then had to start the local kiwix server because I'm using kiwix-desktop, tried https://localhost:8081 in the browser, and the content rendered; it looks very close to the live version of the site. Yes, the browser will show the less fortunate self-signed certificate warning, but the custom certificate that mitmproxy is using can be installed in the browser.
This is not a satisfying situation from our point of view and we are looking at ways to make it simpler and more supported but haven't decided on a strategy nor committed the resources yet.
I was reading issue 57 and also the docs on service workers. It turns out this is how service workers were designed to work, so it's a constraint that goes back to the standards/specs of service workers.
This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.