Add a setting to choose which backend to use (javascript or libzim wasm)

Open mossroy opened this issue 2 years ago • 3 comments

It should be automatically set to libzim wasm by default. If wasm is not supported, a warning should be displayed to the user and the app should fall back to our current javascript backend.
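
A minimal sketch of that default-with-fallback behaviour, under stated assumptions: the function name chooseBackend, the labels 'libzim-wasm' and 'javascript', and the returned object shape are hypothetical and not part of kiwix-js. The env parameter stands in for the browser's global object so that the check can be exercised outside a browser.

```javascript
// Hypothetical helper (not kiwix-js API): pick a backend, defaulting to
// libzim wasm and falling back to the javascript backend with a warning.
function chooseBackend(requested, env) {
  // Default to libzim wasm when the user has not chosen anything
  var wanted = requested || 'libzim-wasm';
  // Basic WebAssembly feature detection on the supplied global object
  var wasmSupported = typeof env.WebAssembly === 'object' &&
    env.WebAssembly !== null &&
    typeof env.WebAssembly.instantiate === 'function';
  if (wanted === 'javascript') {
    return { backend: 'javascript', warning: null };
  }
  if (wasmSupported) {
    return { backend: 'libzim-wasm', warning: null };
  }
  // wasm was wanted but is unavailable: fall back and tell the user why
  return {
    backend: 'javascript',
    warning: 'WebAssembly is not supported here; using the javascript backend.'
  };
}
```

In a browser this could be called as chooseBackend(savedChoice, window), with any returned warning shown in the UI.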

For a first version, I'd recommend using only the wasm binary generated by emscripten. That should allow us to release a first version with the libzim backend for recent browsers more quickly: https://caniuse.com/wasm

We'll have time afterwards to consider if we try to compile libzim with asm.js too, to be able to support older browsers

mossroy avatar Jun 06 '22 14:06 mossroy

Interesting. Do you see this option as a drop-in replacement for our current backend? Can we use it (initially) the way we use our current readDirEntry(dirEntry).then(function readBinaryFile(fileDirEntry) { }) ......?

Jaifroid avatar Jun 06 '22 14:06 Jaifroid

I'm sure it won't be that easy.

But the prototype is encouraging: as you can see in https://github.com/kiwix/kiwix-js/pull/766/files#diff-484c9e8aa09e09b8eb1fafaf55e8eadcd0eaab230e38d02bb9440841bfa090a4R1579 , I could somewhat easily replace the backend. It can certainly be modified to support both backends.

Currently, this branch still uses our javascript backend for everything except browsing through the ServiceWorker. That includes at least reading the ZIM metadata, finding the welcome page, searching, going to a random article, AND everything in jQuery mode.

I suspect searching could be a more difficult part, because it has been optimized quite a bit, and the libzim API might not correspond to the way we handled pagination for example

Regarding the jQuery mode, I'd suggest leaving it as it is (on our javascript backend) to save some work (at least at the beginning). I don't see any technical reason why it could not be switched to libzim wasm too, but it seems to me that all browsers that support wasm also support ServiceWorkers, see https://caniuse.com/wasm and https://caniuse.com/serviceworkers

mossroy avatar Jun 06 '22 15:06 mossroy

I'm sure it won't be that easy.

It never is... But if we can start to use it experimentally in a (sort of) drop-in manner, then it will be much easier for us to develop it over a period of time, optimize it, in the same way that Service Worker mode was developed from a buggy start to production-ready code now. We have to manage the risk of proliferating "modes", I think.... (xz-zstd-asm, xz-zstd-wasm, libzim-wasm, libzim-asm).

I agree that we should work with wasm only first.

Jaifroid avatar Jun 06 '22 15:06 Jaifroid

Now that I have integrated search, it should be relatively easy to integrate getting BLOBs from the libzim version instead of our custom back end. I think it shouldn't be the default at first, but an experimental option (under Expert Settings) that a developer or expert user can turn on in order to experiment. The reason for this is that it is largely untested, and the prototype was producing memory leaks (#872). Integrating this as an Expert Setting would allow us to test more easily.
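
A rough sketch of such an opt-in switch, assuming a storage object with the getItem/setItem interface of the browser's localStorage. The key name and both function names are hypothetical, not existing kiwix-js settings. The point is only that the experimental backend stays off unless it has been explicitly turned on.

```javascript
// Hypothetical settings key, not an existing kiwix-js setting
var LIBZIM_BACKEND_KEY = 'experimentalLibzimBackend';

// Returns true only if the user explicitly enabled the experimental backend.
// A missing or unexpected value leaves the current backend in use.
function isLibzimBackendEnabled(storage) {
  return storage.getItem(LIBZIM_BACKEND_KEY) === 'true';
}

// Persists the user's choice as a string, as Web Storage stores strings
function setLibzimBackendEnabled(storage, enabled) {
  storage.setItem(LIBZIM_BACKEND_KEY, enabled ? 'true' : 'false');
}
```

In the app, storage would be window.localStorage (or whatever settings store is in use), and the checkbox under Expert Settings would call setLibzimBackendEnabled.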

N.B. we won't be able to replace our current back end completely, because the WASM version works only with the latest Chrome, Edge and Firefox, and in particular it currently requires Atomic Operations, which are not available even in earlier WASM versions that were otherwise very stable (e.g. Edge Legacy), let alone in browsers that only support ASM. So, although we can easily compile to ASM, and the ASM version works well (and quite fast for search), it's not actually supported on older browsers unless I can find a way to turn off pthread support in the C++ code at compile time. See https://github.com/openzim/javascript-libzim/issues/17.

Jaifroid avatar Jan 01 '23 15:01 Jaifroid

@Jaifroid What are the most recent browser versions that are not able to work with javascript-libzim?

kelson42 avatar Jan 01 '23 16:01 kelson42

@kelson42 According to https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Atomics#browser_compatibility, the versions that work would be Chrome 68+, Edge 79+, Firefox 78+. Emscripten uses Atomics.load() and Atomics.add().
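
As an illustration only, a feature check along these lines could gate the libzim option instead of relying on browser version numbers. The function name supportsLibzimWasm is hypothetical, and env stands in for the global object. It tests only for WebAssembly and the two Atomics methods named above; passing the check does not guarantee that the WASM build actually runs.

```javascript
// Hypothetical check (not kiwix-js API): does this environment expose
// WebAssembly plus the Atomics methods that the Emscripten build calls?
function supportsLibzimWasm(env) {
  var hasWasm = typeof env.WebAssembly === 'object' && env.WebAssembly !== null;
  var hasAtomics = typeof env.Atomics === 'object' && env.Atomics !== null &&
    typeof env.Atomics.load === 'function' &&
    typeof env.Atomics.add === 'function';
  return hasWasm && hasAtomics;
}
```

In a browser it would be called as supportsLibzimWasm(window), or with self inside a worker.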

Of less relevance for our target apps (though it is of relevance for the PWA version), Chrome/Edge for Android does not work (not because Atomics are not supported, but possibly because of problems transferring the ArrayBuffer to the Web Worker), and Firefox for Android has a different issue that also affects our current backend: it tries to copy the entire ZIM File into memory, and fails with archives larger than a couple of gigabytes.

Jaifroid avatar Jan 01 '23 16:01 Jaifroid

@Rishabhg71 As discussed, we have a "working" prototype version in #766, but without any UI option to select between backends, and relatively limited (as per discussion above) in terms of what libzim is actually used for.

I think this issue could be tackled first in a "light" way. The current code in #766 is monkey-wrenched, so some code re-organization would be needed (but not too much at this stage, see below) to separate the pathways a bit more, and to make the use of libzim for article loading agnostic as to which mode the user has requested (Service Worker or JQuery).
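
One possible shape for that separation, as a sketch under assumptions: makeArticleLoader, the backend labels, and the read(path) method are all hypothetical names, not the code in #766. The idea is that both Service Worker mode and JQuery mode call the same loadArticle function, and only that function knows which backend is selected.

```javascript
// Hypothetical factory: returns a single loadArticle(path) function that
// both display modes can share. `backends` maps a label to an object with
// a read(path) method; `getSelectedBackend` returns the current label.
function makeArticleLoader(backends, getSelectedBackend) {
  return function loadArticle(path) {
    var name = getSelectedBackend();
    var backend = backends[name];
    if (!backend) {
      return Promise.reject(new Error('Unknown backend: ' + name));
    }
    // Wrap in a Promise so synchronous and asynchronous readers look the
    // same to the caller, and a synchronous throw becomes a rejection
    return new Promise(function (resolve) {
      resolve(backend.read(path));
    });
  };
}
```

The caller in either mode would then do loadArticle(path).then(...) without checking which backend is active.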

The first step would be to add the UI option, so that we can more easily "road test" libzim reading. I've already uncovered a blocker with an internal WASM crash systematically produced on a number of ZIMs, seemingly (but not clear) due to lack of error handling when some ZIM assets are missing. See https://github.com/openzim/javascript-libzim/issues/64.

Note that in the current main branch, libzim is already loaded and initialized shortly after an archive is loaded (in JavaScript) by the code in zimArchive.js, and indeed libzim is loaded as a dependency of the Selected Archive. I suggest not touching that for now, nor search (which uses libzim automatically if it is supported). For the purposes of this issue, just focus on the pathways for loading an article, given that libzim is already loaded and initialized in any case.

The reason why this issue should be done in a "light" way is that the most logical next step would be #853, which would be a larger undertaking. But that would depend on fixing blocking issues with the libzim WASM first.

Jaifroid avatar Nov 10 '23 06:11 Jaifroid

@Jaifroid I have already started working on this issue. These are the things that I have accomplished so far:

  • Setting up the project
  • Understanding the compilation steps
  • Compiling a simple "hello world" C++ program in WASM

I do have a few questions, which I will ask later on. Here are the next steps, which I think would be a better approach:

  • Re-organize the current javascript-libzim code
  • Code cleanup for API access of wasm code
  • easy setup and build
  • implementation in kiwix-js

Adding more code to the current code, or continuing to use it as it is, doesn't seem like a good idea to me; it's bound to break some day. I have got the basics of the project working. The only thing I need to figure out now is compiling all the dependencies with zim.h.

I will keep my changes on other branches, and they will be merged after everything is done.

Rishabhg71 avatar Nov 10 '23 09:11 Rishabhg71

@Rishabhg71 Compiling libzim with WASM is an extremely complicated endeavour, and the main work has already been done in https://github.com/openzim/javascript-libzim/, as I mentioned to you. Any work you do on improving the compilation should build on the existing WASM-Emscripten compilation and be submitted as a PR to https://github.com/openzim/javascript-libzim/.

While it's great that you're trying to understand the technology, I suggest you first start with this issue on Kiwix JS, as we discussed. When that is done, we will have a firm base for road-testing the WASM and ASM builds produced by javascript-libzim. Only then will it become clear where the build code needs to be improved. That's my recommendation!

Jaifroid avatar Nov 10 '23 10:11 Jaifroid

I am using the code that is already written in javascript-libzim. I only wanted to compile a simple hello world to understand the basics. I understand it is a complicated task, but please give me 2-3 days and then I will move back to kiwix-js.

Rishabhg71 avatar Nov 10 '23 12:11 Rishabhg71

I am using the code that is already written in javascript-libzim. I only wanted to compile a simple hello world to understand the basics. I understand it is a complicated task but please give me 2-3 days then I will move back to kiwix-js

OK, understood! Of course, take the time you need to understand the codebase, and let me know via Slack if you have any queries I can help with.

Jaifroid avatar Nov 10 '23 13:11 Jaifroid

Fixed in #1160.

Jaifroid avatar Feb 21 '24 20:02 Jaifroid