
Add a way to easily download ZIM archives

Open mossroy opened this issue 9 years ago • 16 comments

Like it's currently done in Kiwix. We should take extra care of files larger than 4GB, which need to be split to fit on FAT32 filesystems (which are still very common on smartphones/tablets)
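
To make the FAT32 constraint concrete, here is a minimal sketch of how an archive could be divided into parts that each fit under the limit. The helper name `planSplitParts` is hypothetical and not taken from any Kiwix code; the only hard fact used is that FAT32 records file sizes in 32 bits, so a single file cannot exceed 2^32 - 1 bytes.

```javascript
// FAT32 stores a file's size in a 32-bit field, so the largest possible
// single file is 2^32 - 1 bytes (one byte short of 4 GiB).
const FAT32_MAX_FILE_BYTES = 2 ** 32 - 1;

// Hypothetical helper (illustration only): returns the half-open byte
// ranges [start, end) that each part of a split archive would cover.
function planSplitParts(totalBytes, maxPartBytes = FAT32_MAX_FILE_BYTES) {
  if (!Number.isInteger(totalBytes) || totalBytes < 0) {
    throw new RangeError("totalBytes must be a non-negative integer");
  }
  if (!Number.isInteger(maxPartBytes) || maxPartBytes <= 0) {
    throw new RangeError("maxPartBytes must be a positive integer");
  }
  const parts = [];
  for (let start = 0; start < totalBytes; start += maxPartBytes) {
    parts.push({ start, end: Math.min(start + maxPartBytes, totalBytes) });
  }
  return parts;
}

// A file of exactly 4 GiB is one byte too large and needs two parts.
console.log(planSplitParts(2 ** 32).length);
```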

mossroy avatar May 04 '15 07:05 mossroy

There would be at least 2 ways to do that :

  • implement an HTTP download manager (like recently added on Android) : with pause/resume etc
  • use a bittorrent client like https://github.com/feross/webtorrent (thanks @peter-x ). The main issue is that it uses a slightly different protocol, that does not seem to be handled by the Kiwix bittorrent seeder for now
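
For the first option, pause/resume over HTTP usually relies on range requests (RFC 9110). The sketch below shows only that piece, under the assumption that the server honours the `Range` header; the function names are hypothetical and this is not how any existing Kiwix download manager is known to work.

```javascript
// Hypothetical helper: headers needed to resume a download after
// `bytesAlreadySaved` bytes have been written to local storage.
function resumeHeaders(bytesAlreadySaved) {
  if (!Number.isInteger(bytesAlreadySaved) || bytesAlreadySaved < 0) {
    throw new RangeError("bytesAlreadySaved must be a non-negative integer");
  }
  // An open-ended range asks for everything from the offset to the end.
  return bytesAlreadySaved > 0 ? { Range: `bytes=${bytesAlreadySaved}-` } : {};
}

// Hypothetical helper: decides what to do with the response status.
// 206 Partial Content: the server resumed, so append to the existing data.
// 200 OK: the server ignored the range and is sending the whole file again.
function resumeOutcome(status) {
  if (status === 206) return "append";
  if (status === 200) return "restart";
  return "error";
}

// Possible use with fetch (not executed here):
//   const response = await fetch(url, { headers: resumeHeaders(savedBytes) });
//   const action = resumeOutcome(response.status);
```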

mossroy avatar Jan 22 '16 22:01 mossroy

Although it doesn't include a download manager, on kiwix-js-windows I have implemented an AJAX-based repository browser. A link in configuration (with a warning that it will use an Internet connection) opens up a panel with the formatted contents of the index at http://download.kiwix.org/zim/ . Can browse forwards and backwards in repo.

Clicking on a .zim link opens a list of download sources with a warning about file size. Currently it warns (in red) if the file is larger than 200MB, but I could easily enough add a warning about files larger than 4GB as well.
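
A rough sketch of the two-level warning, for illustration only. The 200MB and 4GB thresholds come from the comment above; treating them as binary units (MiB/GiB) and the names used here are assumptions, not the actual kiwix-js-windows code.

```javascript
// Assumed thresholds (binary units chosen for this sketch).
const LARGE_FILE_BYTES = 200 * 1024 ** 2; // existing "large download" warning
const FAT32_LIMIT_BYTES = 4 * 1024 ** 3;  // files this size or larger cannot sit on FAT32 unsplit

// Hypothetical helper: maps a file size to the warning to display.
function sizeWarningLevel(sizeBytes) {
  if (sizeBytes >= FAT32_LIMIT_BYTES) return "fat32"; // needs a split archive on FAT32
  if (sizeBytes > LARGE_FILE_BYTES) return "large";   // shown in red today
  return "none";
}
```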

Clicking on one of the sources will open a browser window and offer to download and save the file (except for the mirrorservice.org bug, so I've disabled that link). Download of large files is tested and works fine on Windows 10 Mobile and Windows 10 UWP (desktop/tablet), and the file is saved to Downloads, or wherever the user has chosen to save downloads in Windows settings. Downloads work in the background.

I don't know what effect this would have on other Mobile systems (i.e., using browser to manage downloads). I'll happily backport if this solution is of interest. Note that @sharun-s has a different (and elegant) solution, which is a list of links to certain popular files, based on pre-downloaded JSON data, so it works offline. However, since a connection is needed to download the file, I'm not sure there is an advantage to keeping the list of repos offline.

Screenshots below (after the panel is opened -- until then the panel is hidden).

image

And after clicking on a large ZIM archive:

image

Jaifroid avatar Aug 31 '17 09:08 Jaifroid

This looks cool, but I'm not sure it's the best technical way to do that. I was wondering how it differs from a simple hyperlink to http://download.kiwix.org/zim (where the user could browse the ZIM files and click on one to download it). It seems to add :

  • mirrors handling
  • ability to remove some servers/files (mirrorservice.org for example)
  • ability to add warnings based on the size of the file
  • layout more adapted to small screens
  • maybe some other pros?

So yes, it is better than a simple hyperlink, even without a download manager that would allow the user to pause/resume/cancel.

Regarding filtering mirrorservice.org, this bug should be reported and fixed either on the kiwix server (by removing this mirror) or by mirrorservice.org themselves (by changing their content-type in the response, I suppose?). Regarding the technical way to retrieve the ZIM list, @kelson42 certainly has some directions to give. I know they try to use a common method for all the kiwix clients, but don't know if it's based on these .meta4 XML contents, or on something else.
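
If the list of mirrors does come from the .meta4 (Metalink, RFC 5854) files, extracting the download URLs and leaving out a problematic mirror could look roughly like the sketch below. The sample document and its URLs are made up for illustration, the function name is hypothetical, and a regular expression is used only to keep the sketch dependency-free (a browser would more likely use `DOMParser`, and XML entities are not decoded here).

```javascript
// Hypothetical helper: collects the text of every <url> element in a
// Metalink 4 document, skipping hosts listed in `blockedHosts`.
function extractMirrorUrls(meta4Xml, blockedHosts = []) {
  const urls = [];
  const urlElement = /<url\b[^>]*>([^<]+)<\/url>/g;
  for (const match of meta4Xml.matchAll(urlElement)) {
    const href = match[1].trim();
    let host;
    try {
      host = new URL(href).hostname;
    } catch {
      continue; // ignore entries that are not absolute URLs
    }
    const blocked = blockedHosts.some(
      (b) => host === b || host.endsWith("." + b)
    );
    if (!blocked) urls.push(href);
  }
  return urls;
}

// Invented sample data: neither URL is claimed to exist.
const sampleMeta4 = `<?xml version="1.0" encoding="UTF-8"?>
<metalink xmlns="urn:ietf:params:xml:ns:metalink">
  <file name="example.zim">
    <size>123456</size>
    <url location="us" priority="1">https://mirror-a.example.org/zim/example.zim</url>
    <url location="uk" priority="2">https://www.mirrorservice.org/zim/example.zim</url>
  </file>
</metalink>`;

console.log(extractMirrorUrls(sampleMeta4, ["mirrorservice.org"]));
```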

mossroy avatar Aug 31 '17 20:08 mossroy

For me, the important thing is to keep the user on board and to ensure they don't see downloading a ZIM archive as "too" difficult. The browser on Windows devices (mobile / UWP desktop) does handle downloads intelligently in the background, and provides a central page from which the user can pause/resume downloads. I haven't tested it on a 60GB file, though, as I don't have enough free diskspace currently...... Adding download capabilities natively is a function of each app's native OS / ecosystem. There is a way to add download management to the app using UWP APIs, but I couldn't see what it added to the user experience, and it seemed a lot more complicated to code for, plus there would be no way to backport to any other app ecosystem.

I'm very happy to conform to any common methods or standards for download, if they can be implemented technically.

Jaifroid avatar Sep 01 '17 17:09 Jaifroid

Ah, and I've added a language selector, because there are a veritable mass of Wikipedia files in languages completely unknown to me, so it seemed best to give some indication of what they are, in the native language:

image

Jaifroid avatar Sep 01 '17 17:09 Jaifroid

I've asked @kelson42 to give some info on the technical way to list the available ZIM files. In any case, it's a much-needed feature; your code covers most of it, and it probably works on some other browsers/platforms. That's really promising.

mossroy avatar Sep 01 '17 19:09 mossroy

Have added some instructions (see screenshot) in case user is downloading a zipped (portable) version with split files (there is also a link to the "portable" section of the Kiwix repo on the unsplit ZIM meta4 download page). NB There's no problem downloading ZIP files from the mirrorservice, so the message about the download bug is not displayed and the link is enabled.

image

Jaifroid avatar Sep 02 '17 12:09 Jaifroid

All these developments are really interesting, and congratulations on doing them. I just want to say that we are going to provide an OPDS backend soon, which might be easier to deal with than library.xml.

kelson42 avatar Sep 03 '17 16:09 kelson42

Over at kiwix-js-windows, I've now added a BitTorrent link to the panel of meta4 results. One thing to note is that the current implementation works fine in the UWP app on both mobile and desktop, but due to the cross origin block, it won't run in Edge or Firefox if those browsers are using the file:/// protocol. The widget detects the failure and offers to open a link to the repo in a new browser window instead (see screenshot).

In Internet Explorer, the XMLHttpRequest works cross-origin because the user is invited to give permission at the start of the browser session. It may also run in Chrome with the --allow-file-access-from-files flag set, but I haven't tested. It should run in any browser running on a localhost web server, but again I haven't tested this, as my main aim is to support UWP for now.
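
The detect-and-fall-back behaviour described above might be sketched as follows. This is an assumption-laden illustration, not the widget's real code: it uses `fetch` instead of the XMLHttpRequest mentioned above, and `loadRepoIndex` and the shape of its result are hypothetical.

```javascript
// Hypothetical helper: tries to load the repository index in-app and,
// if the request is blocked (for example by the cross-origin rules that
// apply to file:/// pages) or fails, reports that the caller should
// offer a plain link opening the repo in a new browser window instead.
async function loadRepoIndex(indexUrl, fetchImpl = globalThis.fetch) {
  try {
    const response = await fetchImpl(indexUrl);
    if (!response.ok) throw new Error(`HTTP ${response.status}`);
    return { mode: "inline", html: await response.text() };
  } catch (error) {
    return { mode: "external-link", href: indexUrl, reason: String(error) };
  }
}

// Possible use (not executed here):
//   const result = await loadRepoIndex("http://download.kiwix.org/zim/");
//   if (result.mode === "external-link") { /* show a link to result.href */ }
```

Passing the fetch function as a parameter is only a convenience for trying the fallback path without a network connection.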

There may be other solutions -- library.xml ? -- but the current solution works fine in an app context.

image

Jaifroid avatar Sep 13 '17 21:09 Jaifroid

See also #489

kelson42 avatar Jul 22 '19 18:07 kelson42

Since the WebTorrent client (Node/JS) is still being actively developed, @kelson42 do you know if the Kiwix mirror servers support WebRTC as the P2P protocol?

Jaifroid avatar Jul 23 '19 09:07 Jaifroid

@Jaifroid I don't really understand the relation here between WebRTC and P2P? I can not answer your question sorry.

kelson42 avatar Jul 24 '19 11:07 kelson42

@kelson42 In order to solve this issue, we would need to embed something like the WebTorrent client (https://github.com/webtorrent/webtorrent) in Kiwix JS. Quoting from WebTorrent's information:

To make BitTorrent work over WebRTC (which is the only P2P transport that works on the web) we made some protocol changes. Therefore, a browser-based WebTorrent client or "web peer" can only connect to other clients that support WebTorrent/WebRTC.

It's pure JS, no plugins required, and lightweight. But clearly it wouldn't work if our servers don't support the protocol.

Jaifroid avatar Jul 24 '19 11:07 Jaifroid

@Jaifroid OK, I don't know, this should be tested.

kelson42 avatar Jul 24 '19 11:07 kelson42

So, I tested with two of our torrent files using https://btorrent.xyz/ , which is powered by WebTorrent. A peer is found, but nothing is downloaded from the peer. I guess, therefore, that the mirror server unfortunately does not support BitTorrent over WebRTC.

Jaifroid avatar Jul 24 '19 14:07 Jaifroid

We now have direct access in the extension to https://library.kiwix.org

image

This issue should probably be updated to better know what we want to achieve.

kelson42 avatar Apr 28 '24 15:04 kelson42

I don't think there's any more to do. The browser manages the download, unlike in the Electron app where I use Electron APIs to download. Downloading in the browser is highly robust, because it continues even if the user closes the app / browser. So I don't see the need for anything else.

Jaifroid avatar Apr 29 '24 09:04 Jaifroid