When downloading a PDF, check the mime type as well as the extension of the file

Open Jaifroid opened this issue 2 years ago • 5 comments

We currently check for downloadable files by relying on the extension of the file. However, the Zimit port in Kiwix JS Windows has revealed a flaw (I think): we should also check the mimetype of the dirEntry. We can have PDF files that are just links to data, not apparently a file at all (at least this is revealed in the cheatography.com Zimit ZIM). Both SW mode and jQuery mode seem to fail in this situation.

This issue needs more investigation, but if my supposition is correct, then a fix should improve Kiwix JS more broadly, not just Zimit support. I have opened https://github.com/kiwix/kiwix-js-windows/issues/252 to investigate.

Jaifroid avatar Apr 25 '22 15:04 Jaifroid

It's true in jQuery mode. I don't think we can have a reliable way to know if a file should be downloaded or displayed. Because it depends not only on the MIME-Type, but also on the browser/OS configuration. Deciding that PDF and epub files should always be downloaded (like it's done currently) is probably not a very bad decision, though. It's certainly possible to use the MIME-Type instead of the file extension to know if a link points to a PDF/epub file or something else. But I would not invest time in that (in jQuery mode).

AFAIK, in ServiceWorker mode, there is no supposition based on file extensions. The MIME-type is read from the ZIM file and sent in an HTTP header. And we let the browser do its job.
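To illustrate the ServiceWorker approach described above, here is a minimal sketch, not the actual Kiwix JS code. The helper name `buildResponseInit` is hypothetical, and the fallback to `application/octet-stream` is an assumption; only `dirEntry.getMimetype()` is taken from this thread.

```javascript
// Hypothetical helper (not from the Kiwix JS code base): build the init
// object for a Response so that the browser receives the MIME type stored
// in the ZIM file and decides for itself whether to display or download.
function buildResponseInit(dirEntry) {
    // Assumption: fall back to a generic binary type if the ZIM provides none
    const mimetype = dirEntry.getMimetype() || 'application/octet-stream';
    return {
        status: 200,
        headers: { 'Content-Type': mimetype }
    };
}
```

Inside a Service Worker fetch handler, the result could then be passed to the standard constructor, for example `new Response(content, buildResponseInit(dirEntry))`, where `content` is the data read from the ZIM.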

mossroy avatar Apr 25 '22 16:04 mossroy

Have you seen this ticket? https://github.com/kiwix/kiwix-android/issues/2793

I wonder if kiwix-js is impacted... but whether or not it is, this is really important, and by now maybe urgent: we should face this challenge properly and stop making assumptions.

kelson42 avatar Apr 30 '22 13:04 kelson42

I've just checked that we can open the Wikipedia article "node.js", as well as the archlinux ZIM's "node.js", just fine in Kiwix JS (and Windows). So I don't think the assumptions there affect us.

Jaifroid avatar Apr 30 '22 13:04 Jaifroid

> I don't think we can have a reliable way to know if a file should be downloaded or displayed. Because it depends not only on the MIME-Type, but also on the browser/OS configuration. Deciding that PDF and epub files should always be downloaded (like it's done currently) is probably not a very bad decision, though.

I agree it's not a bad decision. The problem is that if the PDF "file" is just a URL link like A/pdf_documents/104, then we would currently not treat it as a PDF and would try to display it, which would fail in jQuery mode at least, because we wouldn't set the correct document MIME type. In SW mode it should work, as you say. Without a non-Zimit test case, it's a bit theoretical... The fix should just be a case of testing the result of the dirEntry.getMimetype() function instead of relying on the URL ending in .pdf (or .epub).
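A rough sketch of what such a check might look like, assuming the dirEntry exposes a `url` property alongside `getMimetype()`. The function name `isDownloadable` and the two constant lists are hypothetical and only cover the PDF/epub cases discussed here.

```javascript
// Hypothetical sketch: decide whether a dirEntry should be treated as a
// downloadable file by checking its MIME type first, then its extension.
const DOWNLOADABLE_MIME_TYPES = ['application/pdf', 'application/epub+zip'];
const DOWNLOADABLE_EXTENSIONS = /\.(pdf|epub)$/i;

function isDownloadable(dirEntry) {
    // Normalize the MIME type: drop parameters such as "; charset=..."
    const mimetype = (dirEntry.getMimetype() || '')
        .split(';')[0].trim().toLowerCase();
    if (DOWNLOADABLE_MIME_TYPES.includes(mimetype)) return true;
    // Fall back to the existing behaviour: look at the URL's extension,
    // ignoring any query string or fragment
    const path = (dirEntry.url || '').split(/[?#]/)[0];
    return DOWNLOADABLE_EXTENSIONS.test(path);
}
```

With this, an entry such as A/pdf_documents/104 whose MIME type is `application/pdf` would be treated as a PDF even though its URL has no extension, while entries with a `.pdf` or `.epub` URL keep their current handling.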

Jaifroid avatar Apr 30 '22 14:04 Jaifroid

> I don't think we can have a reliable way to know if a file should be downloaded or displayed. Because it depends not only on the MIME-Type, but also on the browser/OS configuration.

Agree!

I don't know what is technically possible. But, in my opinion, the normal behaviour, if content cannot be rendered directly in the browser (and assumptions can be made about which MIME types are always properly handled), would be to open a dialog allowing the user either to open it with third-party software (a short list to be provided by the OS/browser) or to save it to mass storage.

I have no problem if it works only in SW mode.

Here is a very similar ticket for kiwix-desktop: https://github.com/kiwix/kiwix-desktop/issues/697

kelson42 avatar Apr 30 '22 15:04 kelson42