Licensing terms in page footer are at best imprecise, or even plainly wrong

Open benoit74 opened this issue 6 months ago • 1 comments

Currently the page footer of all mwoffliner ZIMs says (in English)

Image

(with a link to https://creativecommons.org/licenses/by-sa/4.0/)

But if we look at https://foundation.wikimedia.org/wiki/Policy:Terms_of_Use#7g, we will get:

If the text content was imported from another source, it is possible that the content is licensed under a compatible CC BY-SA license but not GFDL (as described in "Importing text," above). In that case, you agree to comply with the compatible CC BY-SA license and do not have the option to relicense it under GFDL. To determine the license that applies to the content that you seek to reuse or redistribute, you should review the page footer, page history, and discussion page.

In other words, for reuse (which is what we do), the content might be under CC BY-SA 4.0 or under anything else compatible with it (public domain, another compatible license, ...), and there is absolutely no way to determine this in an automated manner. So there is a significant chance that the current phrase is at best imprecise, or even plainly wrong.

And the chances are much higher that it is completely wrong for non-Wikimedia MediaWiki sites (this issue arises from https://github.com/openzim/zim-requests/issues/710#issuecomment-2967386302), where the license might be "anything".

I don't know exactly how we should fix this, but I know I would definitely prefer to have no licensing information at all rather than a wrong one.

benoit74 avatar Jun 12 '25 20:06 benoit74

@benoit74 Yes, and this is actually something which got forgotten over time. Good that you have opened an issue about it.

kelson42 avatar Jun 13 '25 03:06 kelson42

Considering we are currently adding a wrong license to most wikis, this issue should be a priority to get fixed.

Besides using a CLI argument to set the license for a ZIM, we can also use the wiki's siteinfo API with siprop=rightsinfo to get the license text and URL. Even for Wikimedia sites, using these values would be better, as they include a localized license URL.

https://de.wikipedia.org/w/api.php?action=query&format=json&formatversion=2&meta=siteinfo&siprop=rightsinfo

{
  "batchcomplete": true,
  "query": {
    "rightsinfo": {
      "url": "https://creativecommons.org/licenses/by-sa/4.0/deed.de",
      "text": "Creative Commons Attribution-Share Alike 4.0"
    }
  }
}

Note that both text and url can be empty depending on the wiki config:

https://wiki.kiwix.org/w/api.php?action=query&format=json&formatversion=2&meta=siteinfo&siprop=rightsinfo

{
  "batchcomplete": true,
  "query": {
    "rightsinfo": {
      "url": "",
      "text": "Creative Commons Attribution Share Alike"
    }
  }
}
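
A minimal sketch of how the two responses above could be turned into footer data, assuming the response shape shown here. The names rightsInfoQuery, parseRightsInfo and footerLicense are hypothetical helpers for illustration and are not part of mwoffliner; the HTTP call itself is left out. The idea is to show the license only when the wiki reports one, and to show nothing otherwise rather than a hard-coded default.

```typescript
// Shape of the `rightsinfo` object in the siteinfo responses shown above.
interface RightsInfo {
  url: string;
  text: string;
}

// Hypothetical helper: builds the siteinfo query for a given api.php endpoint.
function rightsInfoQuery(apiUrl: string): string {
  const params = [
    "action=query",
    "format=json",
    "formatversion=2",
    "meta=siteinfo",
    "siprop=rightsinfo",
  ];
  return `${apiUrl}?${params.join("&")}`;
}

// Hypothetical helper: extracts rightsinfo from an already parsed JSON body.
// Missing or non-string fields are normalized to empty strings.
function parseRightsInfo(body: unknown): RightsInfo | null {
  const info = (body as { query?: { rightsinfo?: { url?: unknown; text?: unknown } } } | null)
    ?.query?.rightsinfo;
  if (!info) return null;
  return {
    url: typeof info.url === "string" ? info.url.trim() : "",
    text: typeof info.text === "string" ? info.text.trim() : "",
  };
}

// Hypothetical helper: decides what the footer may show.
// Returns null when the wiki reports neither a license text nor a URL,
// so the caller can omit the licensing line entirely.
function footerLicense(info: RightsInfo | null): { text: string; url: string | null } | null {
  if (!info || (info.text === "" && info.url === "")) return null;
  return {
    text: info.text !== "" ? info.text : info.url, // fall back to the bare URL as label
    url: info.url !== "" ? info.url : null, // no link when the wiki gives no URL
  };
}
```

With the de.wikipedia.org response this would yield the localized text and link, with the wiki.kiwix.org response a text without a link, and with an entirely empty rightsinfo no licensing line at all.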

Markus-Rost avatar Jun 27 '25 21:06 Markus-Rost

I'm not in favour of adding another command-line argument unless absolutely necessary. Please try to handle this automatically.

kelson42 avatar Jun 29 '25 10:06 kelson42