
Select namespaces by name as well as number for dumpgenerator.py

Open jrbray1 opened this issue 5 years ago • 2 comments

I was hoping to do a backup of a wiki with content, Category and Template namespaces only, to reduce size, and select the namespaces by keyword, something like

dumpgenerator.py --api=https://hornblower.fandom.com/api.php --xml --curonly --namespaces 0,Template,Category

But it expects numbers, not names. I could hack something together by parsing the results of https://hornblower.fandom.com/api.php?action=query&meta=siteinfo&siprop=namespaces&formatversion=2, but it would seem easier if dumpgenerator did this for you. Have you considered doing this?
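For reference, the workaround described above could look roughly like the sketch below. It assumes the formatversion=2 reply keeps namespaces under query.namespaces, with each entry carrying "id", "name" and (except for the main namespace) "canonical". The function name namespace_ids_by_name and the trimmed SAMPLE reply are made up for illustration and are not part of dumpgenerator.py.

```python
import json

# Trimmed sample shaped like a formatversion=2 siteinfo reply (assumed layout,
# not captured from the wiki in question).
SAMPLE = json.loads("""
{"query": {"namespaces": {
  "0":  {"id": 0,  "name": ""},
  "10": {"id": 10, "name": "Template", "canonical": "Template"},
  "14": {"id": 14, "name": "Category", "canonical": "Category"}
}}}
""")


def namespace_ids_by_name(siteinfo):
    """Map each lower-cased local and canonical namespace name to its id."""
    lookup = {}
    for ns in siteinfo["query"]["namespaces"].values():
        for key in ("name", "canonical"):
            label = ns.get(key)
            if label:  # the main namespace has an empty name, so skip it
                lookup[label.lower()] = ns["id"]
    return lookup


lookup = namespace_ids_by_name(SAMPLE)
print(lookup["template"], lookup["category"])
```

In real use, SAMPLE would be replaced by the JSON fetched from the siteinfo URL quoted above.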

jrbray1 avatar Aug 09 '20 11:08 jrbray1

jrbray1, 09/08/20 14:00:

it would seem easier if dumpgenerator did this for you. Have you considered doing this?

It might seem easier, but there are infinitely many possible namespace names, plus a dozen core ones, each potentially translated into 400 languages. The name of a namespace makes sense only after we've contacted the API, or (even worse) screen-scraped index.php output. The results will be unavoidably unpredictable, which is going to confuse people even more unless they're already well-versed in MediaWiki internals.

In other words, it's not clear to me what kind of user would be served by such a feature. We'll consider it if someone sends a patch, though!

nemobis avatar Aug 09 '20 11:08 nemobis

Not sure why the variety of namespace names is a problem, as https://www.mediawiki.org/wiki/Help:Namespaces talks about canonical namespaces in English and their foreign mappings. You could just support those canonical names and allow requests for 'Template' and 'Category'. This seems more robust than the user providing 10,14 and expecting that mapping to be in place, but it would be just as easy with API parsing to allow a Frenchman to request --namespaces 0,Utilisateur and not have to burrow into the API output to check what number that was.

MediaWiki documentation is all about names, not numbers.
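A minimal sketch of what accepting both canonical and local names might look like, assuming the namespace list has already been obtained from the wiki's siteinfo API. The helper resolve_namespaces and the FRENCH_WIKI sample data are hypothetical; nothing like this exists in dumpgenerator.py as discussed here.

```python
# Hypothetical namespace entries for a French-language wiki: "name" is the
# local name, "canonical" the English one (assumed siteinfo field names).
FRENCH_WIKI = [
    {"id": 0, "name": ""},
    {"id": 2, "name": "Utilisateur", "canonical": "User"},
    {"id": 10, "name": "Modèle", "canonical": "Template"},
    {"id": 14, "name": "Catégorie", "canonical": "Category"},
]


def resolve_namespaces(spec, namespaces):
    """Turn a spec such as '0,Utilisateur,Template' into numeric ids."""
    by_name = {}
    for ns in namespaces:
        for label in (ns.get("name"), ns.get("canonical")):
            if label:
                by_name[label.casefold()] = ns["id"]

    resolved = []
    for token in (part.strip() for part in spec.split(",")):
        if token.lstrip("-").isdigit():
            # Numeric ids pass through unchanged, as they do today.
            resolved.append(int(token))
        elif token.casefold() in by_name:
            resolved.append(by_name[token.casefold()])
        else:
            raise ValueError(f"unknown namespace: {token!r}")
    return resolved


print(resolve_namespaces("0,Utilisateur,Template", FRENCH_WIKI))  # [0, 2, 10]
```

Matching is case-insensitive, and an unknown name raises an error rather than being silently skipped, which addresses some of the unpredictability concern raised earlier in the thread.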

jrbray1 avatar Aug 09 '20 11:08 jrbray1