Tom Morris

686 comments by Tom Morris

I don't think anything super fancy or dynamic is needed. I would look into changing the logic of `split_dump()` to write files for editions, works, authors, and then everything else...
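A minimal sketch of the kind of routing described above, not the actual `split_dump()` code. It assumes each dump line is tab-separated with the record type (for example `/type/edition`) in the first column. The names `TYPE_TO_BUCKET`, `bucket_for`, and `split_records` are hypothetical and exist only for illustration.

```python
import io
from collections import defaultdict

# Hypothetical mapping from record type to output bucket.
# Anything not listed here falls through to "other".
TYPE_TO_BUCKET = {
    "/type/edition": "editions",
    "/type/work": "works",
    "/type/author": "authors",
}


def bucket_for(record_type):
    """Return the output bucket name for a record type."""
    return TYPE_TO_BUCKET.get(record_type, "other")


def split_records(lines, writers):
    """Write each line to the writer for its bucket; return per-bucket counts.

    Assumes the record type is the first tab-separated column of each line.
    `writers` maps bucket names to file-like objects with a write() method.
    """
    counts = defaultdict(int)
    for line in lines:
        record_type = line.split("\t", 1)[0]
        bucket = bucket_for(record_type)
        writers[bucket].write(line)
        counts[bucket] += 1
    return dict(counts)


if __name__ == "__main__":
    sample = [
        "/type/edition\t/books/OL1M\t1\t2020-01-01\t{}\n",
        "/type/work\t/works/OL1W\t1\t2020-01-01\t{}\n",
        "/type/redirect\t/books/OL2M\t1\t2020-01-01\t{}\n",
    ]
    # In-memory writers stand in for the real output files.
    outputs = {name: io.StringIO() for name in ("editions", "works", "authors", "other")}
    print(split_records(sample, outputs))
```

A fixed lookup table with a catch-all bucket keeps the split static, in line with the point that nothing fancy or dynamic is needed.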

@jimchamp as I mentioned above:
> Things like user pages and admin pages are already filtered when the initial dump is written, so you don't need to worry about them....

After downloading and looking at the original data set, it turns out that the character set decoding is being done wrong on the input side. The output looks like it is...

And to follow up on my last comment, this only affects the standalone program, not the Hadoop processing. Rather than allowing JSoup to do the character set determination, I decided to...

I suspect this might be a GitHub issue - https://github.com/orgs/community/discussions/86715

The description is "Wikidata batch editor, Wikimedia Commons mass upload tool" which seems odd and, as the description indicates, they've bundled the Wikimedia Commons Extension. As for the general question,...

Repology seems like a useful site to me. It's a little harsh that they give bright red "fail" badges to versions which are only a few hours out of date, but...

You didn't scare me off. :-) I just needed some quiet time to review the agreement. Plus, as I suggested above, I've gone off on a different tack and implemented...

Clicking that link currently shows a single correct match candidate.
