James Addison
Thanks again @davidism! And my apologies for yet again opening this from an organization account :|
There are a few differences between the contents of the two Google taxonomy files -- and not only differences in category description/translation. For example: the `en-GB` file contains category ID...
Thanks @teolemon - we have a few more errors too I think. The language code for Danish (`da`) is not the same as the country code for Denmark (`dk`) but...
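A minimal sketch of how keys like this could be checked, assuming the dictionary is keyed by strings such as `da` or `en-GB`. The `COUNTRY_TO_LANGUAGE` table and the `suspicious_language_keys` helper are hypothetical names for illustration and are not part of the project's script:

```python
# Hypothetical lookup: country codes that are easily mistaken for language
# codes, mapped to the ISO 639-1 language code that was probably intended.
COUNTRY_TO_LANGUAGE = {
    "dk": "da",  # Denmark -> Danish
    "jp": "ja",  # Japan -> Japanese
    "cz": "cs",  # Czechia -> Czech
    "gr": "el",  # Greece -> Greek
}


def suspicious_language_keys(keys):
    """Return {key: suggested_key} for keys whose language part looks like a country code."""
    suggestions = {}
    for key in keys:
        # Split "dk-DK" into language "dk", separator "-", region "DK".
        language, sep, region = key.partition("-")
        fixed = COUNTRY_TO_LANGUAGE.get(language.lower())
        if fixed is not None:
            suggestions[key] = fixed + sep + region
    return suggestions


if __name__ == "__main__":
    print(suspicious_language_keys(["dk-DK", "en-GB", "da"]))
```

The table is deliberately small; a fuller check would validate against the complete ISO 639-1 list rather than a handful of known mix-ups.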
I should have some time tomorrow to review the keys for this dictionary in more detail.
@teolemon OK; will the script be used again in future? If not, I'll close this.
Ok, thanks. I think it's worth correcting the entries in that case - I'll ideally also re-run the script after the fixups to find out whether that produced an effect/diff...
I've applied what I believe are corrections to the dictionary keys (including the addition of an entry for Canadian English -- yes, it is true that they do speak it...
There seems to be some amount of nondeterminism in the mappings; running the `python3 00_run_all.py` script repeatedly produces some differing modifications/mappings with each run.
Currently it's difficult to visually inspect some of the differences, partly because the `diff` output is across JSON files that do not have a sorted key order. I'll look into...
After adding sorting (08066ad9ec51b930cd5cecf3a6f8a1fd9f05b147) I have been able to re-run the Google Product taxonomy import process a few times and achieve a stable (i.e. deterministic / no changes to the...
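For illustration, one way to get diff-friendly JSON output is to serialise with sorted keys. This is a sketch using the standard library only, and is not necessarily what the referenced commit does; `canonical_json` is a hypothetical helper name:

```python
import json


def canonical_json(data):
    """Serialise data with sorted keys so repeated runs produce identical text."""
    # sort_keys=True orders the keys of every nested dict; list order is left as-is.
    return json.dumps(data, sort_keys=True, indent=2, ensure_ascii=False) + "\n"


if __name__ == "__main__":
    first = {"b": 1, "a": {"d": 2, "c": 3}}
    second = {"a": {"c": 3, "d": 2}, "b": 1}
    # Same content, different insertion order: the output text is identical.
    print(canonical_json(first) == canonical_json(second))
```

Note that sorting keys only removes ordering noise from dictionaries; if any lists are built from unordered collections, they would need sorting separately.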