pyglossary
Wiktextract: converting Japanese .jsonl to .index fails (Japanese section of the English Wiktionary)
OS: EndeavourOS, fully updated (Arch Linux with the Calamares installer)
Python setup: Micromamba with Python 3.10
Shell: fish
pyglossary was installed with pip. I took the Wiktionary .jsonl files from kaikki.org. The conversion worked for the Spanish section of the English Wiktionary, but with the Japanese section I get an error:
laptop02@laptop02-pc ~/Downloads> pyglossary kaikki.org-dictionary-Japanese.jsonl kaikki.org-dictionary-Japanese.index (py3)
[INFO] Writing to DictOrg file '/home/laptop02/Downloads/kaikki.org-dictionary-Japanese.index'
[ERROR] Exception while calling plugin's write function
Traceback (most recent call last):
File "/home/laptop02/fish/envs/py3.10/lib/python3.10/site-packages/pyglossary/glossary_v2.py", line 908, in _write
self._writeEntries(writerList, filename)
File "/home/laptop02/fish/envs/py3.10/lib/python3.10/site-packages/pyglossary/glossary_v2.py", line 842, in _writeEntries
for entry in self:
File "/home/laptop02/fish/envs/py3.10/lib/python3.10/site-packages/pyglossary/glossary_v2.py", line 393, in _readersEntryGen
yield from self._applyEntryFiltersGen(reader)
File "/home/laptop02/fish/envs/py3.10/lib/python3.10/site-packages/pyglossary/glossary_v2.py", line 407, in _applyEntryFiltersGen
for entry in gen:
File "/home/laptop02/fish/envs/py3.10/lib/python3.10/site-packages/pyglossary/plugins/wiktextract.py", line 156, in __iter__
yield self.makeEntry(json_loads(line))
File "/home/laptop02/fish/envs/py3.10/lib/python3.10/site-packages/pyglossary/plugins/wiktextract.py", line 208, in makeEntry
self.writeSenseList(_hf, data.get("senses")) # type: ignore
File "/home/laptop02/fish/envs/py3.10/lib/python3.10/site-packages/pyglossary/plugins/wiktextract.py", line 313, in writeSenseList
self.makeList(
File "/home/laptop02/fish/envs/py3.10/lib/python3.10/site-packages/pyglossary/plugins/wiktextract.py", line 653, in makeList
processor(hf, el)
File "/home/laptop02/fish/envs/py3.10/lib/python3.10/site-packages/pyglossary/plugins/wiktextract.py", line 612, in writeSense
self.writeSenseExamples(hf, sense.get("examples"))
File "/home/laptop02/fish/envs/py3.10/lib/python3.10/site-packages/pyglossary/plugins/wiktextract.py", line 392, in writeSenseExamples
self.writeSenseExample(hf, example)
File "/home/laptop02/fish/envs/py3.10/lib/python3.10/site-packages/pyglossary/plugins/wiktextract.py", line 369, in writeSenseExample
hf.write(text)
File "src/lxml/serializer.pxi", line 1660, in lxml.etree._IncrementalFileWriter.write
TypeError: got invalid input value of type <class 'list'>, expected string or Element
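The traceback ends in writeSenseExample, where a list is passed to hf.write() although lxml expects a string or an Element. To narrow down which entries trigger this, the following is a minimal diagnostic sketch, not part of pyglossary. It assumes each line of the .jsonl file is a JSON object with a "senses" list whose items may carry an "examples" list of dicts (as the calls in the traceback suggest), and that the headword is stored under "word". The function name find_list_valued_example_fields is hypothetical.

```python
import json


def find_list_valued_example_fields(lines):
    """Yield (line_number, word, key) for each example field whose value is a list.

    Assumed layout: data["senses"][i]["examples"][j] is a dict.
    """
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        data = json.loads(line)
        for sense in data.get("senses") or []:
            for example in sense.get("examples") or []:
                if not isinstance(example, dict):
                    continue  # unexpected shape, ignore in this sketch
                for key, value in example.items():
                    if isinstance(value, list):
                        # "word" as the headword key is an assumption
                        yield line_number, data.get("word"), key
```

Opening kaikki.org-dictionary-Japanese.jsonl with encoding="utf-8" and passing the file object to this function should list the line numbers and field names that hold lists. Whether any of those fields is the one that reaches hf.write() depends on the plugin version, so treat the output as a starting point rather than a confirmed cause.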