fuel
Error when unpickling TextFile with text using encoding: "maximum recursion depth exceeded"
from blocks.serialization import dump, load
from fuel.datasets.text import TextFile

dictionary = {'<UNK>': 0, '</S>': 1, 'this': 2, 'a': 3, 'one': 4}
dataset = TextFile(['example_data.gz'], dictionary, None, level='word',
                   encoding='utf8', preprocess=None).open()
# Pickle data is binary, so open the dump file in binary mode.
with open('dumpfile', 'wb') as f:
    dump(dataset, f)
with open('dumpfile', 'rb') as f:
    y = load(f)  # fails with "maximum recursion depth exceeded"
In the example above, load(f) raises the error because it tries to unpickle a codecs.StreamReader. At least for Python 2.7, it is a known issue that this leads to infinite recursion.
If you try the same thing without an encoding, or with an unzipped file, it works without problems, since codecs.StreamReader is not used in those cases.
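To illustrate what I think is going on, here is a minimal sketch that does not use fuel or codecs at all. It assumes the recursion comes from a `__getattr__` that forwards unknown attributes to `self.stream`, which is how I understand codecs.StreamReader to behave. `DelegatingReader` and `roundtrip_fails` are hypothetical names made up for this example.

```python
import pickle


class DelegatingReader(object):
    """Hypothetical stand-in for a reader that forwards unknown attributes."""

    def __init__(self, stream):
        self.stream = stream

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails. During unpickling
        # the instance exists before its __dict__ is restored, so looking up
        # self.stream ends up here again and recurses.
        return getattr(self.stream, name)


def roundtrip_fails():
    """Return True if unpickling hits the recursion limit."""
    payload = pickle.dumps(DelegatingReader(['this', 'a', 'one']))
    try:
        pickle.loads(payload)
    except RuntimeError:  # RecursionError subclasses RuntimeError on Python 3
        return True
    return False
```

Pickling succeeds because `stream` is set at that point; only loading fails, which matches what I see with TextFile.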
Also, in fuel version 0.1.1 this wasn't an issue, since the reading was done differently.
What was the motivation for switching to codecs.StreamReader? Can this be done without it?
I would really appreciate it if there were a solution that would allow me to pickle the TextFile object without dropping the encoding or switching back to an older fuel version.
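One direction I could imagine is to keep the stream out of the pickle and reopen the file after loading. The sketch below is not fuel's API. `ReopenableTextReader` is a hypothetical helper that pickles only the path, the encoding and the number of lines already read, and I have not checked how it would fit into TextFile.

```python
import codecs
import gzip
import pickle


class ReopenableTextReader(object):
    """Hypothetical helper: pickles path, encoding and line offset only."""

    def __init__(self, path, encoding='utf8'):
        self.path = path
        self.encoding = encoding
        self.lines_read = 0
        self._reader = None

    def _open(self):
        # Same kind of reader that causes the trouble, but it is never pickled.
        reader = codecs.getreader(self.encoding)(gzip.open(self.path, 'rb'))
        for _ in range(self.lines_read):  # skip lines consumed before pickling
            reader.readline()
        return reader

    def readline(self):
        if self._reader is None:
            self._reader = self._open()
        line = self._reader.readline()
        if line:
            self.lines_read += 1
        return line

    def __getstate__(self):
        state = self.__dict__.copy()
        state['_reader'] = None  # drop the unpicklable stream
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
```

With this, `pickle.loads(pickle.dumps(reader))` gives back a reader that continues at the next unread line. The cost is that the file is re-read from the start after loading, which may be slow for large files.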
Thanks in advance!