Error when unpickling TextFile with text using encoding: "maximum recursion depth exceeded"

Open ArneNx opened this issue 7 years ago • 0 comments

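from blocks.serialization import dump, load
from fuel.datasets import TextFile

dictionary = {'<UNK>': 0, '</S>': 1, 'this': 2, 'a': 3, 'one': 4}
# The third positional argument (None) is bos_token.
dataset = TextFile(['example_data.gz'], dictionary, None, level='word',
                   encoding='utf8', preprocess=None).open()

# Open the dump file in binary mode, since the serialized data is not text.
with open('dumpfile', 'wb') as f:
    dump(dataset, f)

with open('dumpfile', 'rb') as f:
    y = load(f)  # fails with "maximum recursion depth exceeded"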

In the example above, load(f) raises an error because it tries to unpickle a codecs.StreamReader. At least in Python 2.7, this is a known issue that leads to infinite recursion.

If you try the same thing without an encoding or with an uncompressed file, it works without problems, since codecs.StreamReader is not used in those cases.

Also, this was not an issue in fuel version 0.1.1, since the reading was done differently there. What was the motivation for switching to codecs.StreamReader? Can this be done without it? I would really appreciate a solution that lets me pickle the TextFile object without dropping the encoding or switching back to an older fuel version.
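
One possible direction, sketched here only as an illustration and not as fuel's actual implementation: keep the codecs.StreamReader out of the pickled state and recreate it on unpickling. The class name ReopenableReader and its attributes are hypothetical. The sketch assumes it is acceptable to reopen the file from the beginning, so the read position is not preserved.

```python
import codecs
import gzip
import pickle


class ReopenableReader(object):
    """Hypothetical wrapper that pickles the path and encoding, not the stream."""

    def __init__(self, path, encoding):
        self.path = path
        self.encoding = encoding
        self._open()

    def _open(self):
        # gzip.open(..., 'rb') yields bytes; codecs.getreader wraps the file
        # in a StreamReader that decodes them with the given encoding.
        raw = gzip.open(self.path, 'rb')
        self._stream = codecs.getreader(self.encoding)(raw)

    def __getstate__(self):
        # Drop the StreamReader so pickle never tries to serialize it.
        state = self.__dict__.copy()
        del state['_stream']
        return state

    def __setstate__(self, state):
        # Restore the plain attributes, then reopen the file from the start.
        self.__dict__.update(state)
        self._open()

    def readline(self):
        return self._stream.readline()
```

With this approach, pickle.loads(pickle.dumps(reader)) returns a reader positioned at the start of the file. If the position matters, it would have to be stored in the state and restored separately.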

Thanks in advance!

ArneNx, Mar 08 '17 16:03