subword-nmt

learn_bpe.py error

Open: unwritten opened this issue 3 years ago • 1 comment

I am running learn_bpe.py on a very large file (about 150M lines, 60 GB on disk) with --num-workers 10. The line 'vocab += pickle.load(f)' in learn_bpe.py then fails with: EOFError: Ran out of input.

Tested on Windows 10. I assume the 'tmp = tempfile.NamedTemporaryFile' call introduces this? Has anyone had a similar experience?
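For reference, 'EOFError: Ran out of input' is what pickle.load raises when the file it reads contains zero bytes. The sketch below reproduces that in isolation; it is not taken from learn_bpe.py, and the helper name load_or_none is hypothetical. Whether an empty temporary file is the actual cause here is only a guess.

```python
import os
import pickle
import tempfile


def load_or_none(path):
    """Hypothetical helper: return the unpickled object, or None if the file is empty."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except EOFError:
        # pickle.load raises "EOFError: Ran out of input" on a zero-byte file
        return None


# delete=False and an explicit close() so the file can be reopened by name;
# on Windows a NamedTemporaryFile generally cannot be opened a second time
# while the original handle is still open.
tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.close()

# Nothing has been written yet, so loading hits the EOFError path.
empty_result = load_or_none(tmp.name)

# After a complete pickle.dump, the same load succeeds.
with open(tmp.name, "wb") as f:
    pickle.dump({"a": 1}, f)
full_result = load_or_none(tmp.name)

os.remove(tmp.name)
```

Checking os.path.getsize on each worker's temporary file before loading would show whether any of them is empty when the error occurs.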

thx

unwritten, May 12 '21 10:05

Thanks for reporting this issue.

@yimmon, this is related to the parallel support you contributed; could you have a look?

rsennrich, May 16 '21 14:05