TITC
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe7 in position 2270: invalid continuation byte
> with io.open(file_path_dest, "r", encoding='ascii') it still does not work in a Python 3.7 env

Before the adjustment:

```python
with open(temp_file, 'w') as fout:
    prepre = open(output_file, 'r').read().replace('\r', ' ')  # delete \r
    # replace split, align...
```
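For reference, the `UnicodeDecodeError` above can be reproduced and worked around with stdlib-only code. This is a minimal sketch, not the repo's actual fix; the file path and the fallback `errors='replace'` strategy are illustrative assumptions:

```python
import io
import os
import tempfile

# Write a file containing the byte 0xe7, which is invalid as part of a
# UTF-8 sequence here (the same byte reported in the error above).
path = os.path.join(tempfile.gettempdir(), "demo_0xe7.txt")
with open(path, "wb") as f:
    f.write(b"abc\xe7def\r\n")

# Strict utf-8 decoding fails:
try:
    with io.open(path, "r", encoding="utf-8") as f:
        f.read()
except UnicodeDecodeError as e:
    print("utf-8 failed:", e.reason)

# Stdlib workarounds: substitute undecodable bytes, or decode as latin-1
# (latin-1 maps every possible byte to a code point, so it never fails).
with io.open(path, "r", encoding="utf-8", errors="replace") as f:
    text = f.read().replace("\r", " ")  # also delete \r, as in the snippet above
print(text)
```

Which workaround is right depends on what the non-UTF-8 bytes actually are; `errors='replace'` loses them, while `latin-1` keeps them but may mis-render the characters.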
Dear author, I have found some links that further confirm the issue. Finally, I found a way to alleviate the `cuda runtime error (59)` issue through the code below, added in `sow/train.py`...
OK, as I see it, [here is the problem](https://github.com/tagoyal/sow-reap-paraphrasing/blob/b8476d9cbdf64ba8f29b945631791813c4f05c27/sow/train.py#L93):

```
model_config['postag_size'] = len(pos)
```

The above should be changed to

```
model_config['postag_size'] = len(pos)+1
```

---
Reference: https://blog.csdn.net/Geek_of_CSDN/article/details/86527107
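The failure mode behind that `+1` can be illustrated without CUDA: runtime error 59 is a device-side assert, commonly triggered when an index fed to an embedding lookup is `>=` the table size. A stdlib-only analogy (the lookup table here is a plain list, not the repo's model; the extra index equal to `len(pos)` is an assumption about where the out-of-range id comes from):

```python
# Suppose 71 POS tags exist, but the data also uses one extra index equal
# to len(pos), e.g. an id introduced after the vocab was sized.
pos = {f"TAG{i}": i for i in range(71)}          # ids 0..70
indices = list(pos.values()) + [len(pos)]        # includes out-of-range id 71

table = [[0.0] * 4 for _ in range(len(pos))]     # postag_size = len(pos)

try:
    rows = [table[i] for i in indices]           # index 71 into 71 rows
except IndexError:
    print("lookup failed (on GPU this surfaces as cuda runtime error 59)")

table = [[0.0] * 4 for _ in range(len(pos) + 1)] # postag_size = len(pos)+1
rows = [table[i] for i in indices]               # now every index fits
print(len(rows))
```

The `+1` makes the symptom go away, but it only masks the real question of why an index equal to the vocab size appears at all.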
There are still some things that don't make sense. The POS class count is 71, so I can work around the problem by adding 1 to `model_config['postag_size']`, but the dev set made by...
The reason the POS count comes out as 71 is [here](https://github.com/tagoyal/sow-reap-paraphrasing/blob/b8476d9cbdf64ba8f29b945631791813c4f05c27/processing/convert_hdf5_sow.py#L70-L73):

```
for p in pos1 + pos2:
    if p not in pos_vocab.keys():
        pos_vocab[p] = len(pos_vocab)
        rev_pos_vocab[pos_vocab[p]] = p
```

add new...
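A self-contained run of that loop shows the id scheme it produces: each unseen tag gets the next id, so ids run `0 .. len(pos_vocab)-1`, and the final size depends entirely on which tags the loop happens to see. The tag lists below are made up for illustration:

```python
pos_vocab, rev_pos_vocab = {}, {}

pos1 = ["NN", "VB", "DT"]   # hypothetical tags from sentence 1
pos2 = ["NN", "JJ"]         # hypothetical tags from sentence 2

# Same logic as the lines linked above.
for p in pos1 + pos2:
    if p not in pos_vocab.keys():
        pos_vocab[p] = len(pos_vocab)
        rev_pos_vocab[pos_vocab[p]] = p

print(pos_vocab)   # e.g. {'NN': 0, 'VB': 1, 'DT': 2, 'JJ': 3}
print(max(pos_vocab.values()) == len(pos_vocab) - 1)
```

This is why the count differs between data builds: a tag that occurs in one split but not another silently changes the vocab size, and any id assigned outside this loop can exceed it.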
- case 1: the vocabulary and dev dataset come from your shared Google Drive, but the training dataset was created from your provided [sample](https://github.com/tagoyal/sow-reap-paraphrasing/blob/b8476d9cbdf64ba8f29b945631791813c4f05c27/sample_test_baseline.txt) through your script. This error occurs.
- case 2: vocabulary...
> Hi, > This is an indexing error. > Are you using your own data or is this running on the data in the google drive? > Is this on...
I think that's good enough. Maybe writing a [class](https://github.com/pytorch/pytorch/issues/15849#issuecomment-518126031) to wrap `torch.utils.data.DataLoader` and then calling `dataset._shuffle()` inside it is also an option, but it adds a lot of redundant code; I'm not entirely sure. But...
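A minimal sketch of that wrapper idea, assuming the dataset exposes a `_shuffle()` method as discussed in the linked issue. Torch-free stand-ins are used so the sketch stays self-contained; with the real `torch.utils.data.DataLoader` the shape is the same, calling `dataset._shuffle()` before each epoch's iteration:

```python
import random

class ToyDataset:
    """Stand-in for the real dataset (which would hold the hdf5 batches)."""
    def __init__(self, items):
        self.items = list(items)

    def _shuffle(self):                 # the method discussed in the thread
        random.shuffle(self.items)

    def __iter__(self):
        return iter(self.items)

class ShufflingLoader:
    """Wrapper that reshuffles the dataset at the start of every epoch,
    mimicking a thin wrapper around torch.utils.data.DataLoader."""
    def __init__(self, dataset):
        self.dataset = dataset

    def __iter__(self):
        self.dataset._shuffle()         # reshuffle before each pass
        return iter(self.dataset)

loader = ShufflingLoader(ToyDataset(range(5)))
for epoch in range(2):                  # each epoch sees a fresh order
    print(list(loader))
```

The appeal of the wrapper is that the training loop stays unchanged; the cost, as noted, is extra indirection compared to just calling `_shuffle()` directly between epochs.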
I can't wait to use the new data loader. Do we merge it in directly, or write a custom `DataLoader` class? I could do that as programming practice.