Seq2Seq-Vis
"Prepare and Run Own Models" gives errors in extract_context.py: h5py broadcasting
Hi,
Great work on the repository and the visualizations. This is very useful. I had to create version-specific PyTorch models for this (using the custom install procedure) and ran into issues while preparing the data. Running extract_context.py as shown below gives an h5py broadcasting error. It seems to be a known issue with the h5py library.
Command Used:
python extract_context.py -src data/src-train.txt -tgt data/tgt-train.txt -model demo-final_acc_5.53_ppl_4304.81_e1.pt -batch_size 10
Fix Used: Changing the embedding dimensions to match the PyTorch model's src and tgt embedding sizes worked fine.
Change on Line 169: size from 100 to 500
cstarset = f.create_dataset("cstar", (opt.batch_size,max_tgt_len,500), ...
Change on Line 178: size from 100 to 500
encoderset = f.create_dataset("encoder_out",(opt.batch_size,max_tgt_len,500), ...
Change on Line 185: size from 100 to 500
decoderset = f.create_dataset("decoder_out",(opt.batch_size,max_tgt_len,500), ...
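For context, the error comes from writing 500-wide model outputs into datasets created with a trailing dimension of 100. The shapes below are illustrative (batch_size 10 from the command above, an assumed max_tgt_len of 50); a plain NumPy assignment reproduces the analogous mismatch that h5py hits on dataset writes:

```python
import numpy as np

# Dataset shape as hard-coded in extract_context.py: trailing dim 100.
dataset_buf = np.empty((10, 50, 100))

# Output of a model trained with embedding/RNN size 500.
decoder_out = np.zeros((10, 50, 500))

try:
    # Same kind of broadcast failure h5py raises when writing the dataset.
    dataset_buf[:] = decoder_out
except ValueError as e:
    print("broadcast error:", e)
```

Matching the trailing dimension to the model's size (500 here) makes the assignment, and the h5py write, succeed.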
Hoping the new version release will have more support for newer OpenNMT-py releases/models.
Thanks !
Mohammed Ayub
Hi Mohammed. Thank you for the fix suggestions.
no problem :) @HendrikStrobelt
A better fix would be to expose the embedding size as a variable (I think), so it would work irrespective of how you built your model. (It just happened that I trained the model with embedding size 500, which is the default in the training documentation.)
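That suggestion could look something like the sketch below. The `-emb_size` flag name is hypothetical (not an actual extract_context.py option): the size is parsed once, defaulting to OpenNMT-py's documented 500, and used in the three create_dataset calls instead of a hard-coded constant:

```python
import argparse

# Hypothetical option parsing; mirrors the existing extract_context.py flags
# and adds an assumed -emb_size option instead of hard-coding the dimension.
parser = argparse.ArgumentParser()
parser.add_argument('-src')
parser.add_argument('-tgt')
parser.add_argument('-model')
parser.add_argument('-batch_size', type=int, default=10)
parser.add_argument('-emb_size', type=int, default=500,
                    help='embedding/RNN size of the trained model')
opt = parser.parse_args(['-model', 'demo.pt', '-batch_size', '10'])

# The three dataset creations would then use opt.emb_size, e.g.:
# cstarset = f.create_dataset("cstar",
#                             (opt.batch_size, max_tgt_len, opt.emb_size), ...)
print(opt.emb_size)
```

Reading the size directly from the loaded checkpoint would be even more robust, but a flag with a sensible default is the smallest change.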