THUMT
Do you have an instruction manual for the PyTorch version?
Some commands with additional parameters do not work in the PyTorch version, so do you have a PyTorch-oriented manual?
For example:
--parameters=batch_size=15000,device_list=[0,1],update_cycle=2,train_steps=2000000,keep_checkpoint_max=5,shared_embedding_and_softmax_weights=True,shared_source_target_embedding=True
raise ValueError("Could not parse hparam %s in %s" % (name, values))
ValueError: Could not parse hparam shared_embedding_and_softmax_weights in batch_size=15000,device_list=[0,1],update_cycle=2,train_steps=2000000,keep_checkpoint_max=5,shared_embedding_and_softmax_weights=True,shared_source_target_embedding=True
Also, the initial loss is inf, and it only returns to normal after around 200 steps.
In the above example, you should set shared_embedding_and_softmax_weights=true instead of shared_embedding_and_softmax_weights=True (the boolean value must be lowercase). The documentation for the PyTorch implementation will be uploaded soon. We have tested our implementation on several datasets, but we have not observed the inf loss problem.
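For reference, here is the parameter string from the example above with the boolean values lowercased (assuming the remaining parameters are accepted as-is by the PyTorch trainer):

--parameters=batch_size=15000,device_list=[0,1],update_cycle=2,train_steps=2000000,keep_checkpoint_max=5,shared_embedding_and_softmax_weights=true,shared_source_target_embedding=true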
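To illustrate why the capitalized value fails, below is a minimal sketch of a string-based hyperparameter parser that only accepts lowercase boolean literals. This is an assumption for illustration, not THUMT's actual parsing code.

import re

# Minimal sketch (not THUMT's actual code): a boolean hparam value only
# parses when it is the lowercase literal "true" or "false".
_BOOL_RE = re.compile(r"^(true|false)$")

def parse_bool_hparam(name, value):
    match = _BOOL_RE.match(value)
    if match is None:
        # Mirrors the error reported above.
        raise ValueError("Could not parse hparam %s in %s" % (name, value))
    return match.group(1) == "true"

parse_bool_hparam("shared_source_target_embedding", "true")    # returns True
# parse_bool_hparam("shared_source_target_embedding", "True")  # raises ValueError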