transfer-learning-conv-ai
Train without Persona
Hi, has anyone managed to train without a persona, taking only the context information into account?
Just change the data so that personality is "" and comment out:
# persona = [persona[-1]] + persona[:-1] # permuted personalities
- You mean the "personality" key should look like this: "personality": [""]?
- What should the value of personality_permutations be?
@wise-east @sshleifer
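For readers who want to try the suggestion above, here is a minimal sketch of blanking the persona in the data before training. It assumes the dataset is a JSON object whose splits (for example "train" and "valid") are lists of dialogs, each with a "personality" list and an "utterances" list. The helper name strip_personas is hypothetical, and using [""] as the blank value follows the question above rather than a confirmed answer in this thread.

```python
import copy


def strip_personas(dataset, blank_value=("",)):
    """Return a copy of the dataset with every dialog's persona blanked.

    Hypothetical helper: assumes each split is a list of dialogs and each
    dialog is a dict with a "personality" key.
    """
    stripped = copy.deepcopy(dataset)  # leave the caller's data untouched
    for split in stripped.values():
        for dialog in split:
            dialog["personality"] = list(blank_value)
    return stripped


# Tiny made-up example in the assumed layout.
example = {
    "train": [
        {
            "personality": ["i like cats ."],
            "utterances": [
                {"history": ["hi"], "candidates": ["no", "hello"]},
            ],
        }
    ],
    "valid": [],
}

stripped = strip_personas(example)
```

The result can be written back out with json.dump and passed to train.py, assuming the script accepts a local dataset path. If a tokenized dataset cache from an earlier run exists, it probably needs to be deleted so the modified data is actually used.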
Any luck training this model without a persona? I tried what @wise-east suggested, but I am getting the following error.
ERROR:ignite.engine.engine.Engine:Current run is terminating due to exception: can't multiply sequence by non-int of type 'float'
ERROR:ignite.engine.engine.Engine:Engine run is terminating due to exception: can't multiply sequence by non-int of type 'float'
Traceback (most recent call last):
File "train.py", line 267, in <module>
train()
File "train.py", line 259, in train
trainer.run(train_loader, max_epochs=args.n_epochs)
File "/Users/ankurrastogi/opt/anaconda3/lib/python3.7/site-packages/ignite/engine/engine.py", line 702, in run
return self._internal_run()
File "/Users/ankurrastogi/opt/anaconda3/lib/python3.7/site-packages/ignite/engine/engine.py", line 775, in _internal_run
self._handle_exception(e)
File "/Users/ankurrastogi/opt/anaconda3/lib/python3.7/site-packages/ignite/engine/engine.py", line 469, in _handle_exception
raise e
File "/Users/ankurrastogi/opt/anaconda3/lib/python3.7/site-packages/ignite/engine/engine.py", line 745, in _internal_run
time_taken = self._run_once_on_dataset()
File "/Users/ankurrastogi/opt/anaconda3/lib/python3.7/site-packages/ignite/engine/engine.py", line 850, in _run_once_on_dataset
self._handle_exception(e)
File "/Users/ankurrastogi/opt/anaconda3/lib/python3.7/site-packages/ignite/engine/engine.py", line 469, in _handle_exception
raise e
File "/Users/ankurrastogi/opt/anaconda3/lib/python3.7/site-packages/ignite/engine/engine.py", line 833, in _run_once_on_dataset
self.state.output = self._process_function(self, self.state.batch)
File "train.py", line 182, in update
loss = (lm_loss * args.lm_coef + mc_loss * args.mc_coef) / args.gradient_accumulation_steps
TypeError: can't multiply sequence by non-int of type 'float'
For anyone who runs into this issue in the future: I resolved it by adding .values() to the following call in train.py:
(lm_loss), (mc_loss), *_ = model(
    input_ids, token_type_ids=token_type_ids, mc_token_ids=mc_token_ids,
    mc_labels=mc_labels, labels=labels
).values()
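A likely explanation for why this works, assuming the installed transformers version returns a dict-like output object instead of a plain tuple: unpacking a dict-like object iterates over its keys, so lm_loss ends up holding a string such as "loss", and multiplying a string by a float raises exactly the TypeError in the traceback. The sketch below reproduces the effect with a plain OrderedDict. FakeOutput and fake_model are hypothetical stand-ins and not part of any library.

```python
from collections import OrderedDict


class FakeOutput(OrderedDict):
    """Hypothetical stand-in for a dict-like model output."""


def fake_model():
    # Made-up values; the key order mirrors the assumed output layout.
    return FakeOutput(loss=2.0, mc_loss=0.5, logits=[0.1, 0.9])


lm_coef, mc_coef = 1.0, 1.0

# Unpacking the output directly yields its keys, not its values.
lm_key, mc_key, *_ = fake_model()

try:
    lm_key * lm_coef  # str * float
except TypeError as exc:
    error_message = str(exc)

# Unpacking .values() yields the losses themselves.
lm_loss, mc_loss, *_ = fake_model().values()
loss = lm_loss * lm_coef + mc_loss * mc_coef
```

If that assumption holds, passing return_dict=False to the model call or calling .to_tuple() on the output should be equivalent alternatives, since both give back a plain tuple.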