fastNLP.core.utils._CheckError

Open heng3366 opened this issue 3 years ago • 3 comments

I'm reproducing FLAT with fastNLP 0.5.0, Python 3.8, torch 1.7 on Ubuntu, and I get the following error:

Epoch 1/100: 1%|▌ | 955/95600 [01:01<1:24:04, 18.76it/s, loss:56.88514]
/home/ai998/.conda/envs/nlp/lib/python3.7/site-packages/torch/optim/lr_scheduler.py:156: UserWarning: The epoch parameter in scheduler.step() was not necessary and is being deprecated where possible. Please use scheduler.step() to step the scheduler. During the deprecation, if epoch is different from None, the closed form is used instead of the new chainable form, where available. Please open an issue if you are unable to replicate your use case: https://github.com/pytorch/pytorch/issues/new/choose.
  warnings.warn(EPOCH_DEPRECATION_WARNING, UserWarning)
Traceback (most recent call last):
  File "flat_main.py", line 801, in <module>
    trainer.train()
  File "/home/ai998/.conda/envs/nlp/lib/python3.7/site-packages/fastNLP/core/trainer.py", line 613, in train
    self.callback_manager.on_exception(e)
  File "/home/ai998/.conda/envs/nlp/lib/python3.7/site-packages/fastNLP/core/callback.py", line 309, in wrapper
    returns.append(getattr(callback, func.__name__)(*arg))
  File "/home/ai998/.conda/envs/nlp/lib/python3.7/site-packages/fastNLP/core/callback.py", line 505, in on_exception
    raise exception # 抛出陌生Error
  File "/home/ai998/.conda/envs/nlp/lib/python3.7/site-packages/fastNLP/core/trainer.py", line 609, in train
    self._train()
  File "/home/ai998/.conda/envs/nlp/lib/python3.7/site-packages/fastNLP/core/trainer.py", line 668, in _train
    loss = self._compute_loss(prediction, batch_y).mean()
  File "/home/ai998/.conda/envs/nlp/lib/python3.7/site-packages/fastNLP/core/trainer.py", line 776, in _compute_loss
    return self.losser(predict, truth)
  File "/home/ai998/.conda/envs/nlp/lib/python3.7/site-packages/fastNLP/core/losses.py", line 339, in __call__
    loss = self.get_loss(**pred_dict)
  File "/home/ai998/.conda/envs/nlp/lib/python3.7/site-packages/fastNLP/core/losses.py", line 334, in get_loss
    raise _CheckError(check_res=check_res, func_signature=_get_func_signature(self.get_loss))
fastNLP.core.utils._CheckError: Problems occurred when calling LossInForward.get_loss(self, **kwargs)
    missing param: ['loss(assign to loss in LossInForward']

I can't work out why the loss went missing. How can I fix this?

heng3366 · Apr 19 '22 07:04

Judging from the error, the dict returned by the model does not contain a loss entry.

yhcc · Apr 19 '22 08:04
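
One way to confirm this diagnosis is to inspect what the model's forward returns before handing it to the trainer. The following is only a minimal sketch: `missing_loss_keys` is a hypothetical helper, not part of fastNLP, and the two dicts are stand-in outputs rather than real model results.

```python
def missing_loss_keys(output, required=("loss",)):
    """Return the required keys that are absent from a forward() output dict."""
    if not isinstance(output, dict):
        raise TypeError(
            f"forward() should return a dict, got {type(output).__name__}"
        )
    return [key for key in required if key not in output]


# Stand-in outputs for illustration only (assumed shapes, not taken from FLAT).
train_output = {"loss": 0.73, "pred": [1, 0, 2]}
eval_output = {"pred": [1, 0, 2]}

print(missing_loss_keys(train_output))  # []
print(missing_loss_keys(eval_output))   # ['loss']
```

If a check like this reports `loss` as missing for a training batch, the problem lies in the model's forward rather than in the trainer.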

I hit this error too. Adding self.model.train() before the start of each epoch made it run.

JY-Ren · Jun 29 '22 07:06

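In that case, my guess is that the code's forward uses the self.training attribute to decide whether it is currently doing inference, so it takes a different path depending on whether self.training is True or False. Manually calling self.model.train() would then have set self.training back to True.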

yhcc · Jun 30 '22 03:06
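
The behaviour guessed at above can be reproduced with a small PyTorch sketch. `ToyTagger` is a hypothetical model written for illustration, not the FLAT code. It assumes a forward that returns a loss only when `self.training` is True, and that the model was left in eval mode (for example after a validation pass) when training resumed.

```python
import torch
from torch import nn


class ToyTagger(nn.Module):
    """Hypothetical model whose forward() branches on self.training."""

    def __init__(self, in_dim=4, num_labels=3):
        super().__init__()
        self.linear = nn.Linear(in_dim, num_labels)

    def forward(self, x, target=None):
        logits = self.linear(x)
        if self.training:
            # Training branch: the loss is computed and returned.
            return {"loss": nn.functional.cross_entropy(logits, target)}
        # Inference branch: predictions only, so there is no 'loss' key.
        return {"pred": logits.argmax(dim=-1)}


model = ToyTagger()
x = torch.randn(2, 4)
target = torch.tensor([0, 2])

model.eval()   # assumed state: the model was left in eval mode
eval_keys = sorted(model(x, target))

model.train()  # sets self.training = True on the model and its submodules
train_keys = sorted(model(x, target))

print(eval_keys, train_keys)  # ['pred'] ['loss']
```

Under these assumptions, a loss that reads the `loss` key from the forward output would find nothing while the model is in eval mode, which would be consistent with the workaround of calling `self.model.train()` at the start of each epoch.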