PatchCore_anomaly_detection

loss=nan

machine52vision opened this issue 3 years ago · 7 comments

Hello, how can I solve loss=nan?

machine52vision avatar Aug 27 '21 02:08 machine52vision

Hi @machine52vision, have you solved this problem?

XiaoPengZong avatar Sep 07 '21 01:09 XiaoPengZong

Hm. Correct me if I am wrong, but the net is not trained at all (it is just inference on a pretrained wide_resnet50 to get embedding vectors), so no gradients have to be computed. In that case, it doesn't matter if the loss is NaN.
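For illustration, a minimal sketch (not the repository's exact code; the hook setup, layer choice, and names are just for demonstration, and the torchvision weights-loading API may differ by version) of what "inference only" means here: the pretrained wide_resnet50 is frozen and only queried for intermediate feature maps, so no backward pass ever runs.

```python
# Illustrative sketch: PatchCore-style feature extraction only runs a frozen,
# pretrained backbone forward; nothing is optimized, so there is no
# meaningful loss to report.
import torch
from torchvision.models import wide_resnet50_2

backbone = wide_resnet50_2(pretrained=True).eval()
for p in backbone.parameters():
    p.requires_grad = False          # backbone stays frozen

features = {}
def save(name):                      # hook to grab mid-level feature maps
    def _hook(_module, _inputs, output):
        features[name] = output
    return _hook

backbone.layer2.register_forward_hook(save("layer2"))
backbone.layer3.register_forward_hook(save("layer3"))

with torch.no_grad():                # pure inference, no gradients / backward
    _ = backbone(torch.randn(1, 3, 224, 224))

print(features["layer2"].shape, features["layer3"].shape)
```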

SDJustus avatar Sep 07 '21 06:09 SDJustus

thanks a lot!

machine52vision avatar Sep 07 '21 06:09 machine52vision

Hi @SDJustus, I want to train on my own dataset with this code, not just run inference. So I think it does matter if the loss is NaN.

XiaoPengZong avatar Sep 07 '21 06:09 XiaoPengZong

OK, so if you look at this code from train.py:

```python
for param in self.model.parameters():
    param.requires_grad = False
```

you can see that it is intentional not to update the model parameters during training. As you can read in the paper, only the embeddings of a pretrained network are used for the further computations on a new dataset (such as minimax facility location for coreset subsampling and kNN at test time). So again, no network weight updates are done during training, and a NaN loss is totally fine here.
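To make those two steps concrete, here is a rough, illustrative sketch of greedy (minimax facility location) coreset subsampling and kNN scoring; the function names, tensor shapes, and coreset size are assumptions for demonstration, not the repository's API.

```python
# Illustrative sketch of coreset subsampling + kNN scoring, assuming the
# training patch embeddings are already stacked into an (N, D) tensor.
import torch

def greedy_coreset(embeddings: torch.Tensor, n_select: int) -> torch.Tensor:
    """Greedy minimax facility location: repeatedly pick the point that is
    farthest from the points selected so far."""
    selected = [0]                                     # seed with an arbitrary point
    min_dist = torch.cdist(embeddings, embeddings[0:1]).squeeze(1)
    for _ in range(n_select - 1):
        idx = int(torch.argmax(min_dist))              # farthest remaining point
        selected.append(idx)
        d = torch.cdist(embeddings, embeddings[idx:idx + 1]).squeeze(1)
        min_dist = torch.minimum(min_dist, d)          # distance to nearest selected
    return embeddings[selected]

def knn_anomaly_score(test_patches: torch.Tensor, memory_bank: torch.Tensor) -> torch.Tensor:
    """Anomaly score per test patch = distance to its nearest memory-bank patch."""
    dists = torch.cdist(test_patches, memory_bank)     # (M, K) pairwise distances
    return dists.min(dim=1).values

embeddings = torch.randn(1000, 1536)                   # fake training patch embeddings
memory_bank = greedy_coreset(embeddings, n_select=100)
scores = knn_anomaly_score(torch.randn(50, 1536), memory_bank)
print(scores.shape)                                    # torch.Size([50])
```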

SDJustus avatar Sep 07 '21 07:09 SDJustus

OK, thanks, got it.

XiaoPengZong avatar Sep 07 '21 07:09 XiaoPengZong

Digging into the PyTorch Lightning code (pytorch_lightning\core\lightning.py): when the progress-bar info for each batch is prepared, get_progress_bar_dict assigns the loss value with this logic:

```python
if running_train_loss is not None:
    avg_training_loss = running_train_loss.cpu().item()
elif self.automatic_optimization:
    avg_training_loss = float('NaN')
```

Check the definition of automatic_optimization:

```python
def automatic_optimization(self) -> bool:
    """
    If False you are responsible for calling .backward, .step, zero_grad.
    """
    return self._automatic_optimization
```

Since there is no backward pass during training, automatic_optimization can be set to False to avoid assigning NaN to the loss. I've modified configure_optimizers in train.py as below, and loss=NaN is not printed anymore:

```python
def configure_optimizers(self):
    self.automatic_optimization = False
    return None
```
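As a follow-up, here is a minimal illustrative sketch (the class and attribute names are made up, not the repo's actual train.py) showing the same idea with the flag set in __init__, which is where the PyTorch Lightning docs recommend enabling manual optimization; with it enabled, the progress bar no longer falls back to a NaN loss.

```python
# Minimal sketch, assuming a batch of (image, label) pairs and a frozen
# backbone: with manual optimization enabled, Lightning does not inject a
# NaN loss, and a training_step that only collects embeddings is fine.
import torch
import pytorch_lightning as pl

class FeatureCollector(pl.LightningModule):
    def __init__(self, backbone):
        super().__init__()
        self.automatic_optimization = False   # same effect as the fix above
        self.backbone = backbone.eval()
        self.embeddings = []

    def training_step(self, batch, batch_idx):
        x, _ = batch
        with torch.no_grad():                 # inference only, no backward pass
            self.embeddings.append(self.backbone(x))
        # returning nothing is allowed with manual optimization

    def configure_optimizers(self):
        return None                           # nothing to optimize
```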

zhangjunli177 avatar Apr 20 '22 00:04 zhangjunli177