fsdl-text-recognizer-2021-labs

Lab 3 - base.py Accuracy.update() error

Open just-eoghan opened this issue 3 years ago • 6 comments

System specs: XPS 13, Intel(R) Core(TM) i7-8565U CPU @ 1.80GHz 1.99 GHz, NVIDIA GeForce RTX 2080 Super

Problem

Running the following: python training/run_experiment.py --max_epochs=10 --gpus=1 --num_workers=4 --data_class=EMNISTLines --min_overlap=0 --max_overlap=0 --model_class=LineCNNSimple --window_width=28 --window_stride=28

Results in the following error: ValueError: Probabilities in `preds` must sum up to 1 accross the `C` dimension.

Solution

I managed to track down the error to the update function within the Accuracy class in base.py.

The offending line is: preds = torch.nn.functional.softmax(preds, dim=-1)

Here the dim=-1 parameter is what causes the ValueError. Setting it to dim=1 solves the issue and allows training to take place.

I don't fully understand why this is the case or why this error appeared in the first place. Any guidance would be appreciated!

just-eoghan, Apr 28 '21 11:04

Stumbled across this myself, just created a PR to fix it.

The reason for the problem is that the new models in Lab 3, such as LineCNNSimple or LineCNN, produce predictions with a 3rd dimension, because the output is now a sequence of letters. In Labs 1 and 2 we were predicting single letters only.

The Accuracy fix/hack uses dim=-1, which works as long as there are only 2 dimensions (batch, class). From Lab 3 onward, however, it applies the softmax over the wrong dimension: the predictions have shape [128, 83, 32] (bs, num_classes, len_seq), so dim=-1 normalizes over the sequence positions instead of the classes. Setting the softmax to use dim=1 instead of dim=-1 makes it use the correct dimension of classes to "softmax over".
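To make the dimension argument concrete, here is a minimal sketch that does not use the lab code itself. It builds random logits with the shape quoted above (the tensor and variable names are assumptions for illustration only) and checks which softmax call yields probabilities that sum to 1 over the class dimension:

```python
import torch

# Hypothetical logits with the shape quoted above: (batch_size, num_classes, seq_len).
torch.manual_seed(0)
logits = torch.randn(128, 83, 32)

# dim=-1 normalizes over the last axis, which here is the sequence position.
wrong = torch.nn.functional.softmax(logits, dim=-1)
# dim=1 normalizes over the class axis, as per-class probabilities require.
right = torch.nn.functional.softmax(logits, dim=1)

# Sum over the class axis; valid class probabilities give 1 at every position.
wrong_class_sums = wrong.sum(dim=1)  # shape (128, 32)
right_class_sums = right.sum(dim=1)  # shape (128, 32)

ones = torch.ones(128, 32)
print(torch.allclose(right_class_sums, ones, atol=1e-5))  # class sums are 1
print(torch.allclose(wrong_class_sums, ones, atol=1e-5))  # class sums are not 1

# With 2-D input (batch_size, num_classes), dim=-1 and dim=1 name the same axis,
# so both calls agree. This is consistent with the hack working in Labs 1 and 2.
logits_2d = torch.randn(128, 83)
same = torch.allclose(
    torch.nn.functional.softmax(logits_2d, dim=-1),
    torch.nn.functional.softmax(logits_2d, dim=1),
)
print(same)
```

With dim=-1 on the 3-D tensor, each row sums to 1 across the 32 sequence positions, so the sums across the 83 classes are generally not 1. That is the condition the ValueError complains about.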

mprostock, Apr 28 '21 15:04

Thanks for that, Marc, makes sense. I'll +1 your PR!

just-eoghan, Apr 28 '21 15:04

Closing this now as there is a PR by @mprostock in progress.

just-eoghan, Apr 28 '21 15:04

I understand your approach, but it is common practice to leave tickets/issues open until they are resolved (or replied to by a maintainer). My PR might not get accepted, and the maintainers might choose to fix it in a different way. Until then it would be good to keep the issue open, so that other people can easily verify that they are not alone with their problem and that this issue exists, until it is actually fixed in the code. So, could you reopen this issue?

mprostock, Apr 29 '21 09:04

Point taken, reopened.

just-eoghan, Apr 29 '21 10:04

Thanks. Having the same problem.

Caizifen, Feb 26 '22 07:02