NeMo
[NLP] [question/bug?] `train_loss` falling, whereas `val_loss`, `punct_f1`, `capit_f1` all increasing
@PeganovAnton Training on multiple GPUs, I'm noticing that `train_loss` is decreasing and the `f1` scores are increasing, but so is `val_loss`. Is `val_loss` the right metric to monitor? Would the mean `f1` score be better suited?
What does `logits_ndim` in https://github.com/NVIDIA/NeMo/blob/fea3775c00adfacfe0a414dea15544abc96db8dc/nemo/collections/nlp/models/token_classification/punctuation_capitalization_model.py#L126 stand for? Is this initialisation still correct if one changes the number of classes, i.e. includes support for additional punctuation marks? What if the numbers of classes for punctuation and capitalisation differ?
Environment details
pytorch:22.06-py3
+ nemo:1.10.0
+ PR #4553 patch
Hi @itzsimpl! Sorry for the late response.
- I also remember occasions when validation loss increased while validation F1 increased. I suggest using F1 as the monitored metric.
- `logits_ndim` is the number of dimensions of the logits tensor, not the number of classes. You may increase the number of punctuation marks without changing this parameter.
- You may add any number of new punctuation characters if they follow the preceding word and are then followed by a space, e.g. semicolon, colon, exclamation mark. However, I doubt that a hyphen surrounded by spaces or an opening parenthesis will work correctly during inference (the `add_punctuation_capitalization()` method).
- You may add more capitalization labels, though you will need to modify `add_punctuation_capitalization()`.
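To illustrate how validation loss and F1 can rise together, here is a minimal sketch with made-up probabilities for a binary "comma / no comma" label. The numbers and the helper names `mean_cross_entropy` and `f1_score` are assumptions for illustration, not NeMo code. Cross-entropy depends on how confident each prediction is, while F1 only looks at which class wins, so a single very confident mistake can raise the loss even as more tokens are classified correctly.

```python
import math

def mean_cross_entropy(labels, prob_positive):
    """Average negative log-likelihood of the true label (binary case)."""
    total = 0.0
    for y, p in zip(labels, prob_positive):
        p_true = p if y == 1 else 1.0 - p
        total += -math.log(p_true)
    return total / len(labels)

def f1_score(labels, prob_positive, threshold=0.5):
    """F1 of the positive class after thresholding the probabilities."""
    preds = [1 if p > threshold else 0 for p in prob_positive]
    tp = sum(1 for y, q in zip(labels, preds) if y == 1 and q == 1)
    fp = sum(1 for y, q in zip(labels, preds) if y == 0 and q == 1)
    fn = sum(1 for y, q in zip(labels, preds) if y == 1 and q == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Hypothetical validation tokens: 1 = comma follows the word, 0 = no comma.
labels = [1, 1, 1, 0, 0, 0]

# Early epoch: unsure everywhere, only one comma found.
early_probs = [0.6, 0.4, 0.4, 0.4, 0.4, 0.4]
# Later epoch: two commas found, but one comma is missed with high confidence.
late_probs = [0.99, 0.99, 0.001, 0.01, 0.01, 0.01]

early_loss = mean_cross_entropy(labels, early_probs)
late_loss = mean_cross_entropy(labels, late_probs)
early_f1 = f1_score(labels, early_probs)
late_f1 = f1_score(labels, late_probs)

print(f"early: loss={early_loss:.3f} f1={early_f1:.2f}")
print(f"late:  loss={late_loss:.3f} f1={late_f1:.2f}")
```

In this toy case the later epoch has both the higher loss and the higher F1, which is the pattern described in the issue. If F1 is the quantity you care about at inference time, monitoring it directly avoids selecting checkpoints on the basis of confidence alone.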
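The constraint on new punctuation marks is easier to see with a sketch of per-word label application. The function `apply_labels` below is hypothetical and is not the actual `add_punctuation_capitalization()` implementation; it assumes one punctuation label and one capitalization label per word, with `"O"` meaning "no change" and `"U"` meaning "capitalize the first letter".

```python
def apply_labels(words, punct_labels, capit_labels):
    """Hypothetical post-processing step, not NeMo's implementation."""
    restored = []
    for word, punct, capit in zip(words, punct_labels, capit_labels):
        if capit == "U":
            # Capitalize only the first character of the word.
            word = word[:1].upper() + word[1:]
        if punct != "O":
            # The mark is glued to the end of the word it follows.
            word += punct
        restored.append(word)
    # Words (with any attached mark) are always separated by one space.
    return " ".join(restored)

words = "hello how are you".split()
restored = apply_labels(words, [",", "O", "O", "?"], ["U", "O", "O", "O"])
print(restored)
```

Under this scheme a mark can only appear directly after a word and before a space. A hyphen that needs a space on both sides, or an opening parenthesis that must precede a word, has no place to go without changing the joining logic, and a new capitalization label needs its own branch next to the `"U"` case.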
This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.
This issue was closed because it has been inactive for 7 days since being marked as stale.