firefox-translations-training
[tracking ERROR] Publication failed: Invalid config section: while scanning a simple key
From https://firefox-ci-tc.services.mozilla.com/tasks/CcBRvVfDT229U-iLBThC5w/runs/2:
[task 2024-05-02T00:40:40.350Z] [tracking ERROR] Publication failed: Invalid config section: while scanning a simple key
[task 2024-05-02T00:40:40.350Z] in "<unicode string>", line 493, column 1:
[task 2024-05-02T00:40:40.350Z] Loaded model has been created wi ...
[task 2024-05-02T00:40:40.350Z] ^
[task 2024-05-02T00:40:40.350Z] could not find expected ':'
[task 2024-05-02T00:40:40.350Z] in "<unicode string>", line 495, column 1:
[task 2024-05-02T00:40:40.350Z]
[task 2024-05-02T00:40:40.350Z] ^
Possibly of note: this is a run where I am testing autocontinuation after a spot termination. Run 1 published fine; run 2 downloaded the artifacts from run 1 and then started to train. I wonder if this is a general issue with publication when continuing an existing training run?
@La0 @vrigal OK, we see this consistently. It seems the parser just can't handle this line:
[task 2024-05-09T20:18:17.301Z] [2024-05-09 20:18:17] [config] Loaded model has been created with Marian v1.12.14 2d067af 2024-02-16 11:44:13 -0500
https://firefox-ci-tc.services.mozilla.com/tasks/MuXYgqjUQ4G-HVTnSyzvgw/runs/1/logs/live/public/logs/live.log
We should make the parser more robust so that it tolerates new Marian log lines it doesn't understand. This particular line does not look like part of the config, but the parser still shouldn't fail on it.
Also, this blocks landing https://github.com/mozilla/firefox-translations-training/pull/580/files as we want to make sure W&B continues tracking correctly after the training is restarted.
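One possible way to get that tolerance is to filter the `[config]` lines before they reach the YAML parser, keeping only those that look like YAML and setting the rest aside. The sketch below is only an illustration, not the tracking parser's actual code: `extract_config_text`, both regexes, and the assumed `[timestamp] [config] key: value` layout are hypothetical, inferred from the log line quoted above.

```python
import re

# Assumed Marian layout: "[YYYY-MM-DD HH:MM:SS] [config] <body>".
# search() is used so an extra Taskcluster "[task ...]" prefix is tolerated.
CONFIG_LINE = re.compile(
    r"\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\] \[config\] (?P<body>.*)$"
)

# Lines YAML can accept inside a mapping: "key: value", "key:", or "- item".
YAML_LIKE = re.compile(r"^\s*(-\s+\S.*|[\w.\-]+:(\s.*)?)$")


def extract_config_text(log_lines):
    """Return (yaml_text, skipped) for the [config] lines of a Marian log.

    Hypothetical helper: yaml_text holds only lines that look like YAML,
    skipped holds [config] lines that do not (free-form messages).
    """
    kept, skipped = [], []
    for line in log_lines:
        match = CONFIG_LINE.search(line.rstrip())
        if match is None:
            continue  # not a [config] line at all
        body = match.group("body")
        if YAML_LIKE.match(body):
            kept.append(body)
        else:
            skipped.append(body)  # e.g. "Loaded model has been created with ..."
    return "\n".join(kept), skipped
```

The returned text can then be handed to the YAML loader, and the skipped lines logged as warnings instead of failing publication. Wrapping the YAML load itself in a try/except would be a further safeguard against lines this heuristic does not catch.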