
[tracking ERROR] Publication failed: Invalid config section: while scanning a simple key

Open bhearsum opened this issue 9 months ago • 2 comments

From https://firefox-ci-tc.services.mozilla.com/tasks/CcBRvVfDT229U-iLBThC5w/runs/2:

[task 2024-05-02T00:40:40.350Z] [tracking ERROR] Publication failed: Invalid config section: while scanning a simple key
[task 2024-05-02T00:40:40.350Z]   in "<unicode string>", line 493, column 1:
[task 2024-05-02T00:40:40.350Z]     Loaded model has been created wi ... 
[task 2024-05-02T00:40:40.350Z]     ^
[task 2024-05-02T00:40:40.350Z] could not find expected ':'
[task 2024-05-02T00:40:40.350Z]   in "<unicode string>", line 495, column 1:
[task 2024-05-02T00:40:40.350Z]     
[task 2024-05-02T00:40:40.350Z]     ^

Possibly of note: this is a run where I am testing autocontinuation after a spot termination. Run 1 published fine; run 2 downloaded the artifacts from run 1 and then started to train. I wonder if this is a general issue with publication when continuing an existing training run?

bhearsum avatar May 02 '24 00:05 bhearsum

@La0 @vrigal OK, we see this consistently. It seems the parser just can't handle this line:

[task 2024-05-09T20:18:17.301Z] [2024-05-09 20:18:17] [config] Loaded model has been created with Marian v1.12.14 2d067af 2024-02-16 11:44:13 -0500

https://firefox-ci-tc.services.mozilla.com/tasks/MuXYgqjUQ4G-HVTnSyzvgw/runs/1/logs/live/public/logs/live.log

We should make the parser more robust so that it tolerates new Marian log lines it doesn't understand. This particular line does not look like part of the config, but the parser still shouldn't fail on it.
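
To illustrate the kind of tolerance meant here, below is a minimal sketch, not the project's actual parser. It assumes the task prefix has already been stripped, that config lines have the shape `[timestamp] [config] payload`, and that only payloads resembling a YAML mapping entry or list item should be handed to the YAML loader. The names `split_config_lines`, `CONFIG_LINE` and `YAML_LIKE` are hypothetical, and all sample lines except the "Loaded model" one are made up for the example.

```python
import re

# Assumed shape of a Marian log line: "[date time] [config] payload".
CONFIG_LINE = re.compile(r"^\[[^\]]+\] \[config\] (?P<payload>.*)$")
# Payloads that can belong to a YAML mapping: "key: value", "key:" or "- item".
YAML_LIKE = re.compile(r"^\s*(?:[\w.-]+:(?:\s|$)|-\s)")


def split_config_lines(log_lines):
    """Return (kept, skipped) payloads from the [config] lines of a log.

    kept:    payloads that look like YAML and are safe to join and parse
    skipped: [config] payloads that do not look like YAML; they are set
             aside (for example to log a warning) instead of failing
    """
    kept, skipped = [], []
    for line in log_lines:
        match = CONFIG_LINE.match(line.strip())
        if match is None:
            continue  # not a [config] line at all
        payload = match.group("payload")
        if YAML_LIKE.match(payload):
            kept.append(payload)
        else:
            skipped.append(payload)
    return kept, skipped


# Illustrative input: only the "Loaded model" line comes from the real log.
sample = [
    "[2024-05-09 20:18:17] [config] beam-size: 12",
    "[2024-05-09 20:18:17] [config] devices:",
    "[2024-05-09 20:18:17] [config]   - 0",
    "[2024-05-09 20:18:17] [config] Loaded model has been created with "
    "Marian v1.12.14 2d067af 2024-02-16 11:44:13 -0500",
    "[2024-05-09 20:18:18] Training started",
]
kept, skipped = split_config_lines(sample)
```

With this filter, the "Loaded model has been created with ..." line ends up in `skipped` rather than in the text passed to the YAML loader, so publication would not abort on it. A stricter alternative would be to wrap the YAML load in a try/except and retry without the offending line.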

eu9ene avatar May 09 '24 20:05 eu9ene

Also, this blocks landing https://github.com/mozilla/firefox-translations-training/pull/580/files as we want to make sure W&B continues tracking correctly after the training is restarted.

eu9ene avatar May 09 '24 20:05 eu9ene