data_store: fix traceback

Open oliver-sanders opened this issue 10 months ago • 6 comments

Traceback spotted in the wild.

Found this traceback in a scheduler log. It happened immediately after workflow restart.

Check List

  • [x] I have read CONTRIBUTING.md and added my name as a Code Contributor.
  • [x] Contains logically grouped changes (else tidy your branch by rebase).
  • [x] Does not contain off-topic changes (use other PRs for other changes).
  • [x] Applied any dependency changes to both setup.cfg (and conda-environment.yml if present).
  • [ ] Tests are included - no, unreproducible
  • [ ] Changelog entry included - no, unreproducible and unreported
  • [x] Cylc-Doc pull request opened if required at cylc/cylc-doc/pull/XXXX.
  • [x] If this is a bug fix, PR should be raised against the relevant ?.?.x branch.

oliver-sanders avatar Feb 13 '25 14:02 oliver-sanders

Ok, we don't know why this is happening, but it's easy enough to defend against.
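As a rough illustration of what "defending against it" could look like, here is a minimal sketch using plain dictionaries rather than the real data store objects. The `first_parent_id` helper and the node layout are assumptions made for illustration; they are not taken from the cylc-flow code.

```python
from typing import Optional


def first_parent_id(node: dict) -> Optional[str]:
    """Return the ID of a node's first parent, or None if it is missing.

    Hypothetical helper: `node` is a plain dict standing in for a task or
    family element, not the real data store type.
    """
    parent = node.get("first_parent")
    if not parent:
        # Defend against nodes that arrive without a first_parent
        # instead of raising when the field is accessed.
        return None
    return parent.get("id")
```

A caller would then skip or log the node when `None` comes back, rather than letting the lookup raise.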

oliver-sanders avatar Feb 19 '25 11:02 oliver-sanders

Hmmm, I've just been shown a cylc tui traceback where first_parent was None.

I wonder if it's the same root cause as this issue.

@dwsutherland, any ideas about how a task or family could end up with no first_parent?

oliver-sanders avatar Mar 21 '25 11:03 oliver-sanders

any ideas about how a task or family could end up with no first_parent?

Perhaps with orphans?

The only other way I can think of is if the node is pruned but then somehow reintroduced, bypassing the normal creation (setting of static fields), e.g. an update is applied post prune. However, updates should be ignored if there's no node added or existing, and any added/existing node should already have the field populated.

We really need to be able to reproduce..
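To make the hypothesis above concrete, here is a toy sketch of the two behaviours. `ToyStore` and its methods are invented for illustration and do not mirror the actual data store API; the point is only the difference between ignoring an update for an unknown node and letting that update recreate it without its static fields.

```python
class ToyStore:
    """Hypothetical stand-in for a node store (not the cylc-flow data store)."""

    def __init__(self):
        self.nodes = {}

    def add(self, node_id, first_parent):
        # Normal creation: static fields such as first_parent are set here.
        self.nodes[node_id] = {"id": node_id, "first_parent": first_parent}

    def prune(self, node_id):
        self.nodes.pop(node_id, None)

    def apply_update(self, node_id, **fields):
        # Expected behaviour: an update for an unknown node is ignored.
        node = self.nodes.get(node_id)
        if node is None:
            return False
        node.update(fields)
        return True

    def apply_update_unsafe(self, node_id, **fields):
        # Hypothesised failure: the update recreates a pruned node,
        # bypassing add(), so first_parent is never set.
        node = self.nodes.setdefault(node_id, {"id": node_id})
        node.update(fields)
        return True
```

With the unsafe variant, a prune followed by a late update leaves a node in the store that has no `first_parent` key at all.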

dwsutherland avatar Mar 25 '25 04:03 dwsutherland

We really need to be able to reproduce..

I can reliably reproduce missing first_parent elements in cylc tui; however, the example is very complicated.

All I can say so far:

  • Some families arriving with no first_parent elements.
  • It would appear that I (partially) defended against this in cylc tui.
  • This can be seen both in cylc tui (direct from scheduler) and cylc gui (via cylc-uiserver).
  • I haven't found any pattern to the affected families yet. The ones I've spotted so far all inherited (implicitly) from root.
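For anyone trying to spot the affected families, a small sketch along these lines may help. It assumes the families have already been fetched as a list of plain dicts with `name` and `firstParent` keys; that layout and the function name are assumptions, not a documented response format. `root` is skipped because it is expected to have no first parent.

```python
def families_missing_first_parent(families):
    """List the names of families (other than root) with no firstParent.

    Hypothetical helper operating on plain dicts.
    """
    return [
        fam.get("name")
        for fam in families
        if fam.get("name") != "root" and not fam.get("firstParent")
    ]
```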

oliver-sanders avatar Mar 25 '25 09:03 oliver-sanders

Some families arriving with no first_parent elements.

Well, root would always have no first_parent? I assume you mean others..

I haven't found any pattern to the affected families yet. The ones I've spotted so far all inherited (implicitly) from root.

I wonder if this is similar to the tasks without state problem: https://github.com/cylc/cylc-flow/issues/6567

dwsutherland avatar Apr 01 '25 03:04 dwsutherland

Well, root would always have no first_parent? I assume you mean others.

Yes.

I thought I had pinned this down to generate_ghost_family, as it is possible for a family to not be assigned a firstParent there. However, these families don't seem to match the ones coming through GraphQL with missing firstParent fields, so I think I've misunderstood.

oliver-sanders avatar Apr 01 '25 08:04 oliver-sanders

Don't like the fix, don't have time for this right now, can't reproduce the issue

oliver-sanders avatar Aug 04 '25 12:08 oliver-sanders