
User-friendlier checks in PTG

Open therault opened this issue 1 year ago • 4 comments

Description

We just spent a few hours debugging a PTG, only to find out that the count value passed by the user to the runtime was 0. The error is currently raised at the last possible moment, deep inside the communication engine, whereas it would have been pretty easy to raise it at the generated-code level, where the count is assigned.

Describe the solution you'd like

Add an assert that fires when a call to a user function sets a count to 0, so that in debug mode the user can easily catch the error where it actually happens.

therault avatar Nov 21 '24 19:11 therault

Why would that be an issue at the higher level? The application is allowed to send 0 bytes on a flow; it should then behave as a control dependency.

bosilca avatar Nov 21 '24 19:11 bosilca

What we want to check early is the case where the user sets [remote_layout=xyz, remote_count=0].

Properly marking a dependency as a CTL (potentially with a conditional guard) would achieve the same effect as converting count=0 to a control, so presumably that doesn't introduce a new feature.

Allowing count=0, however, would likely mask a bug in the JDF program, and tracing back to where the bug originated is not trivial.

abouteiller avatar Nov 22 '24 14:11 abouteiller
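
For illustration, the early check described above (a remote layout is set but the count is 0) could look roughly like the sketch below. Every name in it (`remote_desc_sketch_t`, `remote_desc_is_suspicious`, `REMOTE_DESC_DEBUG_CHECK`) is hypothetical and does not come from PaRSEC or from the code the JDF compiler generates; the struct is only a stand-in for whatever the generated code fills in after calling the user function. Since a 0 count can also be intentional, this is meant as a debug-only aid, not as a rule the runtime must enforce.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the remote description of one dependency.
 * This is NOT the PaRSEC data structure, only an illustration. */
typedef struct {
    const void *remote_layout;  /* non-NULL when the user asked for a datatype */
    int64_t     remote_count;   /* value returned by the user-provided function */
} remote_desc_sketch_t;

/* True for the combination discussed above: a layout was requested,
 * yet the count evaluates to 0. */
static inline bool remote_desc_is_suspicious(const remote_desc_sketch_t *d)
{
    return d->remote_layout != NULL && d->remote_count == 0;
}

/* Debug-only check: assert() compiles to nothing when NDEBUG is defined,
 * so release builds would keep accepting 0 counts. */
#define REMOTE_DESC_DEBUG_CHECK(d) assert(!remote_desc_is_suspicious(d))
```

Under these assumptions, the generated code would invoke `REMOTE_DESC_DEBUG_CHECK(&desc)` right after assigning the count, so a failure points at the task and dependency that produced the value instead of at the communication engine.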

I'm not sure I understand the comment about marking the dependency as CTL: a dependency cannot be a CTL, only a flow can, and that can only be done at compile time. Sending data with a 0 count is necessary to simultaneously track dependencies and inform successors that some predecessors will not contribute anything to whatever follows.

bosilca avatar Nov 25 '24 15:11 bosilca

The scenario raised by George is legit. Not sure we can distinguish between intentional uses of 0-counts and erroneous ones.

abouteiller avatar Mar 13 '25 15:03 abouteiller