In-order error reporting
Using the latest build (though this problem probably isn't new), I'm running into the following situation while trying to make cncjs-shopfloor-tablet report controller-issued errors correctly. Suppose we are using GRBL with the following GCode:
JUNKJUNK
G0 X0 Y0
%wait
(The %wait is auto-generated). The sender blasts the first two lines to GRBL quite quickly and enters sender hold state on the %wait. GRBL processes the JUNKJUNK line and sends an error message. That causes the workflow to pause and the workflow pause handler calls sender.hold, passing it the error message. But the sender is already in the hold state so it ignores the error message.
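To make the race concrete, here is a toy model of it. This is only a sketch: `ToySender` and the payload objects are hypothetical and are not the actual cncjs sender code. It only shows how a hold call that is ignored while already held drops the error that came with it.

```javascript
// Hypothetical model of the race described above, not the real cncjs Sender.
class ToySender {
  constructor() {
    this.held = false;
    this.holdReason = null;
  }

  // A hold request that arrives while the sender is already held is
  // ignored, so the reason passed with it is silently dropped.
  hold(reason) {
    if (this.held) {
      return false;
    }
    this.held = true;
    this.holdReason = reason;
    return true;
  }
}

const toySender = new ToySender();

// 1. The sender reaches the auto-generated %wait and holds itself.
toySender.hold({ data: '%wait' });

// 2. GRBL later rejects JUNKJUNK; the workflow pause handler calls hold again.
//    The payload shape here is a placeholder, not a real GRBL message.
const errorAccepted = toySender.hold({ err: 'controller error for JUNKJUNK' });

console.log(errorAccepted, toySender.holdReason);
```

In this model the second call is rejected and the stored reason is still the %wait, which matches the symptom: the error never reaches the workflow state.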
I can work around this problem by parsing serialport:read messages, but it would be better if errors could be reported at the workflow state transition level so they are synchronized better.
Thoughts/ideas?
Here is a short program that triggers several problems with GRBL state management (similar problems exist with all controllers).
N1 g0 x0 y0
N2 g1 x10 y10 f60
N3 junk
N4 g1 x10 y0 f40
N5 m0
N6 g1 x0 y0 f40
Here is what should happen ideally:
- During the execution of line N2, which takes about 15 seconds, the user should be able to manually pause via the pause control.
- After N2 completes, an error message about the bad command "N3 junk" should appear, with resume-or-stop controls enabled.
- After resuming, N4 should execute, and the user should again be able to manually pause during its execution.
- After N4 completes, the "M0 Program Pause" notification should appear, with resume-or-stop controls.
- N6 should run (with pause enabled) and, upon completion, the controls should return to start-or-close.
Instead, this happens:
- When the program starts, the "M0 Program Pause" notification appears instantly, while the N1 and N2 movement is still in progress. The error state from N3 is lost.
- The controls go instantly to resume-or-stop, so you cannot manually pause (although, confusingly, you can click resume even though the program is running, and then click pause).
- Since the N3 error is lost, motion continues from N2 to N4 without stopping for user intervention.
- When N4 completes, cncjs shows Hold and you can click resume to finish the program.
- When the program completes, instead of the controls returning to start-or-close, only pause is enabled. To get back to start-or-close, you must click pause then stop.
- During the N2 and N4 motion, with the M0 message displayed and the controls at resume-or-stop, if you click resume before N4 is complete, pause is enabled while motion continues. When N4 finishes, cncjs shows Hold and motion ceases at the M0 hold point, but only pause is enabled. (You can get to resume-or-stop by first clicking pause, even though there is no motion.)
I have a version of cncjs-shopfloor-tablet that seems to work correctly, but it is complicated, involving looking at serialport:read, workflow:state, sender:status, and controller:state in controller-dependent ways. Marlin is hampered by the lack of a reliable queue-jumping feedhold, but still works more or less.
I am working on a server mod that should make this much easier. The idea is to introduce a new workflow:held message that is sent when the controller hits the idle or hold state. workflow:held's payload reports the reason for that particular hold (error, M0, M6, user pause, etc.) in the correct order. With that mod, it should be possible to implement the workflow controls with simple controller-independent code. Hopefully I can avoid any changes to existing message sequences, so the UIs can migrate to the new scheme as time permits. At least that is what it says in the plan :-)
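As a rough illustration of the workflow:held idea, the sketch below keeps pending hold reasons sorted by program line and releases one each time the controller stops. Everything here except the event name workflow:held is an assumption: the `HoldQueue` class, its methods, and the `{ reason, line }` payload shape are hypothetical, and the actual mod may work quite differently.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper; class name, methods and payload shape are assumptions.
class HoldQueue extends EventEmitter {
  constructor() {
    super();
    this.pending = []; // hold causes, kept sorted by program line number
  }

  // Record a cause for a hold (error reply, M0, M6, user pause) together
  // with the program line it belongs to. Causes may be noted out of order,
  // e.g. the sender sees M0 on N5 before GRBL reports the error on N3.
  note(reason, line) {
    this.pending.push({ reason, line });
    this.pending.sort((a, b) => a.line - b.line);
  }

  // Call when the controller reports that it has reached idle or hold.
  // Emits the earliest outstanding cause, if there is one.
  controllerStopped() {
    const next = this.pending.shift();
    if (next) {
      this.emit('workflow:held', next);
    }
  }
}

const holdQueue = new HoldQueue();
const heldEvents = [];
holdQueue.on('workflow:held', (payload) => heldEvents.push(payload));

holdQueue.note('M0', 5);       // seen first, while streaming ahead
holdQueue.note('error', 3);    // reported later by the controller
holdQueue.controllerStopped(); // machine stops after N2
holdQueue.controllerStopped(); // machine stops after N4

console.log(heldEvents);
```

With the sample program above, the error on N3 would be reported at the first stop and the M0 on N5 at the second, regardless of the order in which the server learned about them.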
I'm thinking of a way for the UI to display error messages and M0/M1/M6 messages independently, using multiple notification popups. M0/M1/M6 messages would be shown only when the count of received acks equals the count of lines sent (i.e. received === sent). This can also be achieved by comparing the sent and received state from the UI.
Otherwise, it is also necessary to modify Workflow.js to support pause-on-error and general pause actions (M0/M1/M6, user pause) separately, so they don't block each other.
Above is just my rough idea. You can create a pull request or create a new branch if you already have a PoC. Hopefully we will find a better approach to report errors in expected order.
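The received === sent check from this comment could look roughly like the sketch below on the UI side. The `createPauseGate` helper and its `show` callback are hypothetical, and the `sent` and `received` fields are assumed to come from the sender status updates.

```javascript
// Hypothetical UI-side gate; names are illustrative only.
function createPauseGate(show) {
  let deferred = [];
  return {
    // Queue an M0/M1/M6 message instead of showing it immediately.
    defer(message) {
      deferred.push(message);
    },
    // Call on every sender status update. Deferred messages are shown
    // only once every sent line has been acknowledged.
    update({ sent, received }) {
      if (received === sent) {
        deferred.forEach(show);
        deferred = [];
      }
    },
  };
}

const shownMessages = [];
const pauseGate = createPauseGate((message) => shownMessages.push(message));

pauseGate.defer('M0 Program Pause');
pauseGate.update({ sent: 5, received: 3 }); // acks still outstanding
const shownWhileBehind = shownMessages.length;
pauseGate.update({ sent: 5, received: 5 }); // all sent lines acknowledged

console.log(shownWhileBehind, shownMessages);
```

Note that this gate only tracks acknowledgements, not physical motion, so it says nothing about lines that are acknowledged but still waiting in the controller.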
I am making progress, and it does involve a mod to Workflow.js. The problem with using received===sent is serial buffering and planner queueing inside the controller. The stuff you have sent can be pretty far ahead of the machine. For an example of why I am so fixated on synchronized operation, see this video showing what can happen when a machine goes wrong: https://www.facebook.com/watch/?v=2294066584184740