node-red-contrib-queue-gate icon indicating copy to clipboard operation
node-red-contrib-queue-gate copied to clipboard

Sammachin filter

Open sammachin opened this issue 3 years ago • 6 comments

I've added a new control command filter

This allows the user to remove specific messages from the queue, it takes a path to the message property in msg.filter.

The use case I have for this is that I want to manage a queue of phone calls in a phone system so the call comes in and I put it on hold then store the ID in a queue and trigger the queue when an agent is availble. I needed a method to remove a call by ID from the queue if the caller hangs up while waiting.

The only issue I have that I couldn't see how to fix is to filter where the property forms part of an array but this doesn't feel like a significant problem at the moment.

Have added details to the README and also an Example flow.

sammachin avatar Apr 26 '21 09:04 sammachin

This is an interesting suggestion that deserves some careful discussion. I expect to release a bug-fix update of q-gate in a day or two, and would like to use that as the starting point for any further work. If your fork of the node meets your needs or if it requires improvements, I would appreciate hearing about it.

The behavior you describe is called reneging (queueing-theory speak for "dropping out") and is usually triggered by the length of time in the queue, although other events can be used. Some queueing nodes (for example, simple-message-queue) already implement this using a time-to-live (ttl) message property. If I were to implement a reneging behavior, I would certainly want to include ttl as well any more general trigger. (I would also use the standard term "renege" rather than "filter.")

As I understand your proposal, a message would renege when the node receives a control message of the form msg = {topic:"control", payload:"filter", filter:"id"}, where id is an identifier for the message(s) to be dropped.

The first difficulty I see is that the identifier property, filter, is fixed, preventing the user from using that property name for any other purpose. The property to be used can easily be defined in the edit dialog as, for example, the Queue Selector property in the m-queue node.

A second question concerns the exact nature of the identifier. You seem to want it to be an object like {identifier:"7"}, so that a message in the queue would be marked for deletion if it has msg.filter.identifier = 7. I don't immediately see the need for another level of message properties. Is this intended to allow messages to be specified by different properties as well as values?

Your example flow is very likely to cause confusion, because it uses msg.topic as the filter property. Many users of q-gate want to handle mqtt messages, and require msg.topic to be available for routing. It is also a bit odd that the control message ends up with msg.filter.topic = bad, which is likely to further confuse things. I'm sure we come up with examples that use a topic property only to route traffic or identify control messages.

Finally, I would not suggest using _msgid to identify messages to be removed from the queue. Whenever a message is cloned, the clone has the same _msgid as the original message, so it is quite possible for the queue to contain multiple messages with the same _msgid but different payloads or other properties.

drmibell avatar Apr 27 '21 02:04 drmibell

Thanks for taking a look, thats interesting to know the proper name, I went with filter as the main javascript function thats I'm using is filter (https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/filter) I can see TTL being a useful option here too although from a UX perspective I would suggest that was implemented separatly of my changes, so TTL can be set on a queue object itself and all messages get timestamped when they arrive in the queue, then checked before they are sent (and if they've expired they're dropped and another message is sent from the queue) Trying to implement realtime refreshing of the queue for TTL could be quite an overhead.

I take your point about msg.filter being fixed, how about using the same value as the command so {topic:"control", payload:"filter", filter:"id"} or {topic:"control", payload:"renege", renege:"id"} just to avoid having another option in the UI. This pattern of configurable properties isn't something I've seen in other nodes, I can see the advantage but it does add some complexity for users.

The point of the identifier in the filter propery is that is the key in the messag that you want to filter out by, so when messages go into the queue there is no need to set any property on which it later filter them you simply specoify that in the filter property. In my example flow I'm sending messages with 2 topics good and bad and then later filtering out those with the msg.topic set to bad. Equally if a message had a complext payload for example:

{"payload": {
	"uid": 1,
	"fname": "Bob",
	"lname": "Smith"
}}

You could filter out by the uid param using "msg.filter" : {"payload" : {"uid" : 1}} Topic is just an example of how you could choose to filter your messages, I've tried to make everything as configurable as the rest of the node (with the ommision of the filter property as mentioned above that I'll fix)

Yes _msgid isn't something that you would nornmally use to select messages you would likely choose to filter based on a known parameter that you had set just as a userId or callId, it was more a test to see if I could I'll remove that from the README and just leave it for users to discover if they want to try and use that?

sammachin avatar Apr 27 '21 07:04 sammachin

You have clarified several things, but I am still uncertain about how best to implement and document this feature. I'll respond to your comments first and then start asking questions.

TTL definitely needs to be set as a property of the messages entering the queue, but this can be done without timestamps or much extra overhead. Messages with a TTL can start a timer with setTimeout() and be deleted from the queue when the timer expires.

I adopted the pattern of configurable payloads for the control commands and a user-defined Control Topic in order to give the user maximum flexibility in defining his control messages. For example, a command can be distributed to many q-gate nodes, with each one responding differently according to its configuration. Additionally, the user may have little control over the structure or content of the messages being queued, so this pattern helps to maintain a strict isolation between control messages and the queueing traffic. The user who does not need this flexibility faces no added complexity, because the node is pre-configured with a set of fairly obvious default values for these properties.

I think I understand what your code is trying to accomplish, but I need to see exactly what it does and why, so that I will feel comfortable reading and maintaining it in the future, say six months or a year from now. My first question (more will follow) is why you seem to use getPath() recursively to drill down to the first property value that is not an object. Why not have the control message simply specify the key to use?

drmibell avatar Apr 29 '21 17:04 drmibell

I've made a coupld of changes to the branch: I've changed the term to renege from filter as per your suggestion. I've also made the value used in the control message for what to select the messages on be the same as the command, so if your control command is renege you set the value of the message you want to remove in msg.renege - I need to update the readme to reflect this still.

Regarding the sytax of the "filter" and the getPath function I've been thinking about this and we could avoid having getpath by requiring the user to specify the path of the property in the message and the values separately, so if you wanted to renege a message that had the property {"msg" : {"payload" : {"customerID" : 123}}} instead of setting msg.renege to {"payload" :{"customerID" : 123} you would set it to {"path" : "payload.customerID", "value" : 123} this would allow us to avoid using getPath to determine the path in the msg object but I'm not sure its clearer for users? Thoughts?

I've also been looking at the TTL for messages in the queue, it turns out that the implementation is quite separte from the original feature I created but I can see how they are similar in use, there's a few design questions that have come up I thought I'd run by you before committing:

How should TTL be set? is it a single value on the queues config defined in the editor as a number of ms (This is what I have so far) Or should the TTL be set as a property on the incomming message eg msg.ttl : 5000 with ttl being a configurable string, or should we offer both, with a default value in the queue and then if the property is set on the message that overides the queue defatult? I'm using 0 in the queue to mean no TTL is set and messages don't expire.

I'm also having to set a new id against the messages when they are enqued if we want to later remove them on a setTimeout, I'm using a value of _qttlid and generating a unique id similar to the _msgid as you noted we can't rely on using the _msgid to uniquely identify a message in the queue because messages could be cloned earlier in the flow. I'm currently not stripping this ID value when messages are dequeued its an _ value and should be pretty safe to leave and not conflict with anything, if a user had a real issue with the value being in the message they could choose to remove it themselves with a change node after the queue. Stripping it in the node would add a lot of lines as there are several places a message can be sent from the queue.

The expire function itself is a little more complicated than I'd like but everything after the filter method is in order to update the status with the new length of the queue when a message has been expired, and because this is occring in a setTimeout we need to pass in various references to the node and the current state, otherwise you have some edge cases where you could try to expire a message having switched the state of the queue into open/closed and then wrongly update the status.

The final issue I'm thinking about is what to do with expired messages, (and for that matter ones removed in a renege) The options are: Do nothing, they just silently vanish Use a node.log to report the ID in the nodered log to aid in debugging. Add an option for a second output port on the node and send the expired or reneged messages out there? Again I'd welcome your thoughts here, to me this doesn't feel as important with the renege as that is invoked on the message by an external event so you could use that to notify whatever other parts are needed but the TTL is the node doing something automatically so if feels like you might want to know about expriring messages and possibly route them to some kind of dead letter queue?

sammachin avatar May 09 '21 08:05 sammachin

Sorry to have let this discussion go dormant for so long; personal issues keep pushing it to the back of the queue :). If you could continue to use your fork of the node a while longer, I would like to consider some major changes to enable several features in addition to reneging. These would give the node most of the capabilities found in advanced queueing systems.

The basic idea is that by selecting a checkbox in the edit dialog, the user could enable the following features and modify the associated parameters:

  1. Reneging, both on command and because time-to-live has expired.
  2. Balking, when a message refuses to enter the queue or cannot, either because the queue is full or because its length is greater than a value specified by a message property.
  3. Bumping. A message enters the queue and forces another message to be dropped. (This is the behavior of a full queue with the "keep newest" option selected.)

A balk (sometimes "baulk" in the UK) can be useful in systems that do load balancing or other sorts of routing. It might be worth thinking about having an option to bump the newest message instead of the oldest one.

When the advanced features are selected, the node will get a second output, and all messages that cannot enter or that renege, expire, or are dropped will be sent to that output with an additional property indicating the reason for that routing.

Second output with added property msg[node.action]:

  1. Message reneges on command (action = 'reneged')
  2. Message expires (action = 'expired')
  3. Message balks (action = 'balked')
  4. Queue is full, default (action = 'refused')
  5. Queue is full, keep newest (action = 'bumped')
  6. Oldest message dropped (action = 'dropped')

One reason for wanting to group these capabilities together and require the user to explicitly enable them is that some of them violate a design rule that I had enforced up to this point. That is, the behavior of the queue is controlled only by its state and the control messages it receives, so that the messages being queued do not affect the node's actions and are not modified in any way. This should continue to be the case when the advanced features are disabled.

Responding to some of the questions you raised in your latest comment, I obviously like the idea of a second output for expired messages, etc., but I think it should be visible only when needed. Also, I think TTL should be a property of each message, with an option to define a default if that property is undefined. (Allowing a message property to override a default set in the node configuration has been frowned upon, but I have ignored that advice before.) As for cleaning up any id properties added to a message, I think it is worth the effort, if only to ensure that the message leaves the queue exactly as it arrived. Other cleanup, such as clearing timeouts, may be needed, so it might be worth writing a function that can be called as needed to do the housekeeping.

All this will require some effort, but I would rather work toward a fairly complete set of capabilities rather than make piecemeal changes. I can already see some implementation issues related to the compatibility of different commands, which makes me even more inclined to take on the whole project at once.

drmibell avatar Jun 04 '21 22:06 drmibell

ok totally understand about time!

I think for now I'll create a new node thats a simplified version of the queue aimed at my use cases which is mainly around things like queuing customer calls & requests rather than IOT type of messages/events. Happy if you want to close this PR or keep some of the code in another branch for later?

sammachin avatar Jun 05 '21 09:06 sammachin