aedes-persistence-redis icon indicating copy to clipboard operation
aedes-persistence-redis copied to clipboard

what is outgoing:undefined in redis

Open b-sirius-a1v opened this issue 2 years ago • 15 comments

System Information

  • Aedes: 0.47.0
  • aedes-persistence-redis: 9.0.0
  • NodeJS: v16.14.2
  • OS: MacOS 11.2.2

I'm using a translator

Describe the bug

outgoing:undefined is generated when broker is restarted with client of qos=1 When a client goes offline and back online, the broker no longer provides a buffer for the client

To Reproduce Steps to reproduce the behavior:

  • Launch Broker
  • c1 subscribed qos=1
  • Turn the broker off and on
  • published by c2 qos=1
  • outgoing:undefined has been created in redis
  • disconnect c1
  • published by c2 qos=1
  • offlineCount created
  • connect c1
  • Buffer is not delivered for c1

b-sirius-a1v avatar Jun 09 '22 11:06 b-sirius-a1v

Does downgrading to 8.0.1 fixes the issue?

robertsLando avatar Jun 09 '22 12:06 robertsLando

@robertsLando

Thank you very much!!! Your help worked for me

I have some questions

  • Has this been completely resolved by a downgrade?
  • Can I continue to use this version in a production environment?
  • maxSessionDelivery appears to be limiting packets to be sent to clients in outgoing, is it correct?
  • If yes, but maxSessionDelivery was sending packets beyond my expectation

b-sirius-a1v avatar Jun 10 '22 08:06 b-sirius-a1v

@seriousme I think we may have a regression here, maybe introduced by the sanity check on keys?

robertsLando avatar Jun 10 '22 15:06 robertsLando

It will be some sort of regression if it works on 8.0.1, however encodeURIcomponent is present in NodeJS since version 0.10.

@b-sirius-a1v if you try the following:

node -e "console.log(encodeURIComponent('blah'))"

Where 'blah' is replaced by the real name of C1, do you then get undefined ?

Kind regards, Hans

seriousme avatar Jun 10 '22 17:06 seriousme

I just checked, I can reproduce the problem e.g.:

Running npm test while redis-cli monitor is active gives things like:

1654887375.958952 [0 172.17.0.1:40986] "get" "will:undefined:12345"
1654887375.961158 [0 172.17.0.1:40986] "del" "will:undefined:12345"

1654887376.057816 [0 172.17.0.1:40998] "lrem" "will" "0" "will:undefined:42"
1654887376.058510 [0 172.17.0.1:40998] "get" "will:undefined:42"
1654887376.059604 [0 172.17.0.1:40998] "del" "will:undefined:42"

1654887377.318774 [0 172.17.0.1:41046] "rpush" "outgoing:undefined" "packet:1:42"

However I also see many rpush records containing a correct client ID:

1654887374.027841 [0 172.17.0.1:40842] "rpush" "outgoing:abcde" "packet:broker-42:42"
1654887374.079341 [0 172.17.0.1:40854] "rpush" "outgoing:abcde" "packet:broker-42:42"
1654887374.080908 [0 172.17.0.1:40854] "rpush" "outgoing:fghih" "packet:broker-42:42"
1654887374.208707 [0 172.17.0.1:40866] "rpush" "outgoing:abcde" "packet:broker-42:10"
1654887374.229560 [0 172.17.0.1:40866] "rpush" "outgoing:abcde" "packet:broker-42:50"
1654887374.348288 [0 172.17.0.1:40878] "rpush" "outgoing:abcde" "packet:broker-42:42"
1654887374.400190 [0 172.17.0.1:40890] "rpush" "outgoing:abcde" "packet:broker-42:42"
1654887374.459942 [0 172.17.0.1:40902] "rpush" "outgoing:abcde" "packet:broker-42:42"
1654887374.515436 [0 172.17.0.1:40914] "rpush" "outgoing:abcde" "packet:broker-42:42"
1654887374.542930 [0 172.17.0.1:40914] "rpush" "outgoing:abcde" "packet:broker-42:43"
1654887374.613735 [0 172.17.0.1:40926] "rpush" "outgoing:abcde" "packet:broker-42:1"
1654887374.626464 [0 172.17.0.1:40926] "rpush" "outgoing:abcde" "packet:broker-42:2"
1654887374.631670 [0 172.17.0.1:40926] "rpush" "outgoing:abcde" "packet:broker-42:3"
1654887374.634334 [0 172.17.0.1:40926] "rpush" "outgoing:abcde" "packet:broker-42:4"
1654887374.636782 [0 172.17.0.1:40926] "rpush" "outgoing:abcde" "packet:broker-42:5"
1654887374.643630 [0 172.17.0.1:40926] "rpush" "outgoing:abcde" "packet:broker-42:6"
1654887374.648580 [0 172.17.0.1:40926] "rpush" "outgoing:abcde" "packet:broker-42:7"
1654887374.652527 [0 172.17.0.1:40926] "rpush" "outgoing:abcde" "packet:broker-42:8"
1654887374.662248 [0 172.17.0.1:40926] "rpush" "outgoing:abcde" "packet:broker-42:9"
1654887374.668297 [0 172.17.0.1:40926] "rpush" "outgoing:abcde" "packet:broker-42:10"
1654887374.673134 [0 172.17.0.1:40926] "rpush" "outgoing:abcde" "packet:broker-42:11"
1654887374.676328 [0 172.17.0.1:40926] "rpush" "outgoing:abcde" "packet:broker-42:12"
1654887374.688149 [0 172.17.0.1:40926] "rpush" "outgoing:abcde" "packet:broker-42:13"
1654887374.691820 [0 172.17.0.1:40926] "rpush" "outgoing:abcde" "packet:broker-42:14"
1654887374.694277 [0 172.17.0.1:40926] "rpush" "outgoing:abcde" "packet:broker-42:15"
1654887374.699788 [0 172.17.0.1:40926] "rpush" "outgoing:abcde" "packet:broker-42:16"
1654887374.703262 [0 172.17.0.1:40926] "rpush" "outgoing:abcde" "packet:broker-42:17"
1654887374.706152 [0 172.17.0.1:40926] "rpush" "outgoing:abcde" "packet:broker-42:18"
1654887374.709262 [0 172.17.0.1:40926] "rpush" "outgoing:abcde" "packet:broker-42:19"
1654887374.714574 [0 172.17.0.1:40926] "rpush" "outgoing:abcde" "packet:broker-42:20"
1654887374.717212 [0 172.17.0.1:40926] "rpush" "outgoing:abcde" "packet:broker-42:21"
1654887374.719691 [0 172.17.0.1:40926] "rpush" "outgoing:abcde" "packet:broker-42:22"
1654887374.725906 [0 172.17.0.1:40926] "rpush" "outgoing:abcde" "packet:broker-42:23"
1654887374.730525 [0 172.17.0.1:40926] "rpush" "outgoing:abcde" "packet:broker-42:24"
1654887374.739772 [0 172.17.0.1:40926] "rpush" "outgoing:abcde" "packet:broker-42:25"
1654887374.741978 [0 172.17.0.1:40926] "rpush" "outgoing:abcde" "packet:broker-42:26"
1654887374.743824 [0 172.17.0.1:40926] "rpush" "outgoing:abcde" "packet:broker-42:27"
1654887374.745395 [0 172.17.0.1:40926] "rpush" "outgoing:abcde" "packet:broker-42:28"
1654887374.747189 [0 172.17.0.1:40926] "rpush" "outgoing:abcde" "packet:broker-42:29"
1654887374.749199 [0 172.17.0.1:40926] "rpush" "outgoing:abcde" "packet:broker-42:30"
1654887374.751009 [0 172.17.0.1:40926] "rpush" "outgoing:abcde" "packet:broker-42:31"
1654887374.752954 [0 172.17.0.1:40926] "rpush" "outgoing:abcde" "packet:broker-42:32"
1654887375.377706 [0 172.17.0.1:40938] "rpush" "outgoing:abcde" "packet:broker-42:42"
1654887375.393787 [0 172.17.0.1:40938] "rpush" "outgoing:abcde" "packet:broker-42:50"
1654887375.659640 [0 172.17.0.1:40950] "rpush" "outgoing:abcde" "packet:broker-42:42"
1654887375.683112 [0 172.17.0.1:40950] "rpush" "outgoing:abcde" "outgoing:abcde:broker-42:42"
1654887375.829636 [0 172.17.0.1:40972] "rpush" "will" "will:broker-42:12345"
1654887375.931536 [0 172.17.0.1:40986] "rpush" "will" "will:broker-42:12345"
1654887376.035975 [0 172.17.0.1:40998] "rpush" "will" "will:broker-42:42"
1654887376.044358 [0 172.17.0.1:40998] "rpush" "will" "will:anotherBroker:24"
1654887376.100919 [0 172.17.0.1:41010] "rpush" "will" "will:broker-42:42"
1654887376.175829 [0 172.17.0.1:41034] "rpush" "outgoing:ttlTest" "packet:1:42"

I will check later to see if I can find which test(s) cause the undefined.

Kind regards, Hans

seriousme avatar Jun 10 '22 19:06 seriousme

Ok, there were 7 occurences of undefined in Redis Monitor generated by npm test

The first 3:

1654887375.951618 [0 172.17.0.1:40986] "lrem" "will" "0" "will:undefined:12345"
1654887375.958952 [0 172.17.0.1:40986] "get" "will:undefined:12345"
1654887375.961158 [0 172.17.0.1:40986] "del" "will:undefined:12345"

are caused by: https://github.com/moscajs/aedes-persistence/blob/main/abstract.js#L1483 Where:

brokerId: instance.broker.id,

needs to be added to the client object.

The next 3:

1654887376.057816 [0 172.17.0.1:40998] "lrem" "will" "0" "will:undefined:42"
1654887376.058510 [0 172.17.0.1:40998] "get" "will:undefined:42"
1654887376.059604 [0 172.17.0.1:40998] "del" "will:undefined:42"

are caused by: https://github.com/moscajs/aedes-persistence/blob/main/abstract.js#L1515 Where:

brokerId: instance.broker.id,

needs to be added to the client object. I would then add the same to anotherClient at https://github.com/moscajs/aedes-persistence/blob/main/abstract.js#L1518

The last one is:

1654887377.318774 [0 172.17.0.1:41046] "rpush" "outgoing:undefined" "packet:1:42"

And that is caused by:

https://github.com/moscajs/aedes-persistence-redis/blob/10a9fc8009dbc87e7be1a6c370057f9e9c95a0a3/test.js#L126

Where client.clientId needs to be replaced by client.id

With these fixes in I don't see any undefined anymore in Redis Monitor during npm test.

So the undefined found by @b-sirius-a1v must have another cause. Either the clientID contains an invalid character that can't be encoded by encodeURLComponent or something else goes wrong.

One thing that comes to mind is that brokerId is often used as part of a key, as I understood the brokerId is randomly generated by Aedes at startup and not read from the database. Could that be triggering this behaviour ?

Kind regards, Hans Ps. I have fixed all 3 items in the PR's mentioned below ;-)

seriousme avatar Jun 10 '22 22:06 seriousme

Small update: I just tried to replicate the issue on [email protected] with the following steps:

- start redis
- start example.js
- mqtt subscribe -i subscriber -t hello -q 1
- mqtt publish -i publisher -t hello -m message1 -q 1  

The subscriber now sees message1

- stop the subscriber
- mqtt publish -i publisher -t hello -m message2 -q 1
- mqtt subscribe -i subscriber -t hello -q 1
- mqtt publish -i publisher -t hello -m message3 -q 1

The subscriber only sees message3 and not message2.

So the problem seems to be older than 9.0.0.

Kind regards, Hans

seriousme avatar Jun 11 '22 11:06 seriousme

Ok, my bad :-( The exact scenario is:

- start redis
- start example.js
- mqtt subscribe -i subscriber -t hello -q 1
- stop example.js
- start example.js
- mqtt publish -i publisher -t hello -m message1 -q 1  
- mqtt subscribe -i subscriber -t hello -q 1 --no-clean

This shows message1 on 8.0.1

Now we stop everything, including Redis to clean the cache ! We upgrade to 9.0.0 repeat the procedure and message1 does not show up :-(

The Redis Monitor log shows for 9.0.0:

"info"
"info"
"info"
"psubscribe" "$SYS/sub/*"
"psubscribe" "$SYS/*/heartbeat"
"psubscribe" "$SYS/*/new/clients"
"smembers" "clients"
"hgetall" "client:subscriber"
"hgetall" "client:publisher"
"lrem" "will" "0" "will:undefined:publisher"
"get" "will:undefined:publisher"
"del" "will:undefined:publisher"
"publish" "$SYS/95a09045-b8f0-41ed-9779-76ce4d8ce44e/new/clients" "\x82\xa2id\xb823LNF/awT9GLd5nlXSfb2w/0\xa3msg\x88\xa3cmd\xa7publish\xa8brokerId\xd9$95a09045-b8f0-41ed-9779-76ce4d8ce44e\xadbrokerCounter\x01\xa5topic\xd95$SYS/95a09045-b8f0-41ed-9779-76ce4d8ce44e/new/clients\xa7payload\xc4\tpublisher\xa3qos\x00\xa6retain\xc2\xa3dup\xc2"
"lrange" "outgoing:publisher" "0" "1000"
"mset" "packet:95a09045-b8f0-41ed-9779-76ce4d8ce44e:2" "\x89\xa3cmd\xa7publish\xa8brokerId\xd9$95a09045-b8f0-41ed-9779-76ce4d8ce44e\xadbrokerCounter\x02\xa5topic\xa5hello\xa7payload\xc4\bmessage1\xa3qos\x01\xa6retain\xc2\xa3dup\xc2\xa9messageId\xc0" "packet:95a09045-b8f0-41ed-9779-76ce4d8ce44e:2:offlineCount" "1"
"rpush" "outgoing:undefined" "packet:95a09045-b8f0-41ed-9779-76ce4d8ce44e:2"
"publish" "hello" "\x82\xa2id\xb823LNF/awT9GLd5nlXSfb2w/1\xa3msg\x8a\xa3cmd\xa7publish\xa8brokerId\xd9$95a09045-b8f0-41ed-9779-76ce4d8ce44e\xadbrokerCounter\x02\xa5topic\xa5hello\xa7payload\xc4\bmessage1\xa3qos\x01\xa6retain\xc2\xa3dup\xc2\xa9messageId\xc0\xa8clientId\xa9publisher"
"publish" "$SYS/95a09045-b8f0-41ed-9779-76ce4d8ce44e/disconnect/clients" "\x82\xa2id\xb823LNF/awT9GLd5nlXSfb2w/2\xa3msg\x88\xa3cmd\xa7publish\xa8brokerId\xd9$95a09045-b8f0-41ed-9779-76ce4d8ce44e\xadbrokerCounter\x03\xa5topic\xd9<$SYS/95a09045-b8f0-41ed-9779-76ce4d8ce44e/disconnect/clients\xa7payload\xc4\tpublisher\xa3qos\x00\xa6retain\xc2\xa3dup\xc2"

The Redis Monitor log shows for 8.0.1:

"info"
"info"
"info"
"psubscribe" "$SYS/sub/*"
"psubscribe" "$SYS/*/heartbeat"
"psubscribe" "$SYS/*/new/clients"
"smembers" "clients"
"hgetall" "client:subscriber"
"hgetall" "client:publisher"
"lrem" "will" "0" "will:undefined:publisher"
"get" "will:undefined:publisher"
"del" "will:undefined:publisher"
"publish" "$SYS/931593e9-0ace-44be-86e0-c09b5cec14af/new/clients" "\x82\xa2id\xb8+8tEq2ANRvyKhyTRfMwxbg/0\xa3msg\x88\xa3cmd\xa7publish\xa8brokerId\xd9$931593e9-0ace-44be-86e0-c09b5cec14af\xadbrokerCounter\x01\xa5topic\xd95$SYS/931593e9-0ace-44be-86e0-c09b5cec14af/new/clients\xa7payload\xc4\tpublisher\xa3qos\x00\xa6retain\xc2\xa3dup\xc2"
"lrange" "outgoing:publisher" "0" "1000"
"mset" "packet:931593e9-0ace-44be-86e0-c09b5cec14af:2" "\x89\xa3cmd\xa7publish\xa8brokerId\xd9$931593e9-0ace-44be-86e0-c09b5cec14af\xadbrokerCounter\x02\xa5topic\xa5hello\xa7payload\xc4\bmessage1\xa3qos\x01\xa6retain\xc2\xa3dup\xc2\xa9messageId\xc0" "packet:931593e9-0ace-44be-86e0-c09b5cec14af:2:offlineCount" "1"
"rpush" "outgoing:subscriber" "packet:931593e9-0ace-44be-86e0-c09b5cec14af:2"
"publish" "hello" "\x82\xa2id\xb8+8tEq2ANRvyKhyTRfMwxbg/1\xa3msg\x8a\xa3cmd\xa7publish\xa8brokerId\xd9$931593e9-0ace-44be-86e0-c09b5cec14af\xadbrokerCounter\x02\xa5topic\xa5hello\xa7payload\xc4\bmessage1\xa3qos\x01\xa6retain\xc2\xa3dup\xc2\xa9messageId\xc0\xa8clientId\xa9publisher"
"publish" "$SYS/931593e9-0ace-44be-86e0-c09b5cec14af/disconnect/clients" "\x82\xa2id\xb8+8tEq2ANRvyKhyTRfMwxbg/2\xa3msg\x88\xa3cmd\xa7publish\xa8brokerId\xd9$931593e9-0ace-44be-86e0-c09b5cec14af\xadbrokerCounter\x03\xa5topic\xd9<$SYS/931593e9-0ace-44be-86e0-c09b5cec14af/disconnect/clients\xa7payload\xc4\tpublisher\xa3qos\x00\xa6retain\xc2\xa3dup\xc2"

The major difference being in 9.0.0:

"rpush" "outgoing:undefined" "packet:95a09045-b8f0-41ed-9779-76ce4d8ce44e:2"

instead of in 8.0.1:

"rpush" "outgoing:subscriber" "packet:931593e9-0ace-44be-86e0-c09b5cec14af:2"

Now earlier tests have already shown that rpush gets a clientId in tests. But appearently the tests do not fully match the real life scenario :-(

Kind regards, Hans

seriousme avatar Jun 11 '22 12:06 seriousme

OK, I did some more digging and found the issue :-)

The issue is not with the rpush but with rebuilding the trie on startup, caused by https://github.com/moscajs/aedes-persistence-redis/pull/96/commits/7d517a70948f1fb44d7dd650c491bd382c10e9b4.

Unfortunately in: https://github.com/moscajs/aedes-persistence-redis/blob/10a9fc8009dbc87e7be1a6c370057f9e9c95a0a3/persistence.js#L232

we should have put:

that._db.hgetallBuffer(clientSubKey(clientId), function clientHash (err, hash) {

since we also do: https://github.com/moscajs/aedes-persistence-redis/blob/10a9fc8009dbc87e7be1a6c370057f9e9c95a0a3/persistence.js#L128

An similarly we should use msgpack.decode when processing the result in: https://github.com/moscajs/aedes-persistence-redis/blob/10a9fc8009dbc87e7be1a6c370057f9e9c95a0a3/persistence.js#L565

Fixing those 2 results in correct loading of stored subscriptions on restart.

I will add a PR for this.

Kind regards, Hans

seriousme avatar Jun 11 '22 17:06 seriousme

Btw: having some method of type checking would have helped in avoiding the need for all 3 PR's ;-)

seriousme avatar Jun 11 '22 17:06 seriousme

One more thing:

"lrem" "will" "0" "will:undefined:publisher"
"get" "will:undefined:publisher"
"del" "will:undefined:publisher"

seems to be a bug in Aedes itself, caused by inconsistencies in the interface :-(

The sequence comes from the delWill function.

What seems to be a bit odd is that dellWill gets its brokerId as a parameter from Aedes where getWill and putWill rely on this.broker.id in aedes-persistence-redis which is derived from this.broker.id in Aedes.

Looking at Aedes there are 4 locations that call delWill:

  1. the test does not pass a brokerId at all, so it seems to assume that delWill relies on this.broker.id
  2. lib/handlers.connect.js does not seem to pass a brokerId either
  3. aedes.js seems to explicitly assume that brokerId is already encoded with the will
  4. /lib/client.js is the only one that adds `that.broker.id

My suggestion would be to explicitly pass the brokerId tot getWill and putWill as well and remove the implicit dependency on this.broker.id (getWill and putWill are the only functions relying on this.brokerId) and thereby create a consistent interface ;-)

Kind regards, Hans

seriousme avatar Jun 12 '22 10:06 seriousme

What seems to be a bit odd is that dellWill gets its brokerId as a parameter from Aedes where getWill and putWill rely on this.broker.id in aedes-persistence-redis which is derived from this.broker.id in Aedes.

This makes sense as in a cluster env dellWill should delete any existing will message with that clientId without considering brokerId that could be a different one.

If you see mongodb persistence it uses only the client id to identify a will message.

This is a bug that should be fixed on aedes persistence IMO, the will key should not contain the broker id

robertsLando avatar Jun 13 '22 09:06 robertsLando

Edit: I see that NPM is still on 9.0.0, so following may be somewhat irrelevant.

I think this is still broken.

relevant package dependencies "aedes": "^0.47.0", "aedes-persistence": "^9.1.2", "aedes-persistence-redis": "^9.0.0"

in subscriptionsByTopic, I'm logging as follows:

  subscriptionsByTopic (topic, cb) {
    if (!this.ready) {
      this.once('ready', this.subscriptionsByTopic.bind(this, topic, cb))
      return this
    }
    const result = this._trie.match(topic)
    console.log('subscriptionsByTopic', topic, result);
    cb(null, result)
  }

which yields -

subscriptionsByTopic devices/test [
  {
    clientId: undefined,
    topic: undefined,
    qos: undefined,
    rh: undefined,
    rap: undefined,
    nl: undefined
  }
]

Which of course inserts undefined

Screenshot 2022-07-08 08 37 11

friesendrywall avatar Jul 08 '22 12:07 friesendrywall

I'm not sure which is the correct place to post, but it would be good to update dependencies to require the correct ones, as well as release redis persistence to npm, otherwise the standard npm install aedes-persistence-redis is going to yield these results.

I think it requires "aedes-persistence": "^9.1.2",

friesendrywall avatar Jul 08 '22 13:07 friesendrywall

It already uses aedes-persistence 9.1.2, it uses aedes-cached-persistence 9 that uses it: https://github.com/moscajs/aedes-cached-persistence/blob/master/package.json

robertsLando avatar Jul 08 '22 14:07 robertsLando