Panic when restarting an MQTT source rule

Open TKthink opened this issue 1 year ago • 5 comments

Environment:

  • eKuiper version (e.g. 1.3.0): 1.14.1, v2alpha.10
  • Hardware configuration (e.g. lscpu): armv7; also tried x86_64
  • OS (e.g. cat /etc/os-release): custom Linux
  • Others:

What happened and what you expected to happen: The process panics when an MQTT source rule is restarted. Expected: eKuiper should not deliberately panic outside the program initialization phase.

How to reproduce it (as minimally and precisely as possible): Create an MQTT source with a clientid, then create a single rule that uses this source, and finally start and stop the rule repeatedly. After a few switches, it panics.

Anything else we need to know?: panic info

panic: Failed to subscribe topic /lab/test not currently connected and ResumeSubs not set

goroutine 1189 [running]:
github.com/lf-edge/ekuiper/internal/io/mqtt.(*Connection).onConnect(0xc0006220c0, {0x2781a08, 0xc0000a8a08})
    C:/Users/xxx/Desktop/demo project/ekuiper/internal/io/mqtt/connection.go:66 +0x291
created by github.com/eclipse/paho%2emqtt%2egolang.(*client).startCommsWorkers in goroutine 966
    C:/Users/xxx/Desktop/demo project/ekuiper/vendor/github.com/eclipse/paho.mqtt.golang/client.go:617 +0xb85

TKthink avatar Aug 07 '24 03:08 TKthink

https://github.com/lf-edge/ekuiper/blob/0cef7657f254b24ad32192f8692d05e77d06cc4c/internal/io/mqtt/connection_pool.go#L31

About this code: why is the connectionSelector value used as the client ID, instead of the clientid property itself? Reading connectionSelector always returns "", so the connection pool always stays empty.

TKthink avatar Aug 07 '24 03:08 TKthink

This is the eKuiper-level connection selector, which is used to share a connection between rules. It is different from the MQTT client ID.

ngjaying avatar Aug 07 '24 09:08 ngjaying

@TKthink Could you provide the rule spec and describe how you switch the rule repeatedly?

Yisaer avatar Aug 07 '24 09:08 Yisaer

Here is the rule:

{
  "id": "rule_dc3e",
  "name": "",
  "triggered": true,
  "sql": "SELECT\n *\nFROM\n control",
  "actions": [
    {
      "log": {}
    }
  ],
  "options": {
    "restartStrategy": {
      "attempts": 0,
      "delay": 5000,
      "multiplier": 1,
      "maxDelay": 30000,
      "jitterFactor": 0.1
    },
    "debug": false,
    "isEventTime": false,
    "sendMetaToSink": false,
    "concurrency": 1,
    "lateTolerance": 0,
    "bufferLength": 1024,
    "qos": 0,
    "checkpointInterval": 300000
  }
}

Here are the steps to reproduce the bug: create an MQTT stream with a clientid (the clientid must be specified), then create a single rule that uses this stream. Keep the rule in the stopped state and restart kuiperd. Then start the rule, stop it, start it, and stop it again. After several such switches, wait a while, and it will panic like this:

Serving kuiper (version - unknown) on port 20498, and restful api on http://0.0.0.0:9081.

panic: Failed to subscribe topic /neuron/write/req/t0000000001/lab1: timeout

goroutine 666 [running]:
github.com/lf-edge/ekuiper/internal/io/mqtt.(*Connection).onConnect(0xc00061a280, {0x26c1a08, 0xc000367408})
    C:/Users/wangl105/Desktop/demo project/ekuiper/internal/io/mqtt/connection.go:66 +0x291
created by github.com/eclipse/paho%2emqtt%2egolang.(*client).startCommsWorkers in goroutine 647
    C:/Users/wangl105/Desktop/demo project/ekuiper/vendor/github.com/eclipse/paho.mqtt.golang/client.go:617 +0xb85

panic: Failed to subscribe topic /neuron/write/req/t0000000001/lab1: connection lost before Subscribe completed

goroutine 607 [running]:
github.com/lf-edge/ekuiper/internal/io/mqtt.(*Connection).onConnect(0xc000200100, {0x26c1a08, 0xc000366c88})
    C:/Users/wangl105/Desktop/demo project/ekuiper/internal/io/mqtt/connection.go:66 +0x291
created by github.com/eclipse/paho%2emqtt%2egolang.(*client).startCommsWorkers in goroutine 537
    C:/Users/wangl105/Desktop/demo project/ekuiper/vendor/github.com/eclipse/paho.mqtt.golang/client.go:617 +0xb85

Process 26928 has exited with status 2
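
For anyone trying to reproduce this, the start/stop toggling could be scripted roughly as below. This is only a sketch: it assumes the REST API's rule start and stop endpoints (POST /rules/{id}/start and POST /rules/{id}/stop) on the port 9081 shown in the log above. The helper name toggleURLs, the cycle count, and the pause between requests are illustrative choices, not values taken from this report.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// toggleURLs (hypothetical helper) lists the endpoints to call, in order,
// for the given number of start/stop cycles of one rule.
func toggleURLs(base, ruleID string, cycles int) []string {
	urls := make([]string, 0, 2*cycles)
	for i := 0; i < cycles; i++ {
		urls = append(urls,
			fmt.Sprintf("%s/rules/%s/start", base, ruleID),
			fmt.Sprintf("%s/rules/%s/stop", base, ruleID),
		)
	}
	return urls
}

func main() {
	client := &http.Client{Timeout: 5 * time.Second}
	// Base URL and rule ID follow the log and rule spec above; 5 cycles is arbitrary.
	for _, u := range toggleURLs("http://127.0.0.1:9081", "rule_dc3e", 5) {
		resp, err := client.Post(u, "application/json", nil)
		if err != nil {
			fmt.Println("request failed:", err)
			return
		}
		resp.Body.Close()
		fmt.Println(u, "->", resp.Status)
		// Assumed pause; the report does not say how fast the rule was switched.
		time.Sleep(500 * time.Millisecond)
	}
}
```

After the loop finishes, keep kuiperd running for a while, since the panic was reported to show up some time after the last switch.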

TKthink avatar Aug 12 '24 03:08 TKthink

My suggestion: do not deliberately panic outside the program initialization phase. A panic at runtime takes down the whole process and destroys stability.
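
To illustrate the suggestion, here is a minimal sketch of a connect callback that reports a subscribe failure on an error channel instead of panicking. It is not the eKuiper code and does not use the paho API: the subscriber interface, the connection type, and flakyClient are hypothetical stand-ins that only show the shape of the idea.

```go
package main

import (
	"errors"
	"fmt"
)

// subscriber is a hypothetical stand-in for an MQTT client (not the paho interface).
type subscriber interface {
	Subscribe(topic string) error
}

// connection is a hypothetical connection that resubscribes its topics on connect.
type connection struct {
	client subscriber
	topics []string
	errCh  chan<- error // failures go here so the owning rule can decide what to do
}

// onConnect resubscribes every topic and reports failures instead of panicking.
func (c *connection) onConnect() {
	for _, t := range c.topics {
		if err := c.client.Subscribe(t); err != nil {
			wrapped := fmt.Errorf("failed to subscribe topic %s: %w", t, err)
			select {
			case c.errCh <- wrapped:
			default: // nobody is listening or the buffer is full: drop, never crash
			}
		}
	}
}

// flakyClient always fails, simulating the timeout seen in the report.
type flakyClient struct{}

func (flakyClient) Subscribe(string) error { return errors.New("timeout") }

func main() {
	errCh := make(chan error, 1)
	c := &connection{client: flakyClient{}, topics: []string{"/lab/test"}, errCh: errCh}
	c.onConnect()
	fmt.Println(<-errCh) // prints "failed to subscribe topic /lab/test: timeout"
}
```

With this shape, a failed subscription becomes an ordinary error that a rule's restart strategy could handle, and the rest of the process keeps running.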

TKthink avatar Aug 12 '24 03:08 TKthink

@TKthink This should be fixed in master. Could you give it a try? Reopen this issue if you still have problems.

ngjaying avatar Aug 29 '24 03:08 ngjaying