Skupper on ARM keeps restarting
Skupper 1.5.0 running on a Raspberry Pi 4 keeps restarting with the following logs:
2023-11-13 16:54:38.811339 +0000 SERVER (error) [C2884] Connection from ::1:53398 (to localhost:5672) failed: amqp:connection:framing-error connection aborted
2023-11-13 16:54:38.812755 +0000 SERVER (error) [C2891] Connection from ::1:53470 (to localhost:5672) failed: amqp:connection:framing-error connection aborted
2023-11-13 16:54:38.814129 +0000 SERVER (error) [C2883] Connection from ::1:53396 (to localhost:5672) failed: amqp:connection:framing-error connection aborted
and
2023-11-13 16:55:32.057049 +0000 SERVER (error) [C2880] Connection to 10.246.1.72:55671 failed: proton:io Connection timed out - disconnected 10.246.1.72:55671
2023-11-13 16:55:32.227544 +0000 FLOW_LOG (info) LOG [8bkk6:2769] BEGIN END parent=8bkk6:0 logSeverity=3 logText=LOG_SERVER: [C2880] Connection to 10.246.1.72:55671 failed: proton:io Connection timed out - disconnected 10.246.1.72:55671 sourceFile=/build/src/server.c sourceLine=1084
2023-11-13 16:55:34.092963 +0000 SERVER (error) [C2881] Connection to 10.246.1.72:55671 failed: proton:io Connection timed out - disconnected 10.246.1.72:55671
2023-11-13 16:55:34.235120 +0000 FLOW_LOG (info) LOG [8bkk6:2770] BEGIN END parent=8bkk6:0 logSeverity=3 logText=LOG_SERVER: [C2881] Connection to 10.246.1.72:55671 failed: proton:io Connection timed out - disconnected 10.246.1.72:55671 sourceFile=/build/src/server.c sourceLine=1084
2023-11-13 16:55:38.227183 +0000 SERVER (error) [C2882] Connection to 10.246.1.72:55671 failed: proton:io Connection timed out - disconnected 10.246.1.72:55671
2023-11-13 16:55:38.246242 +0000 FLOW_LOG (info) LOG [8bkk6:2771] BEGIN END parent=8bkk6:0 logSeverity=3 logText=LOG_SERVER: [C2882] Connection to 10.246.1.72:55671 failed: proton:io Connection timed out - disconnected 10.246.1.72:55671 sourceFile=/build/src/server.c sourceLine=1084
The deployment works for ~1-4 minutes and even shows remote sites and exposed services, with access to the various services working accordingly. After that period, TLS seems to get out of sync (maybe due to a hardware limitation?), the pods are restarted, and the same behavior repeats (works for 1-4 min, then fails).
I understand that Skupper on ARM is not supported in this regard at the moment; still, I want to make everyone aware of the possible issues we might face with ARM-based deployments.
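Since the pods restart and take the evidence with them, it may help to capture the previous container's logs and the pod events right after a restart. A minimal sketch (assuming the `skupper` namespace and the default deployment/container names; `kubectl` can be swapped for `oc` as used later in this thread):

```shell
# Logs from the router container instance that just crashed/restarted
kubectl -n skupper logs deploy/skupper-router -c router --previous

# Logs from the service controller's previous instance
kubectl -n skupper logs deploy/skupper-service-controller --previous

# Recent events (OOMKilled, liveness-probe failures, etc.), newest last
kubectl -n skupper get events --sort-by=.lastTimestamp

# Restart counts and the last termination reason per router pod
kubectl -n skupper get pods
kubectl -n skupper describe pod -l skupper.io/component=router
```

If the events show OOMKilled or failing liveness probes, that would support the hardware-limitation theory; a clean exit with only the framing errors above would point elsewhere.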
Here's another error I was able to capture in the service-controller/flow-collector pod:
[Beacon detector module starting]
[API module starting]
API server listening on port 8010
Connection to the VAN is open
New ROUTER detected: zhg6s:0
New ROUTER detected: hbpjt:0
New ROUTER detected: qxdth:0
New CONTROLLER detected: cfa7a05c-d9bc-464c-a485-819add8f4a76
Sending FLUSH to sfe.zhg6s:0
Sending FLUSH to sfe.hbpjt:0
New CONTROLLER detected: 6e9774e6-02ff-42c6-8f85-9a63d0734605
New CONTROLLER detected: cb1c35c7-8d48-4eed-9b25-2d07f1ec15b3
New ROUTER detected: qgnsz:0
Sending FLUSH to sfe.qxdth:0
New CONTROLLER detected: 62737c3d-13d4-4c09-82bf-449625b5eeaf
New CONTROLLER detected: af10fd96-bce9-4fb7-8585-87f60810ff9e
New ROUTER detected: rg2dg:0
New ROUTER detected: 8bkk6:0
Sending FLUSH to sfe.cfa7a05c-d9bc-464c-a485-819add8f4a76
events.js:174
throw er; // Unhandled 'error' event
^
TypeError: Cannot read property 'push' of undefined
at new Record (/usr/src/src/data.js:122:32)
at Object.exports.IncomingRecord (/usr/src/src/data.js:503:23)
at recordList.forEach.item (/usr/src/src/network.js:123:18)
at Array.forEach (<anonymous>)
at Container.<anonymous> (/usr/src/src/network.js:121:20)
at Container.emit (events.js:198:13)
at Container.dispatch (/usr/src/node_modules/rhea/lib/container.js:41:33)
at Connection.dispatch (/usr/src/node_modules/rhea/lib/connection.js:261:40)
at Session.dispatch (/usr/src/node_modules/rhea/lib/session.js:456:41)
at Receiver.link.dispatch (/usr/src/node_modules/rhea/lib/link.js:62:38)
Emitted 'error' event at:
at Container.dispatch (/usr/src/node_modules/rhea/lib/container.js:41:33)
at Connection.dispatch (/usr/src/node_modules/rhea/lib/connection.js:261:40)
at Connection.input (/usr/src/node_modules/rhea/lib/connection.js:574:18)
at TLSSocket.emit (events.js:198:13)
at addChunk (_stream_readable.js:288:12)
at readableAddChunk (_stream_readable.js:269:11)
at TLSSocket.Readable.push (_stream_readable.js:224:10)
at TLSWrap.onStreamRead [as onread] (internal/stream_base_commons.js:94:17)
What image is that log from? (It is a Node.js-based image, which is not the standard flow collector.)
@grs it's based on https://github.com/skupperproject/skupper/blob/main/Dockerfile.flow-collector
I don't think it can be, as that is a Go-based collector and the trace is clearly from a Node.js-based program.
For the record, that backtrace is from the prototype collector (Node.js). Can you run `skupper version` in that environment to see which images are being used?
Hi Ted,
I picked the Dockerfiles from the repo ... :?
$ skupper -c pi4 -n skupper version
client version 1.4.1
transport version quay.example.com/skupper/skupper-router:2.5.0 (sha256:51f8ab009232)
controller version not-found
config-sync version quay.example.com/skupper/config-sync:1.5.0 (sha256:e60cfee4c09a)
flow-collector version not-found
$ oc --context pi4 -n skupper exec -ti deploy/skupper-service-controller -- ./service-controller -version
1.5.0
[runner@skupper-router-ffb9458b9-nvnt8 bin]$ skrouterd -v
0.0.0
[runner@skupper-router-ffb9458b9-nvnt8 bin]$ skmanage --version
0.0.0
[runner@skupper-router-ffb9458b9-nvnt8 bin]$ skstat --version
0.0.0
[root@pi4 skupper-router]# git config remote.origin.url
https://github.com/skupperproject/skupper-router
[root@pi4 skupper-router]# git branch
* main
# Containerfile used for build
[root@pi4 skupper]# git config remote.origin.url
https://github.com/skupperproject/skupper.git
[root@pi4 skupper]# git branch
* (HEAD detached at 1.5.0)
main
# Dockerfile.ci-test Dockerfile.config-sync Dockerfile.controller-podman Dockerfile.flow-collector Dockerfile.service-controller Dockerfile.site-controller used for build
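Since these images were built locally from the repo for the Pi, one thing worth ruling out is an image built for the wrong architecture or from a stale checkout. A cross/multi-arch build with `docker buildx` is one way to produce arm64 images reproducibly (a sketch only; the `quay.example.com` registry and the 1.5.0 tag are taken from the `skupper version` output above, adjust to your setup):

```shell
# One-time: create and select a builder that can target arm64
docker buildx create --name multiarch --use

# Build and push an arm64 flow-collector image from the skupper repo checkout
docker buildx build \
  --platform linux/arm64 \
  -f Dockerfile.flow-collector \
  -t quay.example.com/skupper/flow-collector:1.5.0 \
  --push .

# Verify the pushed manifest actually contains a linux/arm64 entry
docker buildx imagetools inspect quay.example.com/skupper/flow-collector:1.5.0
```

The `imagetools inspect` step is the quick sanity check: if the manifest only lists `linux/amd64`, the Pi would be running the image under emulation (or failing to), which could explain erratic behavior.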