js-libp2p
[switch] Maximum call stack size exceeded
```
/Users/alan/Code/protocol-labs/js-ipfs/node_modules/libp2p-switch/src/observer.js:41
    peerInfo.then((_peerInfo) => {
             ^

RangeError: Maximum call stack size exceeded
    at Promise.then (<anonymous>)
    at willObserve (/Users/alan/Code/protocol-labs/js-ipfs/node_modules/libp2p-switch/src/observer.js:41:14)
    at pull.map (/Users/alan/Code/protocol-labs/js-ipfs/node_modules/libp2p-switch/src/observer.js:34:9)
    at /Users/alan/Code/protocol-labs/js-ipfs/node_modules/pull-stream/throughs/map.js:13:23
    at /Users/alan/Code/protocol-labs/js-ipfs/node_modules/pull-reader/index.js:114:13
    at /Users/alan/Code/protocol-labs/js-ipfs/node_modules/pull-reader/index.js:114:13
    at onNext (/Users/alan/Code/protocol-labs/js-ipfs/node_modules/pull-catch/index.js:17:17)
    at drain (/Users/alan/Code/protocol-labs/js-ipfs/node_modules/stream-to-pull-stream/index.js:141:18)
    at Channel.<anonymous> (/Users/alan/Code/protocol-labs/js-ipfs/node_modules/stream-to-pull-stream/index.js:150:5)
    at Channel.emit (events.js:182:13)
```
@alanshaw any other info on the state of the node when this happened? Testing/idle node/etc.
Sorry, should have left more context - just wanted to get this logged before I forgot!
It's a JS IPFS daemon running on this branch https://github.com/ipfs/js-ipfs/pull/856
I'm using IPFS Companion. I opened the WebUI, looked around a bit, and clicked on the peers page to see the hundreds of nodes (very exciting, btw); shortly after, my daemon crashed with that error.
Biggest stack I have ever seen https://ci.ipfs.team/blue/rest/organizations/jenkins/pipelines/IPFS/pipelines/js-ipfs/branches/PR-856/runs/80/nodes/257/steps/386/log/?start=0
This appears to be an issue with the pinning code spamming DHT requests in js-ipfs. When checking for descendants in a pin set, it calls out to the network before the local check has completed. Since the code executes dag.get in parallel, it ends up flooding peers with DHT queries on large pin sets. Calling dag.get in series until the underlying pinning problem is fixed should mitigate this. We'll get the necessary updates added in conjunction with ipfs/js-ipfs#856.
cc: @vasco-santos
Wow! I've found that as well in a lot of bubbleprof/flame diagrams. I think that usage of async should be refactored.
Seeing this in Jenkins again today:
```
2018-12-19T11:47:31.635Z ipfsd-ctl:daemon:stderr /tmp/jenkins/workspace/PFS_interop_test-with-debug-NUL3VSMMUZYNRCKUQJKWVW6DEXXEDFCXGEMGCBOBYZCZKZUXXGHA@2/node_modules/libp2p-switch/src/observer.js:41
    peerInfo.then((_peerInfo) => {
             ^

RangeError: Maximum call stack size exceeded
    at Promise.then (<anonymous>)
    at willObserve (/tmp/jenkins/workspace/PFS_interop_test-with-debug-NUL3VSMMUZYNRCKUQJKWVW6DEXXEDFCXGEMGCBOBYZCZKZUXXGHA@2/node_modules/libp2p-switch/src/observer.js:41:14)
    at pull.map (/tmp/jenkins/workspace/PFS_interop_test-with-debug-NUL3VSMMUZYNRCKUQJKWVW6DEXXEDFCXGEMGCBOBYZCZKZUXXGHA@2/node_modules/libp2p-switch/src/observer.js:34:9)
    at /tmp/jenkins/workspace/PFS_interop_test-with-debug-NUL3VSMMUZYNRCKUQJKWVW6DEXXEDFCXGEMGCBOBYZCZKZUXXGHA@2/node_modules/pull-stream/throughs/map.js:13:23
    at /tmp/jenkins/workspace/PFS_interop_test-with-debug-NUL3VSMMUZYNRCKUQJKWVW6DEXXEDFCXGEMGCBOBYZCZKZUXXGHA@2/node_modules/pull-reader/index.js:114:13
    at /tmp/jenkins/workspace/PFS_interop_test-with-debug-NUL3VSMMUZYNRCKUQJKWVW6DEXXEDFCXGEMGCBOBYZCZKZUXXGHA@2/node_modules/pull-reader/index.js:114:13
    at /tmp/jenkins/workspace/PFS_interop_test-with-debug-NUL3VSMMUZYNRCKUQJKWVW6DEXXEDFCXGEMGCBOBYZCZKZUXXGHA@2/node_modules/pull-reader/index.js:114:13
    at /tmp/jenkins/workspace/PFS_interop_test-with-debug-NUL3VSMMUZYNRCKUQJKWVW6DEXXEDFCXGEMGCBOBYZCZKZUXXGHA@2/node_modules/pull-stream/throughs/async-map.js:39:20
    at decipher.decrypt (/tmp/jenkins/workspace/PFS_interop_test-with-debug-NUL3VSMMUZYNRCKUQJKWVW6DEXXEDFCXGEMGCBOBYZCZKZUXXGHA@2/node_modules/libp2p-secio/src/etm.js:64:11)
    at Object.decrypt (/tmp/jenkins/workspace/PFS_interop_test-with-debug-NUL3VSMMUZYNRCKUQJKWVW6DEXXEDFCXGEMGCBOBYZCZKZUXXGHA@2/node_modules/libp2p-crypto/src/aes/index.js:25:7)
```
The DHT code is not present in this build.
@achingbrain can you link the build?
This is likely my fault. I removed a setImmediate where the event was emitted. It should either a) be reintroduced or b) be replaced with a process.nextTick.
@jacobheun https://ci.ipfs.team/blue/rest/organizations/jenkins/pipelines/IPFS/pipelines/interop/branches/test-with-debug/runs/2/nodes/254/log/?start=0
This was the PR: https://github.com/libp2p/js-libp2p-switch/pull/289.
@mcollina this one? https://github.com/libp2p/js-libp2p-switch/pull/289/files#diff-08f3e9ab539253c87ea111cc61c69e29
Aha, yeah.
yes.
We were seeing the call stack prior to that switch update being released. It might be contributing, but I don't think it's the core issue.
I would recommend adding an Error.stackTraceLimit = 1024 somewhere to get a bit more than 10 lines.
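For reference, that looks like the following; 10 frames is V8's default, and the limit has to be raised before the Error is constructed:

```javascript
// Raise V8's stack trace limit (default: 10 frames) so deep recursive
// crashes like this one show the full call chain.
Error.stackTraceLimit = 1024

function recurse (n) {
  if (n === 0) throw new Error('deep enough')
  recurse(n - 1)
}

try {
  recurse(100)
} catch (err) {
  // With the default limit this trace would stop after 10 frames.
  console.log(err.stack.split('\n').length, 'stack lines captured')
}
```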
Okay, I've got a branch of the interop tests passing based off the latest js-ipfs release candidate: https://ci.ipfs.team/blue/organizations/jenkins/IPFS%2Finterop/detail/test%2Ffix-async/4/pipeline/254.
The underlying issue seems to be a lot of asynchronous code calling its callbacks synchronously, which bloats the stack. I'll have a few PRs out shortly for the code I updated to help alleviate the issue.
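That failure mode can be shown with a minimal sketch (assuming, as described above, that each "async" step invokes its callback synchronously):

```javascript
// "Async-looking" iteration that calls back synchronously: every step
// adds a frame to the same stack, so a large input overflows it.
function eachSync (n, cb) {
  if (n === 0) return cb()
  eachSync(n - 1, cb)
}

// Deferring each step with setImmediate unwinds the stack between
// iterations, so the same shape of loop completes fine.
function eachDeferred (n, cb) {
  if (n === 0) return cb()
  setImmediate(() => eachDeferred(n - 1, cb))
}

try {
  eachSync(1e6, () => console.log('sync done'))
} catch (err) {
  console.log(err.message) // Maximum call stack size exceeded
}

eachDeferred(1e5, () => console.log('deferred done'))
```

The trade-off is throughput: a setImmediate per item costs an event-loop turn, which is why libraries often defer only every N iterations rather than on each one.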
I'm seeing this too, but don't have any helpful info except that it happens for me after lots of bitswap... stack traces aren't helpful.
The code has been reimplemented since this issue was opened; closing as stale.