pending dial queue never empties, preventing the node from accepting new connections
Version: libp2p 2.10.0
Platform:
Subsystem: ReconnectQueue
Severity: Critical

Description:
I have a fairly simple test where I dial a node, wait for my dialing node to confirm they share the same protocol (handler), close the connection at some point, and then do it all over again.
I can do this twice before my remote node no longer accepts new incoming dials. Logging its stats I see
connections total=0 inbound=0 outbound=0 | dialQueue pending=2
(it stays like this for more than 24 hours)
Steps to reproduce the error:
https://github.com/libp2p/js-libp2p/blob/main/packages/libp2p/src/connection-manager/reconnect-queue.ts
this.queue.add(async (options) => {
  await pRetry(async (attempt) => {
    if (!this.started) {
      return
    }

    try {
      await this.connectionManager.openConnection(peerId, {
        signal: options?.signal
      })
    } catch (err) {
      this.log('reconnecting to %p attempt %d of %d failed - %e', peerId, attempt, this.retries, err)
      throw err
    }
  }, {
    signal: options?.signal,
    retries: this.retries,
    factor: this.backoffFactor,
    minTimeout: this.retryInterval
  })
}, {
  peerId
})
In this code I put a log before "this.connectionManager.openConnection" and it seems to be stuck forever in that call. (I hot-patched in logging before this.connectionManager.openConnection and in a finally block, and I never see the finally block execute.)
I wonder whether a timeout signal should be passed to this.connectionManager.openConnection.
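A minimal sketch of what I mean, assuming Node's AbortSignal.timeout/AbortSignal.any are available (Node 20+); withDialTimeout and DIAL_TIMEOUT_MS are illustrative names, not the library's API:

```typescript
// Hypothetical sketch: combine the caller's signal with a per-attempt timeout
// so a stalled openConnection() cannot wedge the queue forever.
const DIAL_TIMEOUT_MS = 10_000

function withDialTimeout (signal?: AbortSignal, ms: number = DIAL_TIMEOUT_MS): AbortSignal {
  // AbortSignal.timeout aborts automatically after `ms` milliseconds
  const timeout = AbortSignal.timeout(ms)

  // if the caller passed a signal, abort when either one fires
  return signal == null ? timeout : AbortSignal.any([signal, timeout])
}

// usage inside the retry callback would then look like:
// await this.connectionManager.openConnection(peerId, {
//   signal: withDialTimeout(options?.signal)
// })
```
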
Another problematic code path I am thinking about: if a peer redials before a connection is set up, are we closing/aborting the in-flight call so we can restart immediately without having to wait for a potential timeout?
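That second concern could be sketched as keeping one AbortController per peer and aborting any in-flight reconnect attempt when the peer connects to us first. This is purely illustrative, not the library's implementation; ReconnectAborter is a made-up name:

```typescript
// Illustrative sketch only: abort an in-flight reconnect attempt when the
// peer dials us first, instead of waiting out the full dial timeout.
class ReconnectAborter {
  private readonly inflight = new Map<string, AbortController>()

  // call when queueing a reconnect attempt; returns the signal to dial with
  begin (peerId: string): AbortSignal {
    // a newer attempt supersedes any older in-flight one
    this.inflight.get(peerId)?.abort()
    const controller = new AbortController()
    this.inflight.set(peerId, controller)
    return controller.signal
  }

  // call when the peer connected to us (or the attempt finished)
  cancel (peerId: string): void {
    this.inflight.get(peerId)?.abort()
    this.inflight.delete(peerId)
  }
}
```
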
I apologize for not giving an isolated, reproducible example, but I wanted to write this issue quickly to raise awareness, get help, and let other devs see this too.
Could you add a reproducible example?
import { describe, it, expect, afterEach } from "vitest";
/* eslint-env mocha */
import { KEEP_ALIVE, type Libp2p } from '@libp2p/interface'
import { createLibp2p } from 'libp2p'
import { tcp, TCPOptions } from '@libp2p/tcp'
import { noise } from '@chainsafe/libp2p-noise'
import pDefer from 'p-defer'
const delay = (ms: number) => new Promise(resolve => setTimeout(resolve, ms))
const waitForResolve = async (fn: () => void, timeout: number = 1e4): Promise<void> => {
  const deadline = Date.now() + timeout
  let lastError: Error | null = null

  // poll until fn() stops throwing or the deadline passes
  while (Date.now() < deadline) {
    try {
      fn()
      return
    } catch (err) {
      lastError = err as Error
    }

    await delay(1e3)
  }

  throw lastError ?? new Error('Timeout waiting for resolve')
}
describe('core', () => {
  let c1: Libp2p, c2: Libp2p

  afterEach(async () => {
    await c1.stop()
    await c2.stop()
  })

  it('can redial 15 times', async () => {
    c1 = await createLibp2p({
      addresses: {
        listen: [
          '/ip4/0.0.0.0/tcp/0'
        ]
      },
      transports: [tcp()],
      connectionEncrypters: [noise()]
    })

    for (let i = 0; i < 15; i++) {
      console.log('restart', i)
      await c2?.stop()

      try {
        c2 = await createLibp2p({
          transports: [tcp()],
          connectionEncrypters: [noise()]
        })
        await c2.dial(c1.getMultiaddrs())
        await waitForResolve(() => expect(c1.getConnections(c2.peerId)).to.not.be.empty)
      } finally {
        await c2.stop()
      }
    }
  })
  it('can redial 15 times with keep alive', async () => {
    c1 = await createLibp2p({
      addresses: {
        listen: [
          '/ip4/0.0.0.0/tcp/0'
        ]
      },
      transports: [tcp()],
      connectionEncrypters: [noise()]
    })

    for (let i = 0; i < 15; i++) {
      console.log('restart', i)
      await c2?.stop()

      try {
        c2 = await createLibp2p({
          addresses: {
            listen: [
              '/ip4/0.0.0.0/tcp/0'
            ]
          },
          transports: [tcp()],
          connectionEncrypters: [noise()]
        })
        await c2.dial(c1.getMultiaddrs())
        await waitForResolve(() => expect(c1.getConnections(c2.peerId)).to.not.be.empty)
        await c1.peerStore.merge(c2.peerId, {
          multiaddrs: c2.getMultiaddrs(),
          tags: {
            [KEEP_ALIVE]: {
              value: 1
            }
          }
        })
      } finally {
        await c2.stop()
      }
    }
  })
  it('stalling openConnection does not block new dials', async () => {
    const deferred = pDefer()

    try {
      const tcpLaggy = (options: TCPOptions) => {
        const fn = tcp(options)

        return (components: any) => {
          const transport = fn(components)
          const dialFn = transport.dial.bind(transport)

          transport.dial = async (ma: any, options: any) => {
            console.log('----> reconnecting to', ma.toString())
            await deferred.promise // never allows a connection to be opened (when c1/the server is reconnecting back)
            console.error('should not happen')
            return dialFn(ma, options)
          }

          return transport
        }
      }

      c1 = await createLibp2p({
        addresses: {
          listen: [
            '/ip4/0.0.0.0/tcp/0'
          ]
        },
        transports: [tcpLaggy({})],
        connectionEncrypters: [noise()],
        connectionManager: {
          maxIncomingPendingConnections: 1,
          inboundConnectionThreshold: 1,
          reconnectRetries: 10,
          reconnectRetryInterval: 1e4,
          maxDialQueueLength: 10
        }
      })
      console.log('c1')
      console.log('c1 peerId', c1.peerId.toString())
      console.log('c1 multiaddrs', c1.getMultiaddrs())

      console.log('--------------- connect close loop ---------------')

      // make it start to re-connect but terminate the connection
      for (let i = 0; i < 5; i++) {
        console.log('restart', i)
        await c2?.stop()

        try {
          c2 = await createLibp2p({
            transports: [tcp()],
            connectionEncrypters: [noise()],
            addresses: {
              listen: [
                '/ip4/0.0.0.0/tcp/0' // important, we want c1 to be able to redial c2
              ]
            }
          })
          console.log('created c2')
          console.log('c2 peerId', c2.peerId.toString())
          console.log('c2 multiaddrs', c2.getMultiaddrs())
          await c2.dial(c1.getMultiaddrs())
          await waitForResolve(() => expect(c1.getConnections(c2.peerId)).to.not.be.empty)
          await c1.peerStore.merge(c2.peerId, {
            multiaddrs: c2.getMultiaddrs(),
            tags: {
              [KEEP_ALIVE]: {
                value: 1
              }
            }
          })
        } finally {
          console.log('stopping c2', c2.peerId.toString())
          await c2.stop()
        }
      }

      console.log('--------------- normal connection ---------------')
      console.log('Dial queue length', c1.getDialQueue().length)

      // now try a normal connection!
      try {
        c2 = await createLibp2p({
          transports: [tcp()],
          connectionEncrypters: [noise()],
          addresses: {
            listen: [
              '/ip4/0.0.0.0/tcp/0'
            ]
          }
        })
        await c2.dial(c1.getMultiaddrs())
        await waitForResolve(() => expect(c1.getConnections(c2.peerId)).to.not.be.empty)
        console.log('Successfully dialed normally again')
      } finally {
        await c2.stop()
      }
    } finally {
      deferred.resolve()
    }
  })
})
The last test does not reproduce anything yet, but the first two fail after 9 dials.
I am still investigating why .openConnection stalls forever for me in the reconnect queue in production use.
Thanks - I can replicate the issue.
If you run the tests with DEBUG=libp2p:*,*:trace you can see where it's failing. Will take a look.
restart 9
2025-09-21T09:24:44.829Z libp2p:transports adding transport @libp2p/tcp
2025-09-21T09:24:44.829Z libp2p:connection-manager started
2025-09-21T09:24:44.829Z libp2p:transports no addresses were provided for listening, this node is dial only
2025-09-21T09:24:44.830Z libp2p:connection-manager dial 12D3KooWBNoGfQv3a1KxHhsmUVzRZQ52gJb3k19sRe8zBMo4FRsg
2025-09-21T09:24:44.830Z libp2p:connection-manager:dial-queue creating dial target for 12D3KooWBNoGfQv3a1KxHhsmUVzRZQ52gJb3k19sRe8zBMo4FRsg [
'/ip4/127.0.0.1/tcp/58695/p2p/12D3KooWBNoGfQv3a1KxHhsmUVzRZQ52gJb3k19sRe8zBMo4FRsg'
]
2025-09-21T09:24:44.830Z libp2p:connection-manager:dial-queue starting dial to 12D3KooWBNoGfQv3a1KxHhsmUVzRZQ52gJb3k19sRe8zBMo4FRsg
2025-09-21T09:24:44.830Z libp2p:connection-manager:dial-queue calculating addrs to dial 12D3KooWBNoGfQv3a1KxHhsmUVzRZQ52gJb3k19sRe8zBMo4FRsg from [
'/ip4/127.0.0.1/tcp/58695/p2p/12D3KooWBNoGfQv3a1KxHhsmUVzRZQ52gJb3k19sRe8zBMo4FRsg'
]
2025-09-21T09:24:44.831Z libp2p:connection-manager:dial-queue:trace addresses for 12D3KooWBNoGfQv3a1KxHhsmUVzRZQ52gJb3k19sRe8zBMo4FRsg before filtering [
'/ip4/127.0.0.1/tcp/58695/p2p/12D3KooWBNoGfQv3a1KxHhsmUVzRZQ52gJb3k19sRe8zBMo4FRsg'
]
2025-09-21T09:24:44.831Z libp2p:connection-manager:dial-queue:trace addresses for 12D3KooWBNoGfQv3a1KxHhsmUVzRZQ52gJb3k19sRe8zBMo4FRsg after filtering [
'/ip4/127.0.0.1/tcp/58695/p2p/12D3KooWBNoGfQv3a1KxHhsmUVzRZQ52gJb3k19sRe8zBMo4FRsg'
]
2025-09-21T09:24:44.831Z libp2p:connection-manager:dial-queue starting dial to 12D3KooWBNoGfQv3a1KxHhsmUVzRZQ52gJb3k19sRe8zBMo4FRsg with [
'/ip4/127.0.0.1/tcp/58695/p2p/12D3KooWBNoGfQv3a1KxHhsmUVzRZQ52gJb3k19sRe8zBMo4FRsg'
]
2025-09-21T09:24:44.831Z libp2p:tcp dialing /ip4/127.0.0.1/tcp/58695/p2p/12D3KooWBNoGfQv3a1KxHhsmUVzRZQ52gJb3k19sRe8zBMo4FRsg
2025-09-21T09:24:44.832Z libp2p:tcp connection opened /ip4/127.0.0.1/tcp/58695/p2p/12D3KooWBNoGfQv3a1KxHhsmUVzRZQ52gJb3k19sRe8zBMo4FRsg
2025-09-21T09:24:44.832Z libp2p:tcp new outbound connection /ip4/127.0.0.1/tcp/58695/p2p/12D3KooWBNoGfQv3a1KxHhsmUVzRZQ52gJb3k19sRe8zBMo4FRsg
2025-09-21T09:24:44.833Z libp2p:tcp:connection:outbound:ck7df41758446684832:trace starting the outbound connection upgrade
2025-09-21T09:24:44.833Z libp2p:tcp:connection:outbound:ck7df41758446684832:trace selecting encrypter from [ '/noise' ]
2025-09-21T09:24:44.833Z libp2p:tcp:connection:outbound:ck7df41758446684832:trace select: write ["/multistream/1.0.0", "/noise"]
2025-09-21T09:24:44.833Z libp2p:tcp:connection:outbound:ck7df41758446684832:trace select: reading multistream-select header
2025-09-21T09:24:44.833Z libp2p:tcp:listener new inbound connection /ip4/127.0.0.1/tcp/58706
2025-09-21T09:24:44.833Z libp2p:tcp:connection:inbound:auttcg1758446684833:trace starting the inbound connection upgrade
2025-09-21T09:24:44.833Z libp2p:tcp:connection:inbound:auttcg1758446684833:trace handle: available protocols [ '/noise' ]
2025-09-21T09:24:44.833Z libp2p:tcp:connection:inbound:auttcg1758446684833:trace handle: reading incoming string
2025-09-21T09:24:44.833Z libp2p:tcp:connection:inbound:auttcg1758446684833:trace handle: read "/multistream/1.0.0"
2025-09-21T09:24:44.834Z libp2p:tcp:connection:inbound:auttcg1758446684833:trace handle: respond with "/multistream/1.0.0" for "/multistream/1.0.0"
2025-09-21T09:24:44.834Z libp2p:tcp:connection:inbound:auttcg1758446684833:trace handle: responded with "/multistream/1.0.0" for "/multistream/1.0.0"
2025-09-21T09:24:44.834Z libp2p:tcp:connection:inbound:auttcg1758446684833:trace handle: reading incoming string
2025-09-21T09:24:44.834Z libp2p:tcp:connection:inbound:auttcg1758446684833:trace handle: read "/noise"
2025-09-21T09:24:44.834Z libp2p:tcp:connection:inbound:auttcg1758446684833:trace handle: respond with "/noise" for "/noise"
2025-09-21T09:24:44.834Z libp2p:tcp:connection:inbound:auttcg1758446684833:trace handle: responded with "/noise" for "/noise"
2025-09-21T09:24:44.834Z libp2p:tcp:connection:inbound:auttcg1758446684833 encrypting inbound connection to /ip4/127.0.0.1/tcp/58706 using /noise
2025-09-21T09:24:44.834Z libp2p:noise:xxhandshake:trace Stage 0 - Responder waiting to receive first message.
2025-09-21T09:24:44.834Z libp2p:tcp:connection:outbound:ck7df41758446684832:trace select: read "/multistream/1.0.0"
2025-09-21T09:24:44.834Z libp2p:tcp:connection:outbound:ck7df41758446684832:trace select: reading protocol response
2025-09-21T09:24:44.834Z libp2p:tcp:connection:outbound:ck7df41758446684832:trace select: read "/noise"
2025-09-21T09:24:44.834Z libp2p:tcp:connection:outbound:ck7df41758446684832 encrypting outbound connection to /ip4/127.0.0.1/tcp/58695/p2p/12D3KooWBNoGfQv3a1KxHhsmUVzRZQ52gJb3k19sRe8zBMo4FRsg using /noise
2025-09-21T09:24:44.835Z libp2p:noise:xxhandshake:trace Stage 0 - Initiator starting to send first message.
2025-09-21T09:24:44.835Z libp2p:noise:xxhandshake:trace Stage 0 - Initiator finished sending first message.
2025-09-21T09:24:44.835Z libp2p:noise:xxhandshake:trace Stage 1 - Initiator waiting to receive first message from responder...
2025-09-21T09:24:44.836Z libp2p:noise:xxhandshake:trace Stage 0 - Responder received first message.
2025-09-21T09:24:44.836Z libp2p:noise:xxhandshake:trace Stage 1 - Responder sending out first message with signed payload and static key.
2025-09-21T09:24:44.839Z libp2p:noise:xxhandshake:trace Stage 1 - Responder sent the second handshake message with signed payload.
2025-09-21T09:24:44.839Z libp2p:noise:xxhandshake:trace Stage 2 - Responder waiting for third handshake message...
2025-09-21T09:24:44.841Z libp2p:noise:xxhandshake:trace Stage 1 - Initiator received the message.
2025-09-21T09:24:44.841Z libp2p:noise:xxhandshake:trace Initiator going to check remote's signature...
2025-09-21T09:24:44.841Z libp2p:noise:xxhandshake:trace All good with the signature!
2025-09-21T09:24:44.841Z libp2p:noise:xxhandshake:trace Stage 2 - Initiator sending third handshake message.
2025-09-21T09:24:44.842Z libp2p:noise:xxhandshake:trace Stage 2 - Initiator sent message with signed payload.
2025-09-21T09:24:44.842Z libp2p:connection-manager:connection-pruner checking max connections limit 1/300
2025-09-21T09:24:44.842Z libp2p:tcp:connection:outbound:ck7df41758446684832 successfully upgraded outbound connection
2025-09-21T09:24:44.843Z libp2p:connection-manager:dial-queue dial to /ip4/127.0.0.1/tcp/58695/p2p/12D3KooWBNoGfQv3a1KxHhsmUVzRZQ52gJb3k19sRe8zBMo4FRsg succeeded
2025-09-21T09:24:44.844Z libp2p:noise:xxhandshake:trace Stage 2 - Responder received the message, finished handshake.
2025-09-21T09:24:44.844Z libp2p:connection-manager:connection-pruner checking max connections limit 1/300
2025-09-21T09:24:44.844Z libp2p:tcp:connection:inbound:auttcg1758446684833 successfully upgraded inbound connection
2025-09-21T09:24:44.844Z libp2p:tcp:listener inbound connection upgraded /ip4/127.0.0.1/tcp/58706
2025-09-21T09:24:45.606Z libp2p:connection-monitor:error error during heartbeat MuxerUnavailableError: Connection is not multiplexed
at Connection.newStream (file:///Users/z/code/github.com/dozyio/js-libp2p-issues/3289/node_modules/libp2p/src/connection.ts:127:13)
at file:///Users/z/code/github.com/dozyio/js-libp2p-issues/3289/node_modules/libp2p/src/connection-monitor.ts:104:39
2025-09-21T09:24:45.611Z libp2p:connection-monitor:error aborting connection due to ping failure
2025-09-21T09:24:45.611Z libp2p:tcp:connection:inbound:auttcg1758446684833:error aborting connection to /ip4/127.0.0.1/tcp/58706/p2p/12D3KooWM67kqGZwfn8VMJCcL8CA7Sftp9mW2hKD8S8A1FpkDD2K due to error MuxerUnavailableError: Connection is not multiplexed
at Connection.newStream (file:///Users/z/code/github.com/dozyio/js-libp2p-issues/3289/node_modules/libp2p/src/connection.ts:127:13)
at file:///Users/z/code/github.com/dozyio/js-libp2p-issues/3289/node_modules/libp2p/src/connection-monitor.ts:104:39
2025-09-21T09:24:45.612Z libp2p:tcp:connection:inbound:auttcg1758446684833 inbound 127.0.0.1:58706 socket abort due to error - MuxerUnavailableError: Connection is not multiplexed
at Connection.newStream (file:///Users/z/code/github.com/dozyio/js-libp2p-issues/3289/node_modules/libp2p/src/connection.ts:127:13)
at file:///Users/z/code/github.com/dozyio/js-libp2p-issues/3289/node_modules/libp2p/src/connection-monitor.ts:104:39
2025-09-21T09:24:45.612Z libp2p:connection-manager:trace peer 12D3KooWM67kqGZwfn8VMJCcL8CA7Sftp9mW2hKD8S8A1FpkDD2K disconnected, removing connection map entry
2025-09-21T09:24:45.614Z libp2p:reconnect-queue:error failed to maybe reconnect to 12D3KooWM67kqGZwfn8VMJCcL8CA7Sftp9mW2hKD8S8A1FpkDD2K - NotFoundError: Not Found
at MemoryDatastore.get (file:///Users/z/code/github.com/dozyio/js-libp2p-issues/3289/node_modules/datastore-core/src/memory.ts:30:13)
at PersistentStore.load (file:///Users/z/code/github.com/dozyio/js-libp2p-issues/3289/node_modules/@libp2p/peer-store/src/store.ts:171:38)
at PersistentPeerStore.get (file:///Users/z/code/github.com/dozyio/js-libp2p-issues/3289/node_modules/@libp2p/peer-store/src/index.ts:114:31)
at ReconnectQueue.maybeReconnect (file:///Users/z/code/github.com/dozyio/js-libp2p-issues/3289/node_modules/libp2p/src/connection-manager/reconnect-queue.ts:67:18)
2025-09-21T09:24:45.614Z libp2p:tcp:connection:outbound:ck7df41758446684832 outbound 127.0.0.1:58695 socket end
2025-09-21T09:24:45.615Z libp2p:tcp:connection:outbound:ck7df41758446684832 outbound 127.0.0.1:58695 socket close
2025-09-21T09:24:45.615Z libp2p:tcp:connection:outbound:ck7df41758446684832 closing connection to /ip4/127.0.0.1/tcp/58695/p2p/12D3KooWBNoGfQv3a1KxHhsmUVzRZQ52gJb3k19sRe8zBMo4FRsg
2025-09-21T09:24:45.615Z libp2p:tcp:connection:outbound:ck7df41758446684832:trace closing underlying transport
2025-09-21T09:24:45.615Z libp2p:tcp:connection:outbound:ck7df41758446684832 the outbound 127.0.0.1:58695 socket is already closed
2025-09-21T09:24:45.615Z libp2p:tcp:connection:outbound:ck7df41758446684832:trace updating timeline with close time
2025-09-21T09:24:45.615Z libp2p:connection-manager:trace peer 12D3KooWBNoGfQv3a1KxHhsmUVzRZQ52gJb3k19sRe8zBMo4FRsg disconnected, removing connection map entry
2025-09-21T09:24:45.616Z libp2p:tcp:connection:inbound:auttcg1758446684833 inbound 127.0.0.1:58706 socket close
Not 100% sure on this, but I think what's happening is that the connection manager is running ping, which fails because it can't open a stream due to the lack of a muxer. Transports without native muxing capability, such as TCP, require a muxer.
If you add streamMuxers: [yamux()] to your config the tests pass.
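For reference, a minimal sketch of the fixed configuration (assuming @chainsafe/libp2p-yamux as the muxer):

```typescript
import { createLibp2p } from 'libp2p'
import { tcp } from '@libp2p/tcp'
import { noise } from '@chainsafe/libp2p-noise'
import { yamux } from '@chainsafe/libp2p-yamux'

const node = await createLibp2p({
  addresses: { listen: ['/ip4/0.0.0.0/tcp/0'] },
  transports: [tcp()],
  connectionEncrypters: [noise()],
  // TCP has no native muxing, so a stream muxer is required
  streamMuxers: [yamux()]
})
```
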
That makes the tests I provided pass!
I have turned off the reconnection logic temporarily for my production use of libp2p and it seems to mitigate the problem, but I am going to check again in the coming 1-2 days whether I can still reproduce the issue. I seem to only be able to make it happen remotely, and only when the reconnection logic is active.
That seemed to fix the tests. Hmm. I am not able to reproduce locally yet, but remotely I can only get redialing to work if I turn off the reconnection behaviour. I will investigate more in the coming 1-2 days.
Here is a log from my server when the client dials it and immediately terminates, which leaves the server in a state where the dial queue is never emptied:
2025-09-22T06:18:07.712Z libp2p:transports adding transport @libp2p/websockets
2025-09-22T06:18:07.714Z libp2p:transports adding transport @libp2p/circuit-relay-v2-transport
2025-09-22T06:18:07.714Z libp2p:transports adding transport @libp2p/webrtc-direct
2025-09-22T06:18:07.715Z libp2p:transports adding transport @libp2p/tcp
2025-09-22T06:18:07.719Z libp2p libp2p is starting
2025-09-22T06:18:07.747Z libp2p:connection-manager started
2025-09-22T06:18:07.774Z libp2p:webrtc-direct loaded certificate, expires in 1759644622000 ms
2025-09-22T06:18:07.784Z libp2p:webrtc-direct loaded certificate, expiry time is 1759644622000
2025-09-22T06:18:07.784Z libp2p:webrtc-direct will renew TLS certificate after 1122734216 ms
2025-09-22T06:18:07.787Z libp2p:circuit-relay:transport:reservation-store removing tag from 0 old relays
2025-09-22T06:18:07.787Z libp2p:circuit-relay:transport:reservation-store redialing 0 old relays
2025-09-22T06:18:07.787Z libp2p:circuit-relay:discover-relays stop discovery
2025-09-22T06:18:07.794Z libp2p:transports creating listener for @libp2p/websockets on /ip4/127.0.0.1/tcp/8002/ws
2025-09-22T06:18:07.797Z libp2p:transports creating listener for @libp2p/webrtc-direct on /ip4/0.0.0.0/udp/4004/webrtc-direct
2025-09-22T06:18:07.797Z libp2p:webrtc-direct:listener starting UDP mux server on 0.0.0.0:4004
2025-09-22T06:18:07.798Z libp2p:transports creating listener for @libp2p/tcp on /ip4/0.0.0.0/tcp/4002
2025-09-22T06:18:07.806Z libp2p:tcp:listener listening on { address: '0.0.0.0', family: 'IPv4', port: 4002 }
2025-09-22T06:18:07.807Z libp2p libp2p has started
API available at port 8082
Starting node with address(es):
id: 12D3KooWLoaLf1m2rTEWhwqVqGAMRTpokJnGiQQQV6aDyzi1XnfZ
Addresses:
/dns4/2b3a91e49a95eba1e01bdbc5b85f038262d302bf.peerchecker.com/tcp/4002/p2p/12D3KooWLoaLf1m2rTEWhwqVqGAMRTpokJnGiQQQV6aDyzi1XnfZ
/dns4/2b3a91e49a95eba1e01bdbc5b85f038262d302bf.peerchecker.com/tcp/4003/wss/p2p/12D3KooWLoaLf1m2rTEWhwqVqGAMRTpokJnGiQQQV6aDyzi1XnfZ
/dns4/2b3a91e49a95eba1e01bdbc5b85f038262d302bf.peerchecker.com/udp/4004/webrtc-direct/certhash/uEiDPHTvbr_1phF3jExW72ln4aajU_tW-3bo5RBpoN1TWpw/p2p/12D3KooWLoaLf1m2rTEWhwqVqGAMRTpokJnGiQQQV6aDyzi1XnfZ
2025-09-22T06:20:13.903Z libp2p:websockets:listener new inbound connection /ip4/127.0.0.1/tcp/38024
2025-09-22T06:20:14.020Z libp2p:websockets:connection:inbound:9lfm391758522013904 encrypting inbound connection to /ip4/127.0.0.1/tcp/38024 using /noise
2025-09-22T06:20:14.277Z libp2p:connection-manager:connection-pruner checking max connections limit 1/Infinity
2025-09-22T06:20:14.277Z libp2p:websockets:connection:inbound:9lfm391758522013904 successfully upgraded inbound connection
2025-09-22T06:20:14.284Z libp2p:websockets:connection:inbound:9lfm391758522013904 incoming /ipfs/id/1.0.0 stream opened
2025-09-22T06:20:14.402Z libp2p:yamux:outbound:2 selected protocol /ipfs/id/1.0.0
2025-09-22T06:20:14.408Z libp2p:address-manager:observed-addresses adding observed address /dns4/2b3a91e49a95eba1e01bdbc5b85f038262d302bf.peerchecker.com/tcp/4003/wss
2025-09-22T06:20:14.408Z libp2p:websockets:connection:inbound:9lfm391758522013904:identify completed for peer 12D3KooWRamR2B1X4Y3ze5PyR3uYFQdJZbWyAyuRfoaQGkdkLG3n and protocols [ '/ipfs/id/1.0.0', '/lazyblock/0.0.0', '/lazysub/0.0.0', '/libp2p/circuit/relay/0.2.0/hop', '/libp2p/circuit/relay/0.2.0/stop' ]
2025-09-22T06:20:14.408Z libp2p:websockets:connection:inbound:9lfm391758522013904:identify received identify from 12D3KooWRamR2B1X4Y3ze5PyR3uYFQdJZbWyAyuRfoaQGkdkLG3n
2025-09-22T06:20:14.416Z libp2p:websockets:connection:inbound:9lfm391758522013904 incoming /libp2p/circuit/relay/0.2.0/hop stream opened
2025-09-22T06:20:14.417Z libp2p:websockets:connection:inbound:9lfm391758522013904 incoming /lazyblock/0.0.0 stream opened
2025-09-22T06:20:14.418Z libp2p:websockets:connection:inbound:9lfm391758522013904 incoming /lazysub/0.0.0 stream opened
2025-09-22T06:20:14.420Z libp2p:circuit-relay:server received circuit v2 hop protocol stream from 12D3KooWRamR2B1X4Y3ze5PyR3uYFQdJZbWyAyuRfoaQGkdkLG3n
2025-09-22T06:20:14.535Z libp2p:circuit-relay:server received RESERVE
2025-09-22T06:20:14.535Z libp2p:circuit-relay:server received hop message
2025-09-22T06:20:14.536Z libp2p:circuit-relay:server hop reserve request from 12D3KooWRamR2B1X4Y3ze5PyR3uYFQdJZbWyAyuRfoaQGkdkLG3n
2025-09-22T06:20:14.536Z libp2p:circuit-relay:server:reservation-store creating new reservation for client 12D3KooWRamR2B1X4Y3ze5PyR3uYFQdJZbWyAyuRfoaQGkdkLG3n
2025-09-22T06:20:14.539Z libp2p:yamux:outbound:4 selected protocol /lazyblock/0.0.0
2025-09-22T06:20:14.543Z libp2p:circuit-relay:server sent confirmation response to 12D3KooWRamR2B1X4Y3ze5PyR3uYFQdJZbWyAyuRfoaQGkdkLG3n
2025-09-22T06:20:14.548Z libp2p:yamux:outbound:6 selected protocol /lazysub/0.0.0
2025-09-22T06:20:14.551Z libp2p:websockets:connection:inbound:9lfm391758522013904 closed locally, code 1005, reason "", wasClean true
2025-09-22T06:20:14.551Z libp2p:websockets:connection:inbound:9lfm391758522013904 closing connection to /ip4/127.0.0.1/tcp/38024/p2p/12D3KooWRamR2B1X4Y3ze5PyR3uYFQdJZbWyAyuRfoaQGkdkLG3n
2025-09-22T06:20:14.553Z libp2p:yamux sending GoAway reason=NormalTermination
2025-09-22T06:20:14.556Z libp2p:connection-manager dial 12D3KooWRamR2B1X4Y3ze5PyR3uYFQdJZbWyAyuRfoaQGkdkLG3n
2025-09-22T06:20:14.556Z libp2p:connection-manager:dial-queue creating dial target for 12D3KooWRamR2B1X4Y3ze5PyR3uYFQdJZbWyAyuRfoaQGkdkLG3n []
2025-09-22T06:20:14.557Z libp2p:connection-manager:dial-queue starting dial to 12D3KooWRamR2B1X4Y3ze5PyR3uYFQdJZbWyAyuRfoaQGkdkLG3n
2025-09-22T06:20:14.557Z libp2p:connection-manager:dial-queue calculating addrs to dial 12D3KooWRamR2B1X4Y3ze5PyR3uYFQdJZbWyAyuRfoaQGkdkLG3n from []
2025-09-22T06:20:14.558Z libp2p:connection-manager:dial-queue loading multiaddrs for 12D3KooWRamR2B1X4Y3ze5PyR3uYFQdJZbWyAyuRfoaQGkdkLG3n
2025-09-22T06:20:14.560Z libp2p:connection-manager:dial-queue loaded multiaddrs for 12D3KooWRamR2B1X4Y3ze5PyR3uYFQdJZbWyAyuRfoaQGkdkLG3n [
'/ip4/127.0.0.1/tcp/53075/ws',
'/ip4/127.0.0.1/tcp/53076',
'/ip4/127.0.0.1/udp/53074/webrtc-direct/certhash/uEiBe929y4QRaV7yA4COoXDaVk1U_eRxu2hOUH8Tj8cMJtA'
]
2025-09-22T06:20:14.564Z libp2p:connection-manager:dial-queue starting dial to 12D3KooWRamR2B1X4Y3ze5PyR3uYFQdJZbWyAyuRfoaQGkdkLG3n with [
'/ip4/127.0.0.1/tcp/53076/p2p/12D3KooWRamR2B1X4Y3ze5PyR3uYFQdJZbWyAyuRfoaQGkdkLG3n',
'/ip4/127.0.0.1/tcp/53075/ws/p2p/12D3KooWRamR2B1X4Y3ze5PyR3uYFQdJZbWyAyuRfoaQGkdkLG3n',
'/ip4/127.0.0.1/udp/53074/webrtc-direct/certhash/uEiBe929y4QRaV7yA4COoXDaVk1U_eRxu2hOUH8Tj8cMJtA/p2p/12D3KooWRamR2B1X4Y3ze5PyR3uYFQdJZbWyAyuRfoaQGkdkLG3n'
]
2025-09-22T06:20:14.565Z libp2p:tcp dialing /ip4/127.0.0.1/tcp/53076/p2p/12D3KooWRamR2B1X4Y3ze5PyR3uYFQdJZbWyAyuRfoaQGkdkLG3n
2025-09-22T06:20:14.568Z libp2p:tcp:error dial to /ip4/127.0.0.1/tcp/53076/p2p/12D3KooWRamR2B1X4Y3ze5PyR3uYFQdJZbWyAyuRfoaQGkdkLG3n errored - Error: connect ECONNREFUSED 127.0.0.1:53076
at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1637:16)
2025-09-22T06:20:14.568Z libp2p:connection-manager:dial-queue:error dial failed to /ip4/127.0.0.1/tcp/53076/p2p/12D3KooWRamR2B1X4Y3ze5PyR3uYFQdJZbWyAyuRfoaQGkdkLG3n Error: connect ECONNREFUSED 127.0.0.1:53076
at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1637:16) {
errno: -111,
code: 'ECONNREFUSED',
syscall: 'connect',
address: '127.0.0.1',
port: 53076
}
2025-09-22T06:20:14.570Z libp2p:websockets dialing /ip4/127.0.0.1/tcp/53075/ws/p2p/12D3KooWRamR2B1X4Y3ze5PyR3uYFQdJZbWyAyuRfoaQGkdkLG3n
2025-09-22T06:20:14.570Z libp2p:websockets dialing 127.0.0.1:53075
2025-09-22T06:20:14.574Z libp2p:websockets:error connection error: ConnectionFailedError: Could not connect to /ip4/127.0.0.1/tcp/53075/ws/p2p/12D3KooWRamR2B1X4Y3ze5PyR3uYFQdJZbWyAyuRfoaQGkdkLG3n
at WebSocket.
One clue: the dialing (local) node will have the addresses
/ip4/127.0.0.1/tcp/60933/ws/p2p/12D3KooWDgMHFPDkwf3tCGZU1kbEnnrSKhXU2ikaNPAsvKusAgBX,/ip4/127.0.0.1/udp/60932/webrtc-direct/certhash/uEiD690K0ZmUuopjxzPb1kL5DNmtzQOICUz-kBIKSheU_zg/p2p/12D3KooWDgMHFPDkwf3tCGZU1kbEnnrSKhXU2ikaNPAsvKusAgBX,/ip4/127.0.0.1/tcp/60934/p2p/12D3KooWDgMHFPDkwf3tCGZU1kbEnnrSKhXU2ikaNPAsvKusAgBX
and then I shut the dialing node down. The server then tries to redial back using these addresses (reconnect queue), but they are obviously not dialable because they are local IPs.
TRANSPORT MANAGER START DIAL /ip4/127.0.0.1/tcp/63940/p2p/12D3KooWCrYyWy34TULgRHH5xozRKU9oF8R5wJbV9s3tZx81U2CA
2025-09-22T06:45:48.012Z libp2p:tcp dialing /ip4/127.0.0.1/tcp/63940/p2p/12D3KooWCrYyWy34TULgRHH5xozRKU9oF8R5wJbV9s3tZx81U2CA
2025-09-22T06:45:48.015Z libp2p:tcp:error dial to /ip4/127.0.0.1/tcp/63940/p2p/12D3KooWCrYyWy34TULgRHH5xozRKU9oF8R5wJbV9s3tZx81U2CA errored - Error: connect ECONNREFUSED 127.0.0.1:63940
    at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1637:16)
2025-09-22T06:45:48.016Z libp2p:connection-manager:dial-queue:error dial failed to /ip4/127.0.0.1/tcp/63940/p2p/12D3KooWCrYyWy34TULgRHH5xozRKU9oF8R5wJbV9s3tZx81U2CA Error: connect ECONNREFUSED 127.0.0.1:63940
    at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1637:16) {
  errno: -111,
  code: 'ECONNREFUSED',
  syscall: 'connect',
  address: '127.0.0.1',
  port: 63940
}
FINISHED MANAGER START DIAL
TRANSPORT MANAGER START DIAL /ip4/127.0.0.1/tcp/63939/ws/p2p/12D3KooWCrYyWy34TULgRHH5xozRKU9oF8R5wJbV9s3tZx81U2CA
2025-09-22T06:45:48.018Z libp2p:websockets dialing /ip4/127.0.0.1/tcp/63939/ws/p2p/12D3KooWCrYyWy34TULgRHH5xozRKU9oF8R5wJbV9s3tZx81U2CA
2025-09-22T06:45:48.018Z libp2p:websockets dialing 127.0.0.1:63939
2025-09-22T06:45:48.023Z libp2p:websockets:error connection error: ConnectionFailedError: Could not connect to /ip4/127.0.0.1/tcp/63939/ws/p2p/12D3KooWCrYyWy34TULgRHH5xozRKU9oF8R5wJbV9s3tZx81U2CA
at WebSocket.
TRANSPORT MANAGER START DIAL /ip4/127.0.0.1/udp/63938/webrtc-direct/certhash/uEiBFkT1bCexuv_uAHIO7OH37EFD21DqUeBISJMl8YFUU7Q/p2p/12D3KooWCrYyWy34TULgRHH5xozRKU9oF8R5wJbV9s3tZx81U2CA
<------------- STUCK -------------->
It seems like some process in webrtc-direct hangs and doesn't respect the abort or timeout signal.
Removing the webrtc-direct transport on the dialing node fixes the problem (as expected), so there is definitely a problem in there.
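A transport dial that ignores its AbortSignal can at least be guarded at the call site with a small race helper. A minimal sketch (raceSignal is an illustrative name; note this abandons the underlying operation rather than truly cancelling it):

```typescript
// Minimal sketch: race a (possibly signal-ignoring) promise against an
// AbortSignal so the caller's await always returns when the signal fires.
async function raceSignal<T> (p: Promise<T>, signal: AbortSignal): Promise<T> {
  if (signal.aborted) {
    throw new Error('Aborted')
  }

  return await new Promise<T>((resolve, reject) => {
    const onAbort = (): void => { reject(new Error('Aborted')) }
    signal.addEventListener('abort', onAbort, { once: true })

    // whichever settles first wins; the listener is cleaned up either way
    p.then(resolve, reject).finally(() => {
      signal.removeEventListener('abort', onAbort)
    })
  })
}
```
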
I think this is a separate issue, as the TCP muxing problem is resolved.
Not entirely sure what you are trying to do... something like this? The client dials the server and disconnects. The server tries to dial the client back but can't, as the client doesn't have a routable address.
If the client is using circuit relay to get a routable address, enabling identifyPush might help
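Enabling identify push could look like the following config sketch (the service key names are conventional; identify and identifyPush are exported by @libp2p/identify):

```typescript
import { createLibp2p } from 'libp2p'
import { identify, identifyPush } from '@libp2p/identify'

const node = await createLibp2p({
  // ...transports, connectionEncrypters, streamMuxers as before...
  services: {
    identify: identify(),
    // pushes updated addresses (e.g. a new relay reservation) to connected peers
    identifyPush: identifyPush()
  }
})
```
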
I did come across a bug where when a client with a relay reservation disconnects, the relay would start trying to dial the client back as the relay server tags the client with 'keep-alive', but it does give up after a while so maybe unrelated. FWIW this bug is fixed in the next release.
> I did come across a bug where when a client with a relay reservation disconnects, the relay would start trying to dial the client back as the relay server tags the client with 'keep-alive', but it does give up after a while so maybe unrelated. FWIW this bug is fixed in the next release.
This sounds very related! Almost exactly like my problem. BUT I only have this problem when the client is also using a webrtc-direct transport, and I get stuck at the point where the relay dials back to the client on its webrtc-direct address.
Another thing I am not sure about: should a client be able to make the server dial back on addresses that are not publicly reachable, like 127.0.0.1? Please let me know if I should explain this further. The dial-back behaviour could perhaps also be problematic if that address points to another server (?). I am not sure whether this can happen here.
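One way to stop the server chasing loopback addresses would be a connection gater that filters what it is willing to dial back. A sketch using the connectionGater.denyDialMultiaddr hook; the loopback check here is deliberately naive and illustrative only:

```typescript
import { createLibp2p } from 'libp2p'
import type { Multiaddr } from '@multiformats/multiaddr'

const node = await createLibp2p({
  // ...transports, connectionEncrypters, streamMuxers as before...
  connectionGater: {
    // return true to deny the dial; naive check for loopback addresses
    denyDialMultiaddr: (ma: Multiaddr): boolean => {
      const addr = ma.toString()
      return addr.startsWith('/ip4/127.') || addr.startsWith('/ip6/::1/')
    }
  }
})
```
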
> FWIW this bug is fixed in the next release.
Great!