js-libp2p icon indicating copy to clipboard operation
js-libp2p copied to clipboard

Stream ending does not emit event

Open D4nte opened this issue 4 years ago • 9 comments

  • Version: 0.31.0
  • Platform: Firefox 88.0 on Linux 5.10.32-1-MANJARO #1 SMP Wed Apr 21 14:54:10 UTC 2021 x86_64 GNU/Linux

Severity: High

Description:

  1. Start libp2p and listen on peer:disconnect and error events,
    libp2p.connectionManager.on('peer:disconnect', (connection) => {
      console.log('Peer disconnected', connection);
    });

    libp2p.on('error', (err) => {
      console.log('libp2p error', err);
    });
  1. Connects to a remote libp2p node using wss.
   libp2p.peerStore.addressBook.add(
      PeerId.createFromB58String(
        '16Uiu2HAmPLe7Mzm8TsYUubgCAW1aJoeFScxrLj8ppHFivPo97bUZ'
      ),
      [multiaddr('/dns4/node-01.do-ams3.jdev.misc.statusim.net/tcp/443/wss')]
    );
  1. The websocket connection is proxied by nginx, nginx default connection timeout is 60s.
  2. After 60s of idle, the stream ends,
libp2p:mplex:stream initiator stream 1 abort +15s undefined common.js:111
libp2p:mplex:stream initiator stream 1 source end +1ms undefined common.js:111
libp2p:mplex:stream initiator stream 1 sink end +2ms undefined common.js:111
libp2p:mplex initiator stream 1 1 ended +5ms common.js:111
libp2p:mplex:stream receiver stream 609346022267e73b4756e6dd abort +2ms undefined common.js:111
libp2p:mplex:stream receiver stream 609346022267e73b4756e6dd source end +0ms undefined common.js:111
libp2p:mplex:stream receiver stream 609346022267e73b4756e6dd sink end +1ms undefined common.js:111
libp2p:mplex receiver stream 1 609346022267e73b4756e6dd ended +2ms
  1. Once the stream is ended, it is not possible to send messages via gossipsub
libp2p:gossipsub publish +10m /waku/2/default-waku/proto 
Uint8Array(36) [ 10, 24, 8, 202, 144, 205, 132, 6, 18, 11, … ]
common.js:111
ibp2p:mplex:stream initiator stream 1 error +9m Error: stream aborted
    AbortError AbortError.js:3
    nextAbortHandler index.js:37
    abortHandler index.js:16
    abort stream.js:67
    onEnd mplex.js:181
    end index.js:124
    _createSink mplex.js:166
    duplexPipelineFn index.js:18
    rawPipe index.js:4
    pipe index.js:49
    _createConnection upgrader.js:268
    upgradeOutbound upgrader.js:192
    dial index.js:54
    dial transport-manager.js:111
    dialAction index.js:191
    run dial-request.js:71
    run dial-request.js:66
    _createPendingDial index.js:209
    connectToPeer index.js:115
    _maybeConnect index.js:701
    _onDidStart index.js:649
    emit events.js:153
    add address-book.js:264
    initWaku App.tsx:165
    App App.tsx:88
    React 6
    unstable_runWithPriority scheduler.development.js:653
    React 4
    unstable_runWithPriority scheduler.development.js:653
    React 6
    tsx index.tsx:6
    tsx main.chunk.js:5157
    Webpack 7
common.js:111
libp2p:mplex initiator stream 1 1 send +9m 
Object { id: 1, type: "RESET_INITIATOR", data: undefined }

Expected: an event is emitted on peer:disconnect so that I can handle the stream closing. Actual: No error or event emitted on stream closure.

Steps to reproduce the error:

I assume I am missing something here so please let me know what is the expected usage of the API. If my assumption of the expected behaviur is correct, let me know and I can try to write a minimal working example.

Or you can play with the project, it's the web-chat example of https://github.com/status-im/js-waku/

D4nte avatar May 06 '21 01:05 D4nte

Please note that it seems that pubsub is not aware that the stream is closed. Indeed, when a stream close, the peer should be removed from pubsub.peers as per

https://github.com/libp2p/js-libp2p-interfaces/blob/13f6ed6b3ecf9db98f82728359c529d7def61298/src/pubsub/index.js#L306

However, from local debugging + the fact that the stream aborted error is thrown a we can deduce the peer was not yet removed at

https://github.com/ChainSafe/js-libp2p-gossipsub/blob/966bd4724b7bee794c9be3428dcaf183193490c6/ts/index.ts#L1138

D4nte avatar May 06 '21 02:05 D4nte

Hey @D4nte

Thanks for creating this issue. It would be great to have a test case for this in gossipsub.

Starting by the expectations here. Two peers establish a libp2p connection, which will be used to create streams for given libp2p protocols (like identify, pubsub, ...). In other words, a connection will use a stream muxer to have multiple streams open within a connection. As a result, emitting a peer:disconnect event is a consequence of the raw connection to be closed and not any of the streams within the connection. For checking you still have the connection, you can get the libp2p connections from the ConnectionManager and inspect this connection current state.

Zooming in into pubsub, we have 2 libp2p streams within pubsub context per connection. One will always be open to receive pubsub messages from other pubsub peers connected and one will be open to send pubsub messages and closed after it. This means, every time we want to send a message we open the write stream, send the message and close the stream.

So, considering that the read stream is still open, it is expected that pubsub.peers will still contain the peer. Regarding the actual attempt to write to a closed stream, it can definitely be a bug and/or be related with what we are looking into https://github.com/libp2p/js-libp2p-mplex/pull/122 to guarantee we do not try to write to closed write streams. If you can create a test scenario with this, would be super helpful

vasco-santos avatar May 06 '21 08:05 vasco-santos

Hi @vasco-santos,

Thank you for the information. I am not quite sure how I can reproduce in test as the stream being aborted comes from nginx timeout, when used as a proxy.

I could play around with an http server and see if I can simulate nginx's behaviour. However, as we can see in https://github.com/libp2p/js-libp2p/pull/930 there seem to be some issues passing a http server to websockify transport to then manipulate the timeout parameters.

D4nte avatar May 10 '21 23:05 D4nte

Ok, so the problem here is not the stream end event, but the stream being aborted after the timeout. Unfortunately, we do not have the keep alive functionality in place yet for the libp2p connections. Being tracked in https://github.com/libp2p/js-libp2p/issues/744

Regarding the server, I will answer on #930 with initial findings

vasco-santos avatar May 11 '21 09:05 vasco-santos

@vasco-santos : should this issue be kept open, or will the work be tackled in #744 and #930?

BigLep avatar May 14 '21 23:05 BigLep

Closing this as the initial question of not emitting close on a closedStream is answered. @D4nte if you have further problems in this, or if you think the work planned in #930 to keep the connection alive will not solve the issue, please reopen the issue or create a new one

vasco-santos avatar May 17 '21 09:05 vasco-santos

I have been spending some time trying to recreate in test environment. It looks like the streams are closed after exactly 10min of idle (TCP timeout?). This is something I am getting systematically in a browser environment over ws but could not reproduce in test.

Something I do not understand is that:

one will be open to send pubsub messages and closed after it.

When I look at publish -> _sendRpc : https://github.com/ChainSafe/js-libp2p-gossipsub/blob/df69a909db2a770f5a7d054a898bab72020946df/ts/index.ts#L1137

I can see that it tries to get existing streams, it does not seem to attempt to open to send the message and then close it. What am I missing?

D4nte avatar Jul 22 '21 06:07 D4nte

Yes, I was thinking on #744 issue, apologies. More specifically, keeping connections alive.

Let me know if you get a reproducible test though

vasco-santos avatar Jul 26 '21 14:07 vasco-santos

I wonder if https://github.com/libp2p/js-libp2p/pull/1221 helps with this issue.

D4nte avatar May 30 '22 07:05 D4nte

@D4nte did you ever manage to create a reproduction for this issue?

achingbrain avatar Dec 13 '22 16:12 achingbrain

Oops, seems like we needed more information for this issue, please comment with more details or this issue will be closed in 7 days.

github-actions[bot] avatar Dec 20 '22 00:12 github-actions[bot]

This issue was closed because it is missing author input.

github-actions[bot] avatar Dec 28 '22 00:12 github-actions[bot]