go-libp2p-pubsub icon indicating copy to clipboard operation
go-libp2p-pubsub copied to clipboard

with gossipsub subscribers may not receive the first couple of messages

Open hugomrdias opened this issue 4 years ago • 18 comments

While trying to sync js repos with go-ipfs 0.5.x i notice pubsub tests were failing for js-ipfs-http-client the specific test works like this:

  1. peer1 swarm connects to peer2
  2. peer1 and peer2 subscribe to topicA
  3. peer2 runs ipfs.pubsub.peers() multiple times until it sees peer1
  4. peer2 publishes a msg to the topic
  5. test asserts that both peers receive the msg

So this test started to fail with go-ipfs 0.5 talking to @aschmahmann we discovered that this behaviour is due to 0.5 defaulting to gossipsub because with floodsub the test passes.

To make the test pass with gossipsub i needed to add a ~2s (may work with less just a random number) delay between step 3 and 4.

Can we remove the need for that delay and make gossipsub work like floodsub ?

issue https://github.com/ipfs/js-ipfs/issues/3030 pr https://github.com/ipfs/js-ipfs/pull/3013

hugomrdias avatar May 12 '20 15:05 hugomrdias

This is probably because "peers" is all peers we know that speak the topic while we really care about "grafted peers".

There's a new publish option WithReadyness(MinTopicSize(1)) that would allow us to require at least one peer before we let the publish succeed. However, that will stall if we're offline.

Stebalien avatar May 12 '20 23:05 Stebalien

Yes, it will take a heartbeat for the new peer to get grafted, which is by default 1s.

vyzo avatar May 19 '20 16:05 vyzo

Is changing pubsub.peers() to grafted an option?

hugomrdias avatar May 19 '20 19:05 hugomrdias

Well, ListPeers returns all the known peers that speak the same pubsub protocol(s) as us. It would be a breaking change in behavior to only return peers in the mesh -- perhaps we can add a new method that has the desired behavior.

vyzo avatar May 19 '20 19:05 vyzo

@vyzo that sounds great! @Stebalien is this something we can bubble up to go-ipfs ? maybe in a arg like ipfs pubsub peers --grafted

hugomrdias avatar May 21 '20 09:05 hugomrdias

Yes, it will take a heartbeat for the new peer to get grafted, which is by default 1s.

We shouldn't need to wait.

Stebalien avatar May 22 '20 02:05 Stebalien

This needs to be fixed the right way: https://github.com/ipfs/go-ipfs/issues/7350

Stebalien avatar May 22 '20 02:05 Stebalien

where can i follow this issue? the one you mentioned is already closed does that mean its already in master ?

hugomrdias avatar May 22 '20 06:05 hugomrdias

Actually, the delay has been fixed in v0.3.0: https://github.com/libp2p/go-libp2p-pubsub/blob/master/gossipsub.go#L952

So when we join, we immediately graft some peers.

vyzo avatar May 22 '20 17:05 vyzo

@hugomrdias https://github.com/ipfs/go-ipfs/issues/6621

Stebalien avatar May 22 '20 17:05 Stebalien

Actually, the delay has been fixed in v0.3.0:

This is biting us in the interop tests too, where the delay needs to be longer than 2s, more like ~10s~ ~20s~. Actually who knows, it seems to vary wildly.

As it's fixed in go-libp2p-pubsub master will the fix ship in go-IPFS 0.6.x?

achingbrain avatar May 29 '20 09:05 achingbrain

It looks like these interop tests are failing with go-IPFS master, even though the build step says it succeeded: https://app.circleci.com/pipelines/github/ipfs/go-ipfs/2908/workflows/0b7a5ebf-cb6c-4960-b19e-fd4c1b61983c/jobs/34278

achingbrain avatar May 29 '20 09:05 achingbrain

Would it please be possible to have some way of getting the list of peers who are likely to receive a pubsub message for a given topic? That would unblock us and fix the interop tests which can then be made to fail the go-ipfs build if compatibility is broken again.

Getting a list of peers for a topic when those peers may not receive a message sent to the topic because they aren't grafted is not very useful.

achingbrain avatar May 29 '20 16:05 achingbrain

Actually, if you enable flood publishing in gsv1.1 (WithFloodPublish option), all the direct peers and not just the ones in the mesh will receive any messages you publish (modulo dropped messages of course).

We can still add a ListMeshPeers (or some other better name) interface if you can't or don't want to enable flood publishing.

vyzo avatar May 29 '20 17:05 vyzo

ipfs pubsub peers ... given a topic, will list connected peers who are subscribed to the named topic.

I think if you provide a method that claims to return the peers subscribed to a topic, then send a message to that topic and those peers do not receive the message, the user is likely to encounter significant surprise.

achingbrain avatar May 29 '20 18:05 achingbrain

Yes, that's true -- we'll have to update if we add a different interface. But I really do recommend enabling flood publishing, it's a security feature of the protocol.

vyzo avatar May 29 '20 19:05 vyzo

Flood publishing for the first hop sounds reasonable.

Stebalien avatar May 29 '20 22:05 Stebalien

Flood publishing for the first hop sounds reasonable.

https://github.com/ipfs/go-ipfs/pull/7394

https://github.com/libp2p/go-libp2p-pubsub/issues/331#issuecomment-635884194

:roll_eyes: https://github.com/ipfs/go-ipfs/pull/7395

Stebalien avatar May 29 '20 23:05 Stebalien