go-libp2p-pubsub icon indicating copy to clipboard operation
go-libp2p-pubsub copied to clipboard

Go IDONTWANT

Open ppopth opened this issue 1 year ago • 10 comments
trafficstars

Sending IDONTWANT

  • [x] Implement a smart queue
  • [x] Add priorities to the smart queue
  • [x] Put IDONTWANT packets into the smart priority queue as soon as the node gets the packets

Handling IDONTWANT

  • [x] Use a map to remember the message ids whose IDONTWANT packets have been received
  • [x] Implement max_idontwant_messages (ignore the IDONWANT packets if the max is reached)
  • [x] Clear the message IDs from the cache after 3 heartbeats
  • [x] Hash the message IDs before putting them into the cache.

Spec: https://github.com/libp2p/specs/pull/548

ppopth avatar Jan 15 '24 18:01 ppopth

However, I think using a slice to back the queue is problematic at many levels as it might leak space with a growing array, and also hang on to too much space.

The unused space will be freed eventually https://stackoverflow.com/a/26863706

ppopth avatar Mar 02 '24 06:03 ppopth

I'm not really sure if we need dfd81e80848e5ce4e64fbf30cfdaafb4dc56b9d3 or not. I guess adding a non-standard flag to pb/trace.proto would break some users of the tracers?

ppopth avatar Mar 05 '24 09:03 ppopth

I'm not really sure if we need dfd81e8 or not. I guess adding a non-standard flag to pb/trace.proto would break some users of the tracers?

Yeah, let's not break compatibility with existing code, we don't need it.

Any progress other than that?

vyzo avatar Mar 22 '24 08:03 vyzo

Any progress other than that?

I haven't implemented the second half. Does the first half looks good to you?

ppopth avatar Apr 08 '24 09:04 ppopth

yes, looks reasonable so far.

vyzo avatar Apr 08 '24 09:04 vyzo

@vyzo I finished everything already. Sorry for the delay. I was in vacation and was a bit busy lately. Thank you for your patience.

ppopth avatar May 01 '24 18:05 ppopth

The message ids from IDONTWANT can be very large, so hashing it before keeping it in memory quite makes sense. The same thing is already implemented in nim-libp2p https://github.com/vacp2p/nim-libp2p/pull/1090

ppopth avatar May 03 '24 17:05 ppopth

sure, but it needs to be cleared, otherwise its memroy leak and dos waiting to happen. Or at best the feature breaks itself for long running nodes.

I suggest for each IDW keep a counter of how many heartbeats it survived, and after 3 (to match the IHAVE ads) it should be cleared.

vyzo avatar May 03 '24 17:05 vyzo

@vyzo It's all done. Sorry for the delay.

ppopth avatar Jun 11 '24 10:06 ppopth

ok, can you rebase on master?

vyzo avatar Jun 14 '24 16:06 vyzo

@vyzo wondering if this PR ready to be merged or are you waiting for anything to be fixed/addressed?

This seems to improve bandwidth usage and we want to take advantage of that.

chaitanyaprem avatar Aug 06 '24 15:08 chaitanyaprem

I think in general it looks good, but i do want to do another pass.

In the meantime:

* can you bump the version to 1.2

* can you add a feature test to not send IDONTWANT if the other side doesnt support it?

I will also invoke @Stebalien for review.

Haven't looked at the PR in detail, but will give it a try tomorrow.

chaitanyaprem avatar Aug 06 '24 15:08 chaitanyaprem

seems like we have some flaky tests :(

vyzo avatar Aug 14 '24 11:08 vyzo

@ppopth can you look at the go-check CI failure? Looks like the linter is complaining about some things.

vyzo avatar Aug 14 '24 11:08 vyzo

@vyzo could you rerun the CI? thanks

ppopth avatar Aug 14 '24 13:08 ppopth

done

vyzo avatar Aug 14 '24 13:08 vyzo

@vyzo The CI error log is not informative. Do you have any idea?

...
--- PASS: TestTopicPublishWithKeyInvalidParameters (0.08s)
    --- PASS: TestTopicPublishWithKeyInvalidParameters/nil_sign_private_key_should_error (0.00s)
    --- PASS: TestTopicPublishWithKeyInvalidParameters/empty_peer_ID_should_error (0.00s)
=== RUN   FuzzAppendOrMergeRPC
--- PASS: FuzzAppendOrMergeRPC (0.00s)
FAIL
FAIL	github.com/libp2p/go-libp2p-pubsub	385.852s
-test.shuffle 1723651417916320350
=== RUN   TestLastSeenCacheSlideForward
    last_seen_cache_test.go:36: timing is too fine grained to run in CI
--- SKIP: TestLastSeenCacheSlideForward (0.00s)
=== RUN   TestFirstSeenCacheNotFoundAfterExpire
--- PASS: TestFirstSeenCacheNotFoundAfterExpire (2.00s)
=== RUN   TestFirstSeenCacheFound
--- PASS: TestFirstSeenCacheFound (0.00s)
=== RUN   TestLastSeenCacheExpire
--- PASS: TestLastSeenCacheExpire (3.10s)
=== RUN   TestFirstSeenCacheExpire
--- PASS: TestFirstSeenCacheExpire (3.00s)
=== RUN   TestLastSeenCacheFound
--- PASS: TestLastSeenCacheFound (0.00s)
=== RUN   TestLastSeenCacheNotFoundAfterExpire
--- PASS: TestLastSeenCacheNotFoundAfterExpire (2.00s)
PASS
ok  	github.com/libp2p/go-libp2p-pubsub/timecache	10.111s
FAIL

ppopth avatar Aug 14 '24 16:08 ppopth

I think these are just flaky tests, i dont think your pr is breaking them.

vyzo avatar Aug 14 '24 17:08 vyzo

Regarding the test failures:

  • TestGossipsubConnTagMessageDeliveries this seems consistently flaky; let's add a t.Skip and open follow up issue to fix it.
  • FuzzAppendOrMergeRPC @ppopth this might need to be looked upon; if it is a flake, we can skip as well and open follow up issue, but might be a problem (with the test perhaps).

vyzo avatar Aug 15 '24 09:08 vyzo

@vyzo I didn't get any error with FuzzAppendOrMergeRPC, did you?

$ go test -v -run FuzzAppendOrMergeRPC
=== RUN   FuzzAppendOrMergeRPC
--- PASS: FuzzAppendOrMergeRPC (0.00s)
PASS
ok  	github.com/libp2p/go-libp2p-pubsub	0.005s

ppopth avatar Aug 15 '24 14:08 ppopth

thats a flake then. Ok, disable and open follow up issue!

vyzo avatar Aug 15 '24 15:08 vyzo

done

ppopth avatar Aug 15 '24 15:08 ppopth

CI failures - seems like a faulty commit.

vyzo avatar Aug 15 '24 16:08 vyzo

@ppopth let's reenable the Fuzz test, it seems it is fine: see #574

vyzo avatar Aug 15 '24 16:08 vyzo

@Stebalien @vyzo I think it's all done

ppopth avatar Aug 16 '24 06:08 ppopth

running the CI, will merge once it greens.

vyzo avatar Aug 16 '24 15:08 vyzo

should I squash the fix-up commits?

ppopth avatar Aug 16 '24 15:08 ppopth

should I squash the fix-up commits?

no need, I will squash merge anyway

vyzo avatar Aug 16 '24 15:08 vyzo