status-go icon indicating copy to clipboard operation
status-go copied to clipboard

retry sending specific messages

Open qfrank opened this issue 1 year ago • 1 comments

fix mobile issue

status: WIP

qfrank avatar Mar 22 '24 10:03 qfrank

Jenkins Builds

Click to see older builds (84)
:grey_question: Commit :hash: Finished (UTC) Duration Platform Result
:heavy_check_mark: 6c35a8f0 #1 2024-03-22 11:01:33 ~3 min linux :package:zip
:heavy_check_mark: 6c35a8f0 #1 2024-03-22 11:02:48 ~5 min ios :package:zip
:heavy_check_mark: 6c35a8f0 #1 2024-03-22 11:03:22 ~5 min android :package:aar
:heavy_multiplication_x: 6c35a8f0 #1 2024-03-22 11:18:22 ~20 min tests :page_facing_up:log
:heavy_check_mark: 0d167d2b #2 2024-03-22 13:41:20 ~2 min android :package:aar
:heavy_multiplication_x: 0d167d2b #2 2024-03-22 13:42:00 ~2 min tests :page_facing_up:log
:heavy_check_mark: 0d167d2b #2 2024-03-22 13:42:27 ~3 min linux :package:zip
:heavy_check_mark: 0d167d2b #2 2024-03-22 13:45:20 ~6 min ios :package:zip
:x: edac513b #3 2024-03-28 13:54:47 ~24 sec ios :page_facing_up:log
:x: edac513b #3 2024-03-28 13:56:32 ~2 min android :page_facing_up:log
:x: edac513b #3 2024-03-28 13:56:52 ~2 min linux :page_facing_up:log
:heavy_multiplication_x: edac513b #3 2024-03-28 13:58:32 ~4 min tests :page_facing_up:log
:heavy_check_mark: d598e49e #4 2024-03-28 14:14:36 ~3 min linux :package:zip
:heavy_check_mark: d598e49e #4 2024-03-28 14:14:44 ~3 min ios :package:zip
:heavy_check_mark: d598e49e #4 2024-03-28 14:16:53 ~5 min android :package:aar
:heavy_multiplication_x: d598e49e #4 2024-03-28 14:49:11 ~38 min tests :page_facing_up:log
:heavy_multiplication_x: d598e49e #5 2024-03-28 15:49:43 ~36 min tests :page_facing_up:log
:heavy_check_mark: 6baff870 #5 2024-03-30 09:44:42 ~2 min linux :package:zip
:heavy_check_mark: 6baff870 #5 2024-03-30 09:46:22 ~3 min ios :package:zip
:heavy_check_mark: 6baff870 #5 2024-03-30 09:46:53 ~4 min android :package:aar
:heavy_multiplication_x: 6baff870 #6 2024-03-30 10:16:41 ~34 min tests :page_facing_up:log
:heavy_multiplication_x: 2fbae936 #7 2024-03-31 02:45:26 ~51 sec tests :page_facing_up:log
:heavy_check_mark: 2fbae936 #6 2024-03-31 02:46:15 ~1 min linux :package:zip
:heavy_check_mark: 2fbae936 #6 2024-03-31 02:47:18 ~2 min android :package:aar
:heavy_check_mark: 2fbae936 #6 2024-03-31 02:48:04 ~3 min ios :package:zip
:heavy_check_mark: c884f3c7 #7 2024-03-31 02:59:18 ~1 min linux :package:zip
:heavy_check_mark: c884f3c7 #7 2024-03-31 02:59:55 ~2 min android :package:aar
:heavy_check_mark: c884f3c7 #7 2024-03-31 03:00:53 ~3 min ios :package:zip
:heavy_check_mark: c884f3c7 #8 2024-03-31 03:38:21 ~40 min tests :page_facing_up:log
:heavy_check_mark: 421c58bf #8 2024-04-01 02:00:32 ~1 min linux :package:zip
:heavy_check_mark: 421c58bf #8 2024-04-01 02:01:29 ~2 min android :package:aar
:heavy_check_mark: 421c58bf #8 2024-04-01 02:02:56 ~3 min ios :package:zip
:heavy_check_mark: 421c58bf #9 2024-04-01 02:39:20 ~40 min tests :page_facing_up:log
:heavy_multiplication_x: 74ee7ef0 #10 2024-04-01 04:05:52 ~1 min tests :page_facing_up:log
:heavy_check_mark: 74ee7ef0 #9 2024-04-01 04:06:35 ~2 min android :package:aar
:heavy_check_mark: 74ee7ef0 #9 2024-04-01 04:06:37 ~2 min linux :package:zip
:heavy_check_mark: 74ee7ef0 #9 2024-04-01 04:07:43 ~3 min ios :package:zip
:heavy_check_mark: d234dd0e #10 2024-04-01 04:16:57 ~1 min android :package:aar
:heavy_check_mark: d234dd0e #10 2024-04-01 04:17:18 ~2 min linux :package:zip
:heavy_check_mark: d234dd0e #10 2024-04-01 04:18:07 ~2 min ios :package:zip
:heavy_check_mark: d234dd0e #11 2024-04-01 04:54:14 ~39 min tests :page_facing_up:log
:heavy_check_mark: f43babdd #11 2024-04-01 08:24:29 ~1 min linux :package:zip
:heavy_check_mark: f43babdd #11 2024-04-01 08:24:40 ~1 min android :package:aar
:heavy_check_mark: f43babdd #11 2024-04-01 08:26:23 ~3 min ios :package:zip
:heavy_check_mark: f43babdd #12 2024-04-01 09:03:06 ~40 min tests :page_facing_up:log
:heavy_multiplication_x: b17aa495 #13 2024-04-06 02:21:41 ~1 min tests :page_facing_up:log
:heavy_check_mark: b17aa495 #12 2024-04-06 02:22:01 ~1 min linux :package:zip
:heavy_check_mark: b17aa495 #12 2024-04-06 02:22:44 ~2 min android :package:aar
:heavy_check_mark: b17aa495 #12 2024-04-06 02:23:49 ~3 min ios :package:zip
:heavy_multiplication_x: 17ea9ed7 #14 2024-04-06 02:24:21 ~56 sec tests :page_facing_up:log
:heavy_check_mark: 17ea9ed7 #13 2024-04-06 02:24:56 ~1 min linux :package:zip
:heavy_check_mark: 17ea9ed7 #13 2024-04-06 02:25:31 ~2 min android :package:aar
:heavy_check_mark: 17ea9ed7 #13 2024-04-06 02:27:22 ~3 min ios :package:zip
:heavy_check_mark: 38650027 #14 2024-04-06 02:32:20 ~1 min linux :package:zip
:heavy_check_mark: 38650027 #14 2024-04-06 02:32:34 ~1 min android :package:aar
:heavy_check_mark: 38650027 #14 2024-04-06 02:33:51 ~2 min ios :package:zip
:heavy_check_mark: 38650027 #15 2024-04-06 03:11:05 ~40 min tests :page_facing_up:log
:heavy_check_mark: d4876736 #15 2024-04-06 23:48:52 ~1 min linux :package:zip
:heavy_check_mark: d4876736 #15 2024-04-06 23:49:11 ~2 min android :package:aar
:heavy_check_mark: d4876736 #15 2024-04-06 23:50:40 ~3 min ios :package:zip
:heavy_check_mark: d4876736 #16 2024-04-07 00:27:28 ~40 min tests :page_facing_up:log
:heavy_check_mark: 3672759b #16 2024-04-08 13:01:06 ~1 min linux :package:zip
:heavy_check_mark: 3672759b #16 2024-04-08 13:03:49 ~4 min android :package:aar
:heavy_check_mark: 3672759b #17 2024-04-08 13:40:26 ~41 min tests :page_facing_up:log
:heavy_check_mark: a238ba30 #17 2024-04-09 12:22:43 ~3 min linux :package:zip
:heavy_check_mark: a238ba30 #17 2024-04-09 12:24:40 ~5 min android :package:aar
:heavy_multiplication_x: a238ba30 #18 2024-04-09 12:25:07 ~6 min tests :page_facing_up:log
:heavy_check_mark: a238ba30 #19 2024-04-09 13:12:11 ~40 min tests :page_facing_up:log
:heavy_check_mark: 03a1800dfadabdd9c8b17b89638106c6b86b25ca #18 2024-04-10 01:59:18 ~3 min linux :package:zip
:heavy_check_mark: 03a1800dfadabdd9c8b17b89638106c6b86b25ca #18 2024-04-10 02:00:37 ~5 min ios :package:zip
:heavy_check_mark: 03a1800dfadabdd9c8b17b89638106c6b86b25ca #18 2024-04-10 02:00:54 ~5 min android :package:aar
:heavy_multiplication_x: 03a1800dfadabdd9c8b17b89638106c6b86b25ca #20 2024-04-10 02:00:56 ~5 min tests :page_facing_up:log
:heavy_check_mark: a3071bd176bc17c6960c44ffe8bd7d7160014904 #19 2024-04-10 02:20:34 ~1 min android :package:aar
:heavy_check_mark: a3071bd176bc17c6960c44ffe8bd7d7160014904 #19 2024-04-10 02:21:03 ~2 min linux :package:zip
:heavy_check_mark: a3071bd176bc17c6960c44ffe8bd7d7160014904 #19 2024-04-10 02:22:03 ~3 min ios :package:zip
:heavy_check_mark: a3071bd176bc17c6960c44ffe8bd7d7160014904 #21 2024-04-10 02:58:28 ~39 min tests :page_facing_up:log
:heavy_multiplication_x: fa3a38a4af4f8cb92e60140188cecadb91adf647 #22 2024-04-15 08:35:11 ~58 sec tests :page_facing_up:log
:heavy_check_mark: fa3a38a4af4f8cb92e60140188cecadb91adf647 #20 2024-04-15 08:38:12 ~4 min ios :package:zip
:heavy_check_mark: fa3a38a4af4f8cb92e60140188cecadb91adf647 #20 2024-04-15 08:38:54 ~4 min linux :package:zip
:heavy_check_mark: fa3a38a4af4f8cb92e60140188cecadb91adf647 #20 2024-04-15 08:40:09 ~6 min android :package:aar
:heavy_check_mark: 694e4a290db04cfac6746d4fc327d6207aefbd5a #21 2024-04-15 08:57:09 ~4 min linux :package:zip
:heavy_check_mark: 694e4a290db04cfac6746d4fc327d6207aefbd5a #21 2024-04-15 08:57:39 ~4 min ios :package:zip
:heavy_check_mark: 694e4a290db04cfac6746d4fc327d6207aefbd5a #21 2024-04-15 08:57:50 ~4 min android :package:aar
:heavy_check_mark: 694e4a290db04cfac6746d4fc327d6207aefbd5a #23 2024-04-15 09:41:10 ~47 min tests :page_facing_up:log
:grey_question: Commit :hash: Finished (UTC) Duration Platform Result
:heavy_check_mark: ca414a1be999fd012017a261be095299f8e10799 #22 2024-04-30 07:08:37 ~3 min linux :package:zip
:heavy_check_mark: ca414a1be999fd012017a261be095299f8e10799 #22 2024-04-30 07:09:58 ~5 min ios :package:zip
:heavy_check_mark: ca414a1be999fd012017a261be095299f8e10799 #22 2024-04-30 07:10:35 ~5 min android :package:aar
:heavy_check_mark: ca414a1be999fd012017a261be095299f8e10799 #24 2024-04-30 07:47:39 ~42 min tests :page_facing_up:log
:heavy_check_mark: afbb823b7f6373d7613252c1313bb066c79c1f3c #23 2024-05-01 00:44:07 ~2 min linux :package:zip
:heavy_check_mark: afbb823b7f6373d7613252c1313bb066c79c1f3c #23 2024-05-01 00:44:22 ~2 min android :package:aar
:heavy_check_mark: afbb823b7f6373d7613252c1313bb066c79c1f3c #23 2024-05-01 00:45:31 ~3 min ios :package:zip
:heavy_check_mark: afbb823b7f6373d7613252c1313bb066c79c1f3c #25 2024-05-01 01:24:21 ~42 min tests :page_facing_up:log

status-im-auto avatar Mar 22 '24 11:03 status-im-auto

thanks to @osmaczko , leave chat history with @osmaczko here as reference later cc @cammellos :

Hi Patryk, maybe you know something about this:

I'm struggling with SendMessageToControlNode , it will use SendCommunityMessage or SendPrivate depend on !community.PublicKey().Equal(community.ControlNode()) , we have 4 message types:

ApplicationMetadataMessage_COMMUNITY_REQUEST_TO_JOIN
ApplicationMetadataMessage_COMMUNITY_EDIT_SHARED_ADDRESSES
ApplicationMetadataMessage_COMMUNITY_CANCEL_REQUEST_TO_JOIN
ApplicationMetadataMessage_COMMUNITY_REQUEST_TO_LEAVE

which use SendMessageToControlNode , if the user is ControlNode, are we allowing ControlNode to send these 4 type of messages? should not allow IMO, so we can just use SendPrivate directly rather than SendMessageToControlNode, because the user use SendPrivate is not ControlNode ?

Relate PR: https://github.com/status-im/status-go/pull/4969

Relate code/comment:

https://github.com/status-im/status-go/blob/74ee7ef086a2026e20c2436f27ef2973d1a6d599/protocol/messenger_communities.go#L1167

https://github.com/status-im/status-go/blob/74ee7ef086a2026e20c2436f27ef2973d1a6d599/protocol/messenger_communities.go#L1173

https://github.com/status-im/status-go/blob/74ee7ef086a2026e20c2436f27ef2973d1a6d599/protocol/messenger_communities.go#L1190

https://github.com/status-im/status-go/blob/74ee7ef086a2026e20c2436f27ef2973d1a6d599/protocol/messenger_communities.go#L1344

https://github.com/status-im/status-go/blob/74ee7ef086a2026e20c2436f27ef2973d1a6d599/protocol/messenger_communities.go#L1510

Hey Frank,

if the user is ControlNode, are we allowing ControlNode to send these 4 type of messages? I don't think ControlNode should ever send these types of messages, as they are to be triggered by users. I don't think there is an explicit disallowance in status-go, but there is no way in the client to send a request to join if one is a control node already.

so we can just use SendPrivate directly rather than SendMessageToControlNode For communities without owner token minted, all messages should be sent to community pubkey rather than to certain user. That's why we have SendMessageToControlNode helper, it determines, whether to send to the user (onwer) directly or to community id.

About the related comment: "TODO(frank): need support resending following message? if yes then we need to ensure SendMessageToControlNode will invoke SendPrivate internally"

If there are privileged members in the community, this should always be true: if !community.PublicKey().Equal(community.ControlNode()) {. That's because only holder of owner-token can nominate privileged members.

Btw. Why do we need to ensure SendMessageToControlNode will invoke SendPrivate?

Hi Patryk , thank you for your explaination!

Why do we need to ensure SendMessageToControlNode will invoke SendPrivate?

Because when i was trying to implement the resending logical, considering what the ResendMethod of a RawMessage is the key, we already have SendPrivate and SendCommunityMessage and i want to reuse these 2 functions, if SendMessageToControlNode is equal to SendPrivate , implementation would be much simpler.

Let's take following code as an example to discuss, https://github.com/status-im/status-go/blob/74ee7ef086a2026e20c2436f27ef2973d1a6d599/protocol/messenger_communities.go#L1167 we can see that it used SendPrivate to send the raw message, and it reused the same variable rawMessage as param which means they will have the same value on rawMessage.ID, we have a field named Recipients for RawMessage which is used to determine who is the receiver(could be multi) of the raw message. If the same raw message always use SendPrivate to send, it's okay, but.. now think about this, we have a raw message that ID is 0x1, it could be sent with SendPrivate and SendCommunityMessage at the same time because SendMessageToControlNode could use SendPrivate or SendCommunityMessage! What are we going to do with such case?

we have a raw message that ID is 0x1, it could be sent with SendPrivate and SendCommunityMessage at the same time because SendMessageToControlNode could use SendPrivate or SendCommunityMessage! What are we going to do with such case?

What I tried to say last time, is that, it should not happen with the current flow. If SendMessageToControlNode uses SendCommunityMessage then privilegedMembers should be empty. In other words, if we use SendCommunityMessage, then sendPrivate is not used. If it is, then this is a bug most likely.

Let me explain more in deep: Community is simply a keypair. When we create it, all messages to Community are sent through SendCommunityMessage, where recipient is Community pubkey. At that point, the sendPrivate should not be used. This is for historical reasons, because before, we couldn't deterministically obtain the owner pubkey of the community, so all users sent requests to join to Community pubkey, rather than to certain member. Also the transfer of ownership was different, we was able to re-import community by using another account.

Right now, with owner-token concept, the owner of the community is the one who holds the token and we have a mechanism to deterministically obtain the chat key of that owner. Also, the private key of the community is gone once the owner-token is minted. That's why we sendPrivate directly to owners in such communities.

Since we dropped import by private key, we could, in theory, use sendPrivate in communities without owner-token minted and send messages to the user rather than Community pubkey. Although, I wouldn't go that direction, because it will most likely break a lot of things on the way.

qfrank avatar Apr 06 '24 02:04 qfrank

You may noticed messenger_raw_message_resend_test.go is placed in package api, this looks like a mess, however, if i moved it into package protocol, it would trigger import cycle issue (it's about the usage of backend and messenger). I prefer fix such mess in a separate PR, WDYT?

qfrank avatar Apr 07 '24 00:04 qfrank

When run TestMessengerRawMessageResendTestSuite/TestMessageSent locally for 100 times, there is a possible to get following error:

        	Error:      	Received unexpected error:
        	            	MessengerResponse data not received
        	Test:       	TestMessengerRawMessageResendTestSuite/TestMessageSent

relate test code:

	waitOnMessengerResponse(&s.Suite, func(r *protocol.MessengerResponse) error {
		for _, m := range r.Messages() {
			if m.GetContentType() == protobuf.ChatMessage_CONTACT_REQUEST {
				contactRequest = m
				return nil
			}
		}
		return errors.New("contact request not received")
	}, s.bobMessenger)

After checking code and geth.log , I found the relate error:

ERROR[04-09|13:38:50.563|github.com/status-im/status-go/wakuv2/waku.go:1003]                                could not send message                   envelopeHash=0x2fad60891798d51c94298ae6cdc9ca1d54be4d8d6c52c5739c08bf77388bbeed pubsubTopic=/waku/2/rs/16/32 contentTopic=/waku/1/0xaeffe08b/rfc26 timestamp=1,712,641,130,562,699,000 error="not enough peers to publish"
image image

If you know something on this, do tell me, thank you! @vitvly cc @cammellos

qfrank avatar Apr 09 '24 06:04 qfrank

When run TestMessengerRawMessageResendTestSuite/TestMessageSent locally for 100 times, there is a possible to get following error:

        	Error:      	Received unexpected error:
        	            	MessengerResponse data not received
        	Test:       	TestMessengerRawMessageResendTestSuite/TestMessageSent

relate test code:

	waitOnMessengerResponse(&s.Suite, func(r *protocol.MessengerResponse) error {
		for _, m := range r.Messages() {
			if m.GetContentType() == protobuf.ChatMessage_CONTACT_REQUEST {
				contactRequest = m
				return nil
			}
		}
		return errors.New("contact request not received")
	}, s.bobMessenger)

After checking code and geth.log , I found the relate error:

ERROR[04-09|13:38:50.563|github.com/status-im/status-go/wakuv2/waku.go:1003]                                could not send message                   envelopeHash=0x2fad60891798d51c94298ae6cdc9ca1d54be4d8d6c52c5739c08bf77388bbeed pubsubTopic=/waku/2/rs/16/32 contentTopic=/waku/1/0xaeffe08b/rfc26 timestamp=1,712,641,130,562,699,000 error="not enough peers to publish"

image image If you know something on this, do tell me, thank you! @vitvly cc @cammellos

Solved by custom DiscV5BootstrapNode, if you have better solution, welcome! cc @richard-ramos

qfrank avatar Apr 09 '24 12:04 qfrank

Unfortunately, the error still can be reproduced, but the frequency seems lower than before(6/100). @richard-ramos

    messenger_raw_message_resend_test.go:292: response error: contact request not received
    messenger_raw_message_resend_test.go:298: 
        	Error Trace:	/Users/frank/development/go/src/github.com/status-im/status-go/api/messenger_raw_message_resend_test.go:298
        	            				/Users/frank/development/go/src/github.com/status-im/status-go/api/messenger_raw_message_resend_test.go:260
        	            				/Users/frank/development/go/src/github.com/status-im/status-go/api/messenger_raw_message_resend_test.go:61
        	            				/Users/frank/development/go/src/github.com/status-im/status-go/vendor/github.com/stretchr/testify/suite/suite.go:187
        	Error:      	Received unexpected error:
        	            	MessengerResponse data not received
        	Test:       	TestMessengerRawMessageResendTestSuite/TestMessageSent

qfrank avatar Apr 10 '24 02:04 qfrank

I have run this a couple of times succesfully:

go test -run TestMessengerRawMessageResendTestSuite/TestMessageSent github.com/status-im/status-go/api -count 100 -v -timeout 10000s

please share the logs for when there's a failure. I'm thinking that adding a 2s sleep in the test after initializing the nodes should help, since WakuRelay takes some time to form a mesh

richard-ramos avatar Apr 10 '24 07:04 richard-ramos

I have run this a couple of times succesfully:

go test -run TestMessengerRawMessageResendTestSuite/TestMessageSent github.com/status-im/status-go/api -count 100 -v -timeout 10000s

please share the logs for when there's a failure. I'm thinking that adding a 2s sleep in the test after initializing the nodes should help, since WakuRelay takes some time to form a mesh

image emm.. this time got less, only 2/100.

geth.log.zip @richard-ramos

qfrank avatar Apr 10 '24 08:04 qfrank

more specific and smaller range logs: cut_console.log cut_geth.log @richard-ramos

qfrank avatar Apr 12 '24 04:04 qfrank

FYI, append these lines into createAliceBobBackendAndLogin fixed issue:

        aliceWaku := s.aliceBackend.StatusNode().WakuV2Service()
	bobWaku := s.bobBackend.StatusNode().WakuV2Service()
	// NOTE: default MaxInterval is 10s, which is too short for the test
	err = tt.RetryWithBackOff(func() error {
		if len(aliceWaku.Peerstore().Addrs(bobWaku.PeerID())) > 0 {
			return nil
		}
		s.T().Logf("alice don't know bob's addresses")
		return errors.New("alice don't know bob's addresses")
	}, func(b *backoff.ExponentialBackOff){b.MaxInterval = 20 * time.Second})
	s.Require().NoError(err)
	err = tt.RetryWithBackOff(func() error {
		if len(bobWaku.Peerstore().Addrs(aliceWaku.PeerID())) > 0 {
			return nil
		}
		s.T().Logf("bob don't know alice's addresses")
		return errors.New("bob don't know alice's addresses")
	}, func(b *backoff.ExponentialBackOff){b.MaxInterval = 20 * time.Second})
	s.Require().NoError(err)
	s.Require().NoError(aliceWaku.DialPeerByID(bobWaku.PeerID().String()))
	s.Require().NoError(bobWaku.DialPeerByID(aliceWaku.PeerID().String()))

cc @richard-ramos @cammellos @vitvly @chaitanyaprem cheers!

qfrank avatar Apr 30 '24 15:04 qfrank