status-go
status-go copied to clipboard
bug: message stuck in sending status for a long time
Problem
Have noticed this multiple times and even during dogfooding session, a message that is sent whose status is still shown as sending. Eventhough the message is received by other side in 1:1 chat, group chat and in this case community.
Attaching the logs for reference.
MessageID of the message is 0xf37863ffcb2fcc6d7d05653a25db2ef4d808fa4e1a6cf584b5ddc4587ff82648
Even after 5 minutes, message is still shown as sending status.
Acceptance Criteria
Message should not be showing sending status for soo long especially if it is already delivered.
Notes
Uploading more logs @kaichaosun geth.log.gz
The timeline for message 0xf37863ffcb2fcc6d7d05653a25db2ef4d808fa4e1a6cf584b5ddc4587ff82648,
- sent out at ~11:04~ 11:01:22
- store hash query at ~11:07~ 11:04:33
- the success of query result at ~11:11~ 11:07:19
- EventEnvelopeSent triggered at ~11:11~ 11:07:19
The period of ~3s~ 3minutes between send out and send query request is ~expected~ unexpected, and it should take up to 6 seconds.
The period of ~4s~ 3minutes between send query request and get request response is also too long and unexpected.
~But it only takes around 7 seconds~, do you mean it's still showing sending after 5 minutes for this message or is there other message for this long delay? @chaitanyaprem
But it only takes around 7 seconds, do you mean it's still showing
sendingafter 5 minutes for this message or is there other message for this long delay? @chaitanyaprem
It was for this message, screenshot was taken after 5 minutes of sending the message around local time 11:06AM and it was still showing as sending. I did not notice when status has finally changed, but now (after an hour and half) i don't see sending any more.
It seems the store request takes really long, also the relay may also takes long since it's adding the message hash for query only after relay publish.
The request ID for hash query is 0x65d92cd709449fabc5f9681e38906f213556bf2572f673431ff1fb51a0a9b1be, @Ivansete-status you may know how to get the timeline of the query in nwaku server?
@chaitanyaprem It looks like the network connection can be the cause of the issue, or maybe I'm looking into the wrong branch, where can I find the version you are using for this test?
I find the some info in elastic, here
It takes pretty long to get the response.
Wow....that is a long time. cc @Ivansete-status store query took close to 3 mins.
I am also wondering what is the behaviour for the following scenario? The reason for this query is we have observed this behaviour even in case of 1:1 chat during our dogfooding session.
What if userA sends a message to UserB in 1:1 chat and mvds also acks receipt of message by UserB. Will we still depend on store-query to change sending status to sent at UserA's side? If so, then we need to fix it.
Ideally status can be changed by any sort of ack either by mvds, store-query or e2e protocol that is being built at a later point in time.
Will we still depend on store-query to change sending status to sent at UserA's side? If so, then we need to fix it.
No, the ack message will also change the state from outging (or sending) to sent.
Ideally status can be changed by any sort of ack either by mvds, store-query or e2e protocol that is being built at a later point in time.
This is a good idea, we need to encapsulate the interface for better usage in different reliability solutions.
Ideally status can be changed by any sort of ack either by mvds, store-query or e2e protocol that is being built at a later point in time.
This is a good idea, we need to encapsulate the interface for better usage in different reliability solutions.
Are we tracking this somewhere?
@fryorcraken not yet, it could be one of the requirements when implementing e2e reliability. I believe it may include a lot of refactors within status-go, for example mvds. Or we can make e2e reliability and mvds both in waku-sdk.
As it can be seen in the description of the following issue, the messageHash queries are taking more than 5 minutes, on average (see the second query:) https://github.com/waku-org/nwaku/issues/2895
@kaichaosun , @chaitanyaprem - ~~is this happening only with store-v3?~~ edited: I confirmed that only happens from store-v3
This is fixed with the improvements in nwaku.
Hey @Ivansete-status I noticed there are store request takes more than 30s, is it expected?
Hi @kaichaosun ! Yes, we have right now queries that take some minutes. That will be fixed when https://github.com/waku-org/nwaku/issues/2895 is deployed ( this month , August )
@chaitanyaprem - a new version with a fix candidate has been deployed in status.prod. Kindly check whether the issue is sorted out. Thanks!
