[LIBP2P] - Delay sending `ResponseMessage::NotFound`

Open lukaszrzasik opened this issue 9 months ago • 0 comments

What is this task and why do we need to work on it?

Currently, a node responds with `ResponseMessage::NotFound` immediately when it does not have the data needed to calculate VID. This behaviour can be problematic when the node has not yet received the DA proposal but will most likely receive it soon.

What work will need to be done to complete this task?

Delay sending `ResponseMessage::NotFound`. Wait for a predetermined timeout in the hope that the data required to calculate VID arrives in the meantime, so that the node can respond with `ResponseMessage::Found` instead.
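
A minimal sketch of the idea, using only the standard library. Everything here is an assumption made for illustration: the `ResponseMessage` enum is a simplified stand-in for the real HotShot type, and `respond_with_delay`, `lookup`, `timeout` and `poll_interval` are hypothetical names. The real implementation would most likely be async and be woken by the arrival of the DA proposal rather than poll.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Simplified stand-in for the real `ResponseMessage` (hypothetical payload type).
#[derive(Debug, PartialEq)]
enum ResponseMessage {
    Found(Vec<u8>),
    NotFound,
}

/// Hypothetical helper: keep retrying `lookup` until it yields data or
/// `timeout` elapses, and only then give up with `NotFound`.
fn respond_with_delay<F>(
    mut lookup: F,
    timeout: Duration,
    poll_interval: Duration,
) -> ResponseMessage
where
    F: FnMut() -> Option<Vec<u8>>,
{
    let deadline = Instant::now() + timeout;
    loop {
        // The data may have arrived since the last attempt.
        if let Some(data) = lookup() {
            return ResponseMessage::Found(data);
        }
        let now = Instant::now();
        if now >= deadline {
            // Timeout expired without the data showing up.
            return ResponseMessage::NotFound;
        }
        // Sleep for one poll interval, but never past the deadline.
        thread::sleep(poll_interval.min(deadline - now));
    }
}
```

With this shape, a request that arrives shortly before the DA proposal is answered with `Found` once the data appears, while a request for data that never arrives still gets `NotFound`, only later by at most `timeout`.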

Are there any other details to include?

No response

What are the acceptance criteria to close this issue?

This is hard to test because the behaviour is not deterministic. The most consistent way to reproduce the issue is to apply the following patch:

diff --git a/crates/hotshot/src/traits/networking/combined_network.rs b/crates/hotshot/src/traits/networking/combined_network.rs
index 3a56909..0ee74e3 100644
--- a/crates/hotshot/src/traits/networking/combined_network.rs
+++ b/crates/hotshot/src/traits/networking/combined_network.rs
@@ -502,7 +502,7 @@ impl<TYPES: NodeType> ConnectedNetwork<Message<TYPES>, TYPES::SignatureKey>
         });
         // View changed, let's start primary again
         self.primary_down.store(false, Ordering::Relaxed);
-        self.primary_fail_counter.store(0, Ordering::Relaxed);
+        // self.primary_fail_counter.store(0, Ordering::Relaxed);
     }
 
     fn is_primary_down(&self) -> bool {

and run the `test_combined_network_cdn_crash` test.

Branch work will be merged to (if not the default branch)

No response

lukaszrzasik · Apr 26 '24 12:04