Add simple PUT retry logic when initial peer doesn't acknowledge

Open sanity opened this issue 5 months ago • 1 comments

Simple PUT Retry Logic

Problem

When a client initiates a PUT operation, if the selected target peer doesn't respond with SuccessfulPut, the operation fails permanently. This causes:

  • River chat updates to fail silently
  • Poor user experience when a single peer is unreachable
  • Unnecessary failures when alternative peers are available

Current Behavior

// In request_put() 
let target = op_manager
    .ring
    .closest_potentially_caching(&key, [&sender.peer].as_slice())
    .into_iter()
    .next()
    .ok_or(RingError::EmptyRing)?;

// Send RequestPut to target
// If no SuccessfulPut received → operation fails permanently

Proposed Solution

Add simple retry logic with alternative peers:

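// PutState is an enum: AwaitingResponse is one of its variants
pub enum PutState {
    // ... existing variants ...
    AwaitingResponse {
        key: ContractKey,
        // ... other fields ...
        retry_count: usize,            // retries already sent for this PUT
        tried_peers: HashSet<PeerId>,  // peers that did not acknowledge
    }
}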

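// When timeout occurs (no SuccessfulPut within ~500ms-2s):
if retry_count < MAX_RETRIES {
    // Record the peer that just timed out so it is excluded below
    // (`current_peer` is a placeholder name for that peer)
    tried_peers.insert(current_peer);

    // Get alternative peers, skipping everything already tried
    let candidates = op_manager
        .ring
        .k_closest_potentially_caching(&key, &tried_peers, 5);

    if let Some(next_peer) = candidates.first() {
        // Send RequestPut to next_peer
        retry_count += 1;
    }
    // No untried candidate left → operation fails
}
// retry_count == MAX_RETRIES → operation fails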
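
To make the "skip already-tried peers" step concrete, here is a minimal, self-contained sketch. It does not use the real freenet-core types: `PeerId` is a plain integer stand-in, the ring is modelled as a list of (peer, distance-to-key) pairs, and `k_closest_untried` is a hypothetical helper that only imitates what `k_closest_potentially_caching` is expected to do with its skip list.

```rust
use std::collections::HashSet;

// Hypothetical stand-in for the real PeerId type.
type PeerId = u32;

/// Returns up to `k` peers that have not been tried yet, closest first.
/// `ring` is a toy list of (peer, distance-to-key) pairs, not the real Ring API.
fn k_closest_untried(ring: &[(PeerId, u64)], tried: &HashSet<PeerId>, k: usize) -> Vec<PeerId> {
    let mut candidates: Vec<(PeerId, u64)> = ring
        .iter()
        .copied()
        .filter(|(peer, _)| !tried.contains(peer))
        .collect();
    // Smallest distance first, so the head of the list is the best retry target.
    candidates.sort_by_key(|&(_, distance)| distance);
    candidates.into_iter().take(k).map(|(peer, _)| peer).collect()
}

fn main() {
    let ring: [(PeerId, u64); 4] = [(1, 10), (2, 30), (3, 20), (4, 40)];

    // Peer 1 was the original target and timed out.
    let mut tried: HashSet<PeerId> = HashSet::new();
    tried.insert(1);

    let next = k_closest_untried(&ring, &tried, 2);
    println!("{:?}", next);
}
```

The important property is that the timed-out peer is in `tried` before the lookup runs; otherwise the closest peer would be selected again on every retry.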

Key Points

  1. Simple retry only: Only the initial PUT request (the first hop from the requesting node) is retried, each time against a different peer
  2. No propagation tracking: Once any peer replies with SuccessfulPut, that peer is responsible for the rest of the PUT
  3. Fast timeout: 500ms-2s per attempt (not the 60-second operation TTL)
  4. Limited retries: ~5-10 attempts max to avoid infinite loops
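
As a quick sanity check on points 3 and 4, the worst-case time spent waiting can be compared with the 60-second operation TTL. The constants below are assumptions taken from the upper ends of the ranges above (2 s per attempt, 10 retries after the initial attempt); they are not values from the codebase.

```rust
use std::time::Duration;

// Assumed values, picked from the ranges quoted in the issue.
const MAX_RETRIES: u32 = 10;
const ATTEMPT_TIMEOUT: Duration = Duration::from_secs(2);
const OPERATION_TTL: Duration = Duration::from_secs(60);

/// Worst case: the initial attempt and every retry all run into the timeout.
fn worst_case_wait(max_retries: u32, attempt_timeout: Duration) -> Duration {
    attempt_timeout * (max_retries + 1)
}

fn main() {
    let worst = worst_case_wait(MAX_RETRIES, ATTEMPT_TIMEOUT);
    // The retry budget must fit inside the overall operation TTL.
    assert!(worst < OPERATION_TTL);
    println!("worst case: {:?}", worst);
}
```

Under these assumptions the retry budget is 22 seconds, well inside the operation TTL, so the retries can finish before the operation itself expires.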

Implementation Approach

  1. Add retry fields to PutState::AwaitingResponse
  2. Add timeout detection in PUT operation processing
  3. On timeout, select next peer and retry
  4. On SuccessfulPut, complete operation normally
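
The four steps above can be sketched as a small state machine. This is an illustration under stated assumptions, not the freenet-core implementation: `PeerId` is an integer stand-in, the `Finished` and `Failed` variants, the `Event` enum and the `step` function are hypothetical, `MAX_RETRIES` is an assumed value, and `candidates` replaces the ring lookup with a list that is already sorted closest first.

```rust
use std::collections::HashSet;

type PeerId = u32; // hypothetical stand-in for the real type

const MAX_RETRIES: usize = 5; // assumed limit

#[derive(Debug, PartialEq)]
enum PutState {
    AwaitingResponse {
        target: PeerId,
        retry_count: usize,
        tried_peers: HashSet<PeerId>,
    },
    Finished,
    Failed,
}

enum Event {
    SuccessfulPut,
    Timeout,
}

/// Advances the PUT by one event. `candidates` stands in for the ring lookup.
fn step(state: PutState, event: Event, candidates: &[PeerId]) -> PutState {
    match (state, event) {
        // Step 4: an acknowledgement completes the operation normally.
        (PutState::AwaitingResponse { .. }, Event::SuccessfulPut) => PutState::Finished,
        // Steps 2 and 3: on timeout, exclude the silent peer and pick the next one.
        (PutState::AwaitingResponse { target, retry_count, mut tried_peers }, Event::Timeout) => {
            tried_peers.insert(target);
            if retry_count >= MAX_RETRIES {
                return PutState::Failed;
            }
            let next = candidates.iter().copied().find(|p| !tried_peers.contains(p));
            match next {
                Some(next) => PutState::AwaitingResponse {
                    target: next,
                    retry_count: retry_count + 1,
                    tried_peers,
                },
                None => PutState::Failed, // nobody left to try
            }
        }
        // Terminal states ignore further events.
        (done, _) => done,
    }
}

fn main() {
    let start = PutState::AwaitingResponse {
        target: 1,
        retry_count: 0,
        tried_peers: HashSet::new(),
    };
    let retried = step(start, Event::Timeout, &[1, 2, 3]);
    println!("{:?}", retried);
    let done = step(retried, Event::SuccessfulPut, &[1, 2, 3]);
    println!("{:?}", done);
}
```

Keeping the transition a pure function of state and event makes the timeout path easy to unit-test without a network, which fits the "simple, minimal code changes" goal.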

Success Criteria

  • PUT operations succeed even when initial target peer is unreachable
  • No changes to PUT propagation logic
  • No protocol changes required
  • Simple, minimal code changes

Priority

High - This directly impacts River chat reliability and user experience

sanity, Jun 14 '25 16:06

Looks good.

iduartgomez, Jun 14 '25 16:06

Fixed in a different way.

iduartgomez, Oct 02 '25 17:10