Lighthouse unable to reliably serve `DataColumnsByRange`

Open jimmygchen opened this issue 7 months ago • 0 comments

Description

When testing sync locally, a syncing Lighthouse node isn't able to reliably download data columns from its Lighthouse peer. The network appears healthy, with 100% participation and all peers in sync. However, the peer returns 0 columns most of the time. See logs below.

To reproduce:

  1. Start a local testnet with the network_params_das_local.yaml config
  2. Stop one Lighthouse node and wait 2-3 epochs, so that it falls far enough behind to trigger range sync
  3. Start the Lighthouse node again. Sync gets stuck pretty quickly, and peers don't return columns

Version

das branch

Present Behaviour

Logs from syncing node:

Jul 15 07:27:09.430 DEBG Sending DataColumnsByRange requests, peer: 16Uiu2HAm4HH1Bioy2Qp75Rymx2X6TZwTnWeGwmjLKMHAZ5evVmcc, columns: [16, 24, 50, 64, 80, 88], epoch: 0, count: 32, method: DataColumnsByRange, service: sync, module: network::sync::network_context:398
Jul 15 07:27:09.430 DEBG Sending DataColumnsByRange requests, peer: 16Uiu2HAmKxsGh6atb4WQDVz3gb37XaJbi3MZNW8ZRsSwkBu7URYT, columns: [0, 114], epoch: 0, count: 32, method: DataColumnsByRange, service: sync, module: network::sync::network_context:398
Jul 15 07:27:09.430 DEBG Sending DataColumnsByRange requests, peer: 16Uiu2HAm4HH1Bioy2Qp75Rymx2X6TZwTnWeGwmjLKMHAZ5evVmcc, columns: [0, 16, 24, 50, 64, 80, 88, 114], epoch: 1, count: 32, method: DataColumnsByRange, service: sync, module: network::sync::network_context:398
Jul 15 07:27:09.431 DEBG Sending DataColumnsByRange requests, peer: 16Uiu2HAm4HH1Bioy2Qp75Rymx2X6TZwTnWeGwmjLKMHAZ5evVmcc, columns: [0, 16, 24, 80, 88], epoch: 2, count: 32, method: DataColumnsByRange, service: sync, module: network::sync::network_context:398
Jul 15 07:27:09.431 DEBG Sending DataColumnsByRange requests, peer: 16Uiu2HAmKxsGh6atb4WQDVz3gb37XaJbi3MZNW8ZRsSwkBu7URYT, columns: [50, 64, 114], epoch: 2, count: 32, method: DataColumnsByRange, service: sync, module: network::sync::network_context:398
Jul 15 07:27:09.431 DEBG Sending DataColumnsByRange requests, peer: 16Uiu2HAmKxsGh6atb4WQDVz3gb37XaJbi3MZNW8ZRsSwkBu7URYT, columns: [0, 24, 50, 64, 88, 114], epoch: 3, count: 32, method: DataColumnsByRange, service: sync, module: network::sync::network_context:398
Jul 15 07:27:09.431 DEBG Sending DataColumnsByRange requests, peer: 16Uiu2HAm4HH1Bioy2Qp75Rymx2X6TZwTnWeGwmjLKMHAZ5evVmcc, columns: [16, 80], epoch: 3, count: 32, method: DataColumnsByRange, service: sync, module: network::sync::network_context:398
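To line these requests up with the peer's logs, it helps to convert each request's `epoch` into a `start_slot`. The following is a minimal sketch, not part of Lighthouse: `parse_request` and `SLOTS_PER_EPOCH` are hypothetical names, and the value 32 is an assumption inferred from `count: 32` in the log lines above.

```python
import re

# Assumption: 32 slots per epoch, inferred from `count: 32` in the logs.
SLOTS_PER_EPOCH = 32

# Matches the fields of a "Sending DataColumnsByRange requests" log line.
REQUEST_RE = re.compile(
    r"peer: (?P<peer>\w+), columns: \[(?P<columns>[^\]]*)\], "
    r"epoch: (?P<epoch>\d+), count: (?P<count>\d+)"
)


def parse_request(line):
    """Extract peer, columns, epoch and the derived start_slot from one log line.

    Returns None if the line is not a DataColumnsByRange request line.
    """
    m = REQUEST_RE.search(line)
    if m is None:
        return None
    epoch = int(m.group("epoch"))
    return {
        "peer": m.group("peer"),
        "columns": [int(c) for c in m.group("columns").split(",") if c.strip()],
        "epoch": epoch,
        "start_slot": epoch * SLOTS_PER_EPOCH,  # derived, not logged
        "count": int(m.group("count")),
    }


# Two request lines copied from the logs above.
SAMPLE_LINES = [
    "Jul 15 07:27:09.430 DEBG Sending DataColumnsByRange requests, peer: 16Uiu2HAm4HH1Bioy2Qp75Rymx2X6TZwTnWeGwmjLKMHAZ5evVmcc, columns: [16, 24, 50, 64, 80, 88], epoch: 0, count: 32, method: DataColumnsByRange, service: sync, module: network::sync::network_context:398",
    "Jul 15 07:27:09.431 DEBG Sending DataColumnsByRange requests, peer: 16Uiu2HAm4HH1Bioy2Qp75Rymx2X6TZwTnWeGwmjLKMHAZ5evVmcc, columns: [16, 80], epoch: 3, count: 32, method: DataColumnsByRange, service: sync, module: network::sync::network_context:398",
]

if __name__ == "__main__":
    for line in SAMPLE_LINES:
        req = parse_request(line)
        print(req["peer"], req["epoch"], req["start_slot"], req["columns"])
```

Under that assumption, epochs 0 to 3 map to start slots 0, 32, 64 and 96.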

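Logs from supernode peer (16Uiu2HAm4HH1Bioy2Qp75Rymx2X6TZwTnWeGwmjLKMHAZ5evVmcc): Note that only the request for slots 0-32 (start_slot: 0) returned any data: 114 data columns, which is ~19 blocks given the 6 columns requested (114 / 6 = 19). The requests starting at slots 32, 64 and 96 all returned 0 columns.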

Jul 15 07:27:09.437 DEBG Received DataColumnsByRange Request, start_slot: 32, count: 32, peer_id: 16Uiu2HAmTBJCHSjTZDGRKAU5vaq3wgomDW7FZ8SpJbK5xWe4N6CY, module: network::network_beacon_processor::rpc_methods:960
Jul 15 07:27:09.438 DEBG DataColumnsByRange Response processed, returned: 0, requested: 32, current_slot: 133, start_slot: 32, peer: 16Uiu2HAmTBJCHSjTZDGRKAU5vaq3wgomDW7FZ8SpJbK5xWe4N6CY, module: network::network_beacon_processor::rpc_methods:1119
Jul 15 07:27:09.439 DEBG Received DataColumnsByRange Request, start_slot: 0, count: 32, peer_id: 16Uiu2HAmTBJCHSjTZDGRKAU5vaq3wgomDW7FZ8SpJbK5xWe4N6CY, module: network::network_beacon_processor::rpc_methods:960
Jul 15 07:27:09.449 DEBG DataColumnsByRange Response processed, returned: 114, requested: 32, current_slot: 133, start_slot: 0, peer: 16Uiu2HAmTBJCHSjTZDGRKAU5vaq3wgomDW7FZ8SpJbK5xWe4N6CY, module: network::network_beacon_processor::rpc_methods:1119
Jul 15 07:27:09.450 DEBG Received DataColumnsByRange Request, start_slot: 96, count: 32, peer_id: 16Uiu2HAmTBJCHSjTZDGRKAU5vaq3wgomDW7FZ8SpJbK5xWe4N6CY, module: network::network_beacon_processor::rpc_methods:960
Jul 15 07:27:09.450 DEBG DataColumnsByRange Response processed, returned: 0, requested: 32, current_slot: 133, start_slot: 96, peer: 16Uiu2HAmTBJCHSjTZDGRKAU5vaq3wgomDW7FZ8SpJbK5xWe4N6CY, module: network::network_beacon_processor::rpc_methods:1119
Jul 15 07:27:09.450 DEBG Received DataColumnsByRange Request, start_slot: 64, count: 32, peer_id: 16Uiu2HAmTBJCHSjTZDGRKAU5vaq3wgomDW7FZ8SpJbK5xWe4N6CY, module: network::network_beacon_processor::rpc_methods:960
Jul 15 07:27:09.451 DEBG DataColumnsByRange Response processed, returned: 0, requested: 32, current_slot: 133, start_slot: 64, peer: 16Uiu2HAmTBJCHSjTZDGRKAU5vaq3wgomDW7FZ8SpJbK5xWe4N6CY, module: network::network_beacon_processor::rpc_methods:1119
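The pattern in the peer's responses can be summarised with a short script. This is only a sketch for reading the logs above: `summarize_responses` and `estimate_blocks` are hypothetical helpers, and the block estimate assumes every block with data yields exactly one data column per requested column index. The value 6 is the number of columns requested from this peer for epoch 0 in the syncing node's logs.

```python
import re

# Matches the fields of a "DataColumnsByRange Response processed" log line.
RESPONSE_RE = re.compile(
    r"DataColumnsByRange Response processed, returned: (?P<returned>\d+), "
    r"requested: (?P<requested>\d+), current_slot: (?P<current_slot>\d+), "
    r"start_slot: (?P<start_slot>\d+)"
)


def summarize_responses(lines):
    """Map each request's start_slot to the number of data columns returned."""
    summary = {}
    for line in lines:
        m = RESPONSE_RE.search(line)
        if m:
            summary[int(m.group("start_slot"))] = int(m.group("returned"))
    return summary


def estimate_blocks(returned_columns, columns_requested):
    """Rough count of blocks served, assuming one data column per requested
    column index for every block that has data."""
    return returned_columns / columns_requested


# The four response lines copied from the peer's logs above.
RESPONSE_LINES = [
    "Jul 15 07:27:09.438 DEBG DataColumnsByRange Response processed, returned: 0, requested: 32, current_slot: 133, start_slot: 32, peer: 16Uiu2HAmTBJCHSjTZDGRKAU5vaq3wgomDW7FZ8SpJbK5xWe4N6CY, module: network::network_beacon_processor::rpc_methods:1119",
    "Jul 15 07:27:09.449 DEBG DataColumnsByRange Response processed, returned: 114, requested: 32, current_slot: 133, start_slot: 0, peer: 16Uiu2HAmTBJCHSjTZDGRKAU5vaq3wgomDW7FZ8SpJbK5xWe4N6CY, module: network::network_beacon_processor::rpc_methods:1119",
    "Jul 15 07:27:09.450 DEBG DataColumnsByRange Response processed, returned: 0, requested: 32, current_slot: 133, start_slot: 96, peer: 16Uiu2HAmTBJCHSjTZDGRKAU5vaq3wgomDW7FZ8SpJbK5xWe4N6CY, module: network::network_beacon_processor::rpc_methods:1119",
    "Jul 15 07:27:09.451 DEBG DataColumnsByRange Response processed, returned: 0, requested: 32, current_slot: 133, start_slot: 64, peer: 16Uiu2HAmTBJCHSjTZDGRKAU5vaq3wgomDW7FZ8SpJbK5xWe4N6CY, module: network::network_beacon_processor::rpc_methods:1119",
]

if __name__ == "__main__":
    summary = summarize_responses(RESPONSE_LINES)
    for start_slot in sorted(summary):
        print(f"start_slot {start_slot}: {summary[start_slot]} columns returned")
    print("estimated blocks for start_slot 0:", estimate_blocks(summary[0], 6))
```

Only the request starting at slot 0 returns data, even though `current_slot` is 133, so all four requested ranges lie in the past.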

jimmygchen, Jul 16 '24 06:07