[RFC] Parent selection based on node state awareness

Open baowj-678 opened this issue 1 year ago • 5 comments

Introduction

Feature request:

Dragonfly is an efficient, stable, and secure P2P-based file distribution and image acceleration tool. Currently, however, Dragonfly selects the Parent for each piece download on a first-come, first-served (FCFS) basis: a Piece is downloaded from whichever Parent returns its Piece Metadata first. This selection method cannot perceive dynamic changes in Parent node state (network bandwidth, disk I/O) and therefore cannot fully utilize the available bandwidth.

Therefore, I propose a download node selection method based on Parent state awareness, which will be introduced in detail below.

Design

Architecture

The following is the overall architecture diagram of the design, which consists of three main parts: ParentStateSyncer, ParentStateServer, and PieceCollector.

[Figure: dragonfly bandwidth-awareness implementation (drawio diagram)]

Modules

  • ParentStateServer: A background daemon thread on the upload server side. It periodically measures the local network bandwidth and disk bandwidth, calculates the local node state, and sends the latest state to every connection in the SyncHost connection set, which is maintained by an LRU cache;
  • ParentStateSyncer: A background daemon thread on the client side. It uses an LRU cache to maintain the set of Parents whose states need to be synchronized, and sends SyncHost requests to synchronize the state of every Parent in the cache;
  • PieceCollector: Retrieves the states of the Parents it follows from ParentStateSyncer, then selects the optimal Parent to download from and its corresponding piece.
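
As a rough illustration of how ParentStateServer might turn periodic counter readings into a node state, here is a minimal Python sketch. The RFC does not specify the formula, so `CounterSample`, `rates`, and `idle_ratio` are hypothetical names, and the idle-ratio definition is an assumption, not Dragonfly's actual calculation.

```python
from dataclasses import dataclass


@dataclass
class CounterSample:
    """Cumulative counters read at one instant (hypothetical structure)."""
    timestamp: float      # seconds
    net_tx_bytes: int     # total bytes sent so far
    disk_read_bytes: int  # total bytes read from disk so far


def rates(prev, curr):
    """Return (network tx, disk read) throughput in bytes/s between two samples."""
    elapsed = curr.timestamp - prev.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be taken at increasing timestamps")
    return (
        (curr.net_tx_bytes - prev.net_tx_bytes) / elapsed,
        (curr.disk_read_bytes - prev.disk_read_bytes) / elapsed,
    )


def idle_ratio(used_rate, capacity):
    """Fraction of capacity still free, clamped to [0, 1] (assumed state metric)."""
    if capacity <= 0:
        return 0.0
    return min(1.0, max(0.0, 1.0 - used_rate / capacity))
```

For example, a 100 Mbps link has a capacity of 12,500,000 bytes/s, so a parent currently sending 10,000,000 bytes/s would report an idle ratio of about 0.2 under this definition.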

Download Process

  1. When a download starts, PieceCollector registers the scheduled parents in the LRU cache of ParentStateSyncer;
  2. ParentStateSyncer synchronizes the parents' states from ParentStateServer in the background;
  3. PieceCollector periodically refreshes the states of the parents it follows from ParentStateSyncer;
  4. PieceCollector picks a parent and its corresponding piece using the node selection method.
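
Steps 1 to 3 revolve around an LRU cache of followed parents. Below is a minimal Python sketch of such a cache, assuming states are plain values pushed in by a background syncer. `ParentStateCache`, `ParentState`, and their fields are hypothetical names chosen for illustration; the real client may structure this differently.

```python
from collections import OrderedDict
from dataclasses import dataclass


@dataclass
class ParentState:
    """Hypothetical snapshot of a parent's state; the fields are assumptions."""
    net_idle_ratio: float
    disk_idle_ratio: float


class ParentStateCache:
    """LRU-bounded map from parent host ID to its last known state."""

    def __init__(self, capacity=50):
        self.capacity = capacity
        self._entries = OrderedDict()  # host ID -> ParentState or None

    def register(self, host_id):
        """Start following a parent; evict the least recently used one if full."""
        if host_id in self._entries:
            self._entries.move_to_end(host_id)
            return
        if len(self._entries) >= self.capacity:
            self._entries.popitem(last=False)
        self._entries[host_id] = None

    def update(self, host_id, state):
        """Store a synced state; ignore parents that are no longer followed."""
        if host_id in self._entries:
            self._entries[host_id] = state

    def get(self, host_id):
        """Return the last known state (or None) and mark the parent as used."""
        if host_id not in self._entries:
            return None
        self._entries.move_to_end(host_id)
        return self._entries[host_id]

    def hosts(self):
        """Followed parents, least recently used first."""
        return list(self._entries)
```

In this sketch, PieceCollector would call `register` in step 1, the syncer would call `update` in step 2, and `get` would serve step 3.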

Node Selection Method

[Figure: dragonfly bandwidth-awareness algorithm (drawio diagram)]

  1. The downloaded piece metadata is stored in separate queues, one per parent;
  2. A parent is selected using a random number, based on the parents' states;
  3. [normal case] If the selected parent's queue contains piece metadata, take the first element directly;
  4. [queue empty] If the selected parent's queue is empty, move on to the next parent's queue in order;
  5. [piece finished] If the piece at the head of the queue has already been downloaded, skip it and continue until an undownloaded piece is found or the queue is empty.
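
The five steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the RFC does not say how a parent's state maps to a selection probability, so the sketch assumes each parent has a non-negative numeric weight and uses weighted random choice. `ParentSelector` and its methods are hypothetical names, and a piece is treated as downloaded as soon as it is handed out.

```python
import random
from collections import deque


class ParentSelector:
    """Per-parent piece queues with state-weighted random selection."""

    def __init__(self, rng=None):
        self.queues = {}    # parent ID -> deque of piece numbers (step 1)
        self.weights = {}   # parent ID -> assumed non-negative weight
        self.done = set()   # piece numbers already handed out
        self.rng = rng or random.Random()

    def set_weight(self, parent, weight):
        self.weights[parent] = weight

    def add_metadata(self, parent, piece):
        self.queues.setdefault(parent, deque()).append(piece)

    def next(self):
        """Return (parent, piece) for the next download, or None if nothing is left."""
        parents = list(self.queues)
        if not parents:
            return None
        weights = [self.weights.get(p, 1.0) for p in parents]
        if sum(weights) <= 0:
            weights = [1.0] * len(parents)  # fall back to a uniform choice
        # Step 2: pick a starting parent at random, weighted by state.
        start = self.rng.choices(range(len(parents)), weights=weights, k=1)[0]
        # Step 4: if that queue yields nothing, try the next parents in order.
        for offset in range(len(parents)):
            parent = parents[(start + offset) % len(parents)]
            queue = self.queues[parent]
            while queue:
                piece = queue.popleft()      # step 3: take the first element
                if piece not in self.done:   # step 5: skip finished pieces
                    self.done.add(piece)
                    return parent, piece
        return None
```

With weights of 1.0 for one parent and 0.0 for another, the first parent is always tried first, and the second is only used once the first queue runs dry.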

Configuration

upload:
  # Configuration for HostSyncer.
  syncer:
    # enable indicates whether to enable HostSyncer.
    enable: true
    # interval is the interval at which hosts' info is synced.
    interval: 3s
    # cacheCapacity is the capacity of the LRU cache for HostSyncer gRPC connections, default is 50.
    cacheCapacity: 50
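
To make the meaning of these fields concrete, here is a small Python sketch that reads the same keys from an already-parsed mapping. The names `SyncerConfig` and `parse_duration` are hypothetical. Only the default capacity of 50 comes from the comment above; the other defaults and the set of accepted duration units are assumptions.

```python
import re
from dataclasses import dataclass

# Assumed duration units; the RFC only shows "3s".
_UNITS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}


def parse_duration(text):
    """Convert a string such as '3s' or '500ms' into seconds."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(ms|s|m|h)", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = match.groups()
    return float(value) * _UNITS[unit]


@dataclass
class SyncerConfig:
    enable: bool = False            # assumed default
    interval_seconds: float = 3.0   # assumed default
    cache_capacity: int = 50        # default stated in the config comment

    @classmethod
    def from_mapping(cls, raw):
        """Build the config from a dict shaped like the YAML above."""
        syncer = raw.get("upload", {}).get("syncer", {})
        return cls(
            enable=bool(syncer.get("enable", False)),
            interval_seconds=parse_duration(str(syncer.get("interval", "3s"))),
            cache_capacity=int(syncer.get("cacheCapacity", 50)),
        )
```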

API Definition

message SyncHostRequest {
  // Host id.
  string host_id = 1;
  // Peer id.
  string peer_id = 2;
}

// DfdaemonUpload represents the upload service of dfdaemon.
service DfdaemonUpload {
  // SyncHost syncs the parents' state as a server-side stream.
  rpc SyncHost(SyncHostRequest) returns (stream common.v2.Host);
}
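
Because SyncHost returns a stream, one request yields a sequence of state messages over time. The Python generator below imitates that shape without any gRPC dependency. It is a sketch only: `sync_host`, `read_state`, `max_messages`, and the injectable `sleep` are hypothetical, and the real handler would be generated from the proto definition above.

```python
import time
from dataclasses import dataclass


@dataclass
class SyncHostRequest:
    """Mirrors the two fields of the proto message."""
    host_id: str
    peer_id: str


def sync_host(request, read_state, interval_seconds=3.0,
              max_messages=None, sleep=time.sleep):
    """Yield the current host state repeatedly, like a server-streaming RPC.

    read_state is a callable returning the latest state. max_messages bounds
    the stream for demonstration; None streams until the consumer stops.
    """
    if not request.host_id:
        raise ValueError("host_id must not be empty")
    sent = 0
    while max_messages is None or sent < max_messages:
        yield read_state()
        sent += 1
        if max_messages is None or sent < max_messages:
            sleep(interval_seconds)
```

A client-side syncer would iterate over this stream and store each message as the parent's latest state.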

baowj-678, Dec 19 '24 11:12

Actions:

  • API definition, week1
  • configuration, week1
  • upload server, week1
  • parent selector, week2
  • piece collector, week2
  • tests (unit test, e2e test, stress test), week3

baowj-678, Dec 24 '24 06:12

Impressive design! Regarding the part about perceiving changes in Parent node state: collecting node metrics is a complex task. Perhaps we could use OpenTelemetry metrics to interface with the data collection daemons that already exist in most production environments, rather than implementing this part ourselves. That way, we would only need to design a mechanism that adjusts the scheduling weight of the current node based on those metrics.

CormickKneey, Dec 24 '24 06:12

Your suggestion is very good, but we see two issues. First, the node state data our method relies on (such as real-time bandwidth) must be highly up to date, which may not be achievable if it is collected through OpenTelemetry. Second, our approach is part of Dragonfly's basic functionality, and depending on OpenTelemetry may introduce excessive dependencies into the project.

baowj-678, Dec 25 '24 12:12

Test locally

I set up a Dragonfly cluster locally using Docker to test the ParentSelector feature. I started a seed peer and a peer as parents (with bandwidth limited to 100 Mbps using tc), and ran iperf3 in the seed peer container to simulate bandwidth being occupied. I then launched a local peer and used dfget to run file download tests through it.

The two test runs differ in the following setting:

  • Enable ParentSelector: download.parentSelector.enable=true
  • Disable ParentSelector: download.parentSelector.enable=false

Settings

Peers:

  • A Seed Peer (running iperf3, as parent)
  • A Peer (as parent)
  • A Local Peer (running dfget to test)

Target File:

  • Name: random_file (generated by dd if=/dev/urandom)
  • Size: 1GB

Result

Enable ParentSelector: 76s

Disable ParentSelector: 112s
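
For reference, the relative improvement implied by these two timings can be worked out with a few lines of Python. This is plain arithmetic on the reported numbers, not an additional measurement.

```python
enabled_s = 76    # download time with ParentSelector enabled
disabled_s = 112  # download time with ParentSelector disabled

speedup = disabled_s / enabled_s
time_saved = (disabled_s - enabled_s) / disabled_s

print(f"speedup: {speedup:.2f}x, time saved: {time_saved:.1%}")
# prints "speedup: 1.47x, time saved: 32.1%"
```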

Video links

  • Enable ParentSelector: https://pan.baidu.com/s/1NExIVdwI2O8lbmyPsoy4aQ?pwd=mw8p
  • Disable ParentSelector: https://pan.baidu.com/s/14FzaBLcK1CwSrUA5n32egg?pwd=y6ej

baowj-678, Jan 21 '25 08:01

Test

SouthWest7, Nov 21 '25 06:11