Add a general timeout on bootstrap

Open AurelienFT opened this issue 2 years ago • 1 comments

Context

During bootstrap, the server waits for messages from the client it is connected to.

Problem

A client can request data from the server in a loop to keep the connection alive indefinitely. We should think of a way to prevent this. I see several possible solutions:

  • Ban a node when it asks for the same info multiple times in a short amount of time?
  • Ban a node when the client has been connected for too long

AurelienFT avatar Jun 01 '22 13:06 AurelienFT

Nodes are already "banned" as soon as they successfully connect.

Therefore, I think just closing the connection after a max time is ok :)

We just need to compute how long it takes to download 1TB at the expected connection speed, and use that to set the default timeout.
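
As a rough illustration of that computation, here is a minimal sketch. The function name `bootstrap_timeout`, the minimum speed, and the safety margin are all hypothetical and not taken from the massa codebase; the real values would have to come from measurements.

```rust
use std::time::Duration;

/// Hypothetical helper (not part of massa): worst-case time to stream
/// `ledger_bytes` at the slowest connection speed we are willing to serve,
/// plus a safety margin expressed in percent.
fn bootstrap_timeout(ledger_bytes: u64, min_bytes_per_sec: u64, margin_percent: u64) -> Duration {
    assert!(min_bytes_per_sec > 0, "minimum speed must be non-zero");
    // Ceiling division, so a partial second still counts as a full one.
    let secs = ledger_bytes / min_bytes_per_sec
        + u64::from(ledger_bytes % min_bytes_per_sec != 0);
    // Add the margin, saturating instead of overflowing on absurd inputs.
    let margin = secs.saturating_mul(margin_percent) / 100;
    Duration::from_secs(secs.saturating_add(margin))
}
```

With a 1TB ledger (10^12 bytes), an assumed floor of 10 MB/s, and a 20% margin, this gives 120,000 seconds, so a bit over 33 hours. The speed floor is the parameter that really needs justification.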

damip avatar Jul 11 '22 21:07 damip

This looks like a possible availability attack vector, no?

Have I found the right loop code here?

It is called here, and the request is configured with this "struct".

From what I gather, that match is the response to a request for more data needed for bootstrapping. If so, a client can use it to perform a DoS attack.

(Taking the perspective of an attacker, and assuming at this point it will address the problem of an honest node that is just dealing with problems)

An attacker could modify their code such that it continuously sends an AskBootstrapPart to keep the connection alive. A malicious actor could flood the bootstrap servers, cutting off availability to honest nodes.

Am I in the right ball-park?

Ben-PH avatar Jan 09 '23 13:01 Ben-PH

Yes, that's the problem, so the goal is to have a general bootstrap timeout. But we need to download up to 1TB of data (depending on the ledger size).

The goal is to distinguish someone who is looping from someone with a "reasonably" slow connection. There are multiple ways to do this. For example, we could compute the amount of new data sent each time to check that the bootstrap is actually making progress, or set a timeout relative to the current size of our ledger.
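
To make the "are we making progress" idea concrete, here is a minimal sketch of a per-connection guard. `ProgressGuard`, its fields, and the thresholds are hypothetical names invented for illustration, and it assumes the server can tell how many bytes of a response are new to the client. It only runs when a request is served, so a silent client would still need a separate read timeout.

```rust
use std::time::{Duration, Instant};

/// Hypothetical per-connection guard (not part of massa): the client must
/// receive at least `min_new_bytes` of *new* data in every `window`,
/// otherwise we treat it as looping and close the connection.
struct ProgressGuard {
    window: Duration,
    min_new_bytes: u64,
    window_start: Instant,
    new_bytes_in_window: u64,
}

impl ProgressGuard {
    fn new(window: Duration, min_new_bytes: u64, now: Instant) -> Self {
        Self { window, min_new_bytes, window_start: now, new_bytes_in_window: 0 }
    }

    /// Record one served request. `new_bytes` is how much of the response
    /// the client had not been sent before (0 for a repeated request).
    /// Returns `false` when the connection should be closed.
    fn record(&mut self, new_bytes: u64, now: Instant) -> bool {
        self.new_bytes_in_window = self.new_bytes_in_window.saturating_add(new_bytes);
        if now.duration_since(self.window_start) < self.window {
            return true; // window still open, no verdict yet
        }
        let made_progress = self.new_bytes_in_window >= self.min_new_bytes;
        // Start a fresh window for the next evaluation.
        self.window_start = now;
        self.new_bytes_in_window = 0;
        made_progress
    }
}
```

A slow but honest client keeps passing as long as each window contains some minimum of new data, while a client that re-requests the same part over and over accumulates zero new bytes and fails the first time a window closes.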

AurelienFT avatar Jan 09 '23 14:01 AurelienFT

Do we have any "institutional knowledge" on how to choose a threshold, such as existing tools, experience from watching the testnet, etc?

Ben-PH avatar Jan 10 '23 11:01 Ben-PH

Sadly, not at all. We are currently seeing some instability in bootstrap time on our nodes. I think you could start by doing some research on that, to enlighten us on "what's possible".

AurelienFT avatar Jan 10 '23 11:01 AurelienFT

My thinking at this point is to look at the logs of our bootstrap servers, put together something that filters the relevant lines, and start putting together some statistics.

Ben-PH avatar Jan 11 '23 13:01 Ben-PH

#3431

Ben-PH avatar Jan 16 '23 13:01 Ben-PH