nim-libp2p icon indicating copy to clipboard operation
nim-libp2p copied to clipboard

[SEC] libp2p - terminate connection if securedHandler fails

Open tintinweb opened this issue 4 years ago • 5 comments

Description

Performing an invalid handshake hinting a zero-length noise packet results in the connection to hang. The connection is not closed and there does not seem to be a timeout.

It seems like failing to handshake /noise is treated as a recoverable error while it should be fatal. see line 371 below.

⇒  python eth2node.py
=> '\x13/multistream/1.0.0\n\x07/noise\n'
<= '\x13/multistream/1.0.0\n'

=> '\x00\x00'
<= '\x07/noise\n'
/noise

=> '\x13/multistream/1.0.0\n'
=> '\x07/noise\n'
=> '\x03ls\n'

⇒  build/beacon_node --network=altona --log-level="WARN; TRACE:networking" --data-dir=build/data/shared_altona_0 --dump --web3-url=wss://goerli.infura.io/ws/v3/6224f3c792cc443fafb64e70a98f871e --tcp-port=9000 --udp-port=9000 --metrics --metrics-port=8008 --rpc --rpc-port=9190 --help

...

TRC 2020-07-22 13:27:12+02:00 upgrading incoming connection              topics="switch" tid=9594725 file=switch.nim:345 conn= oid=5f182290641ecd6028e91e73
TRC 2020-07-22 13:27:12+02:00 initiating handshake                       topics="multistream" tid=9594725 file=multistream.nim:48 codec="\x13/multistream/1.0.0\n"
TRC 2020-07-22 13:27:12+02:00 registering proto handler                  topics="switch" tid=9594725 file=multistream.nim:179 codec=/noise
TRC 2020-07-22 13:27:12+02:00 handle: starting multistream handling      topics="multistream" tid=9594725 file=multistream.nim:115
TRC 2020-07-22 13:27:12+02:00 handle: got request for                    topics="multistream" tid=9594725 file=multistream.nim:121 ms=/noise
TRC 2020-07-22 13:27:12+02:00 found handler for                          topics="multistream" tid=9594725 file=multistream.nim:143 protocol=/noise
TRC 2020-07-22 13:27:12+02:00 Securing connection                        topics="switch" tid=9594725 file=switch.nim:354 oid=5f182290641ecd6028e91e73
TRC 2020-07-22 13:27:12+02:00 Starting Noise handshake                   topics="noise" tid=9594725 file=noise.nim:424 initiator=false peer=
TRC 2020-07-22 13:27:12+02:00 mixHash                                    topics="noise" tid=9594725 file=noise.nim:161 hash=f3d15e6108ed...2e0958dc002d
TRC 2020-07-22 13:27:12+02:00 receiveHSMessage                           topics="noise" tid=9594725 file=noise.nim:270 size=0
TRC 2020-07-22 13:27:12+02:00 noise read e                               topics="noise" tid=9594725 file=noise.nim:237 size=0
DBG 2020-07-22 13:27:12+02:00 ending secured handler                     topics="switch" tid=9594725 file=switch.nim:375 err="Noise E, expected more data"
TRC 2020-07-22 13:27:12+02:00 leaving multistream loop                   topics="multistream" tid=9594725 file=multistream.nim:156

Exploit Scenario

Failing to upgrade to a secure connection leaves the connection unencrypted. In our tests, the connection appears to hang while it might still be in another select loop. The risk is that a connection might be continued without ensuring that transport security was properly initialized and there is risk that nodes might use the hanging connection for resource exhaustion attacks.

Note: the muxer is only attached if the security protocol returns success. Additional commands will not be available until a secured transport was established.

Mitigation Recommendation

  • terminate the connection on protocol errors. consider adjusting the clients score on foul play.

Refrences

https://github.com/status-im/nim-libp2p/blob/b8b0a2b4bce8448b3aabc67fda96534124073c1c/libp2p/switch.nim#L345-L379

tintinweb avatar Jul 22 '20 13:07 tintinweb

@dryajov I need your help with this one actually since you wrote most of the code. I cannot see how this is happening. The above linked return will not happen actually, what happens is an exception, the exception will be handled (inside switch.nim / secureHandler), and that flow is done... the previous multi stream flow is also done (return in ms handler) after the secureHandler exits..., and this will go back into switch->start->handler which has a finally clause with conn.close() Am I missing something?

Would help to have the test scenario scripts.

sinkingsugar avatar Jul 27 '20 06:07 sinkingsugar

The above log shows what I wrote above btw. One point I don't get is also how to trigger this:

Failing to upgrade to a secure connection leaves the connection unencrypted. In our tests, the connection appears to hang while it might still be in another select loop

sinkingsugar avatar Jul 27 '20 06:07 sinkingsugar

I cannot see how this is happening...

@sinkingsugar yes, I believe you are correct. It is unfortunate however that exceptions make this flow a lot more complicated than what it needs be.

@tintinweb To make reasoning about the flow a bit easier, the rule of thumb is to follow the exception flow to see if a handler higher up will do the required cleanup/closing/exit/etc... Ideally, this would have been enforced through the {.raises: [...].} pragma, which enables checked exceptions in Nim, but it currently doesn't work with our async subsystem (chronos).

The lack of support for checked exceptions with chronos might in fact be a security issues in and out of itself - something to consider perhaps.

dryajov avatar Jul 27 '20 16:07 dryajov

Update: This issue was kinda invalid back then but the error handling behavior has also changed in the meantime, for the better (altho not final) https://github.com/status-im/nim-libp2p/blob/96ed9fbe85de18bbb13dfe8cf16e79fe9c5c0906/libp2p/protocols/secure/secure.nim#L133-L140 During handshakes now we completely disconnect if something wrong happens. This applies also to the deprecated Secio as it's on the base class.

sinkingsugar avatar Nov 11 '20 08:11 sinkingsugar

This has been addressed but not properly reviewed by auditors. I'm leaving this around for a future round of audits.

dryajov avatar Jan 26 '21 15:01 dryajov

I assume this will never be reviewed by auditors

Menduist avatar Oct 12 '23 09:10 Menduist