nimbus-eth1 icon indicating copy to clipboard operation
nimbus-eth1 copied to clipboard

RLPx out of range crash triggered by message in auth handshake

Open jlokier opened this issue 2 years ago • 3 comments

There is a decoding vulnerability to untrusted network data in the RLPx auth code.

While connected to Ropsten test network, Nimbus-eth1 raised this error in rlpxAccess/decodeAuthMessageEip8 and took down the process.

It would be appropriate to go through and systematically check for similar length trust issues in all input in RLPx.

The Geth version shown is interesting. The hex part of clientId is not a valid commit in the Geth repo. Is it a branched old version of Geth, in which case perhaps it's correct protocol we are incorrectly decoding? Or is it a fake version, intentionally sending out adversarial messages?

(Ignore the eth/65 mention, this is running a small patch that doesn't touch any relevant code.

TRC 2021-07-22 22:14:23.044+01:00 Received Hello                             topics="rlpx" tid=909268 file=rlpx.nim:1208 version=5 id=Geth/v1.0.0-stable-fd84c648/linux-amd64/go1.13.4
TRC 2021-07-22 22:14:23.044+01:00 devp2p handshake completed                 topics="rlpx" tid=909268 file=rlpx.nim:1219 peer=Node[18.167.87.121:49934] clientId=Geth/v1.0.0-stable-fd84c648/linux-amd64/go1.13.4
TRC 2021-07-22 22:14:23.044+01:00 DevP2P local and remote protocols          topics="rlpx" tid=909268 file=rlpx.nim:209 local=[eth/65] remote="[eth/63, eth/64, eth/65]"
TRC 2021-07-22 22:14:23.045+01:00 >> Sending eth.Status (0x00) [eth/65]      tid=909268 file=eth65.nim:49 peer=Node[18.167.87.121:49934] td=8618164 bestHash=ecd382aade186e4fea76c3a56e25563a1f7515a724f93e49a4b904bc4649a917 networkId=3 genesis=41941023680923e0fe4d74a34bdac8141f2540e3ae90623718e47d66d1ca4a2d forkHash=63760190 forkNext=1700000
/home/jamie/Status/nimbus-eth1/nimbus/nimbus.nim(226) nimbus
/home/jamie/Status/nimbus-eth1/nimbus/nimbus.nim(188) process
/home/jamie/Status/nimbus-eth1/vendor/nim-chronos/chronos/asyncloop.nim(279) poll
/home/jamie/Status/nimbus-eth1/vendor/nim-chronos/chronos/asyncmacro2.nim(74) colonanonymous
/home/jamie/Status/nimbus-eth1/vendor/nim-eth/eth/p2p/rlpx.nim(1170) rlpxAccept
/home/jamie/Status/nimbus-eth1/vendor/nim-eth/eth/p2p/auth.nim(469) decodeAuthMessage
/home/jamie/Status/nimbus-eth1/vendor/nim-eth/eth/p2p/auth.nim(370) decodeAuthMessageEip8
/home/jamie/Status/nimbus-eth1/vendor/nimbus-build-system/vendor/Nim/lib/system/chcks.nim(32) raiseRangeErrorI
/home/jamie/Status/nimbus-eth1/vendor/nimbus-build-system/vendor/Nim/lib/system/fatal.nim(49) sysFatal
[[reraised from:
/home/jamie/Status/nimbus-eth1/nimbus/nimbus.nim(226) nimbus
/home/jamie/Status/nimbus-eth1/nimbus/nimbus.nim(188) process
/home/jamie/Status/nimbus-eth1/vendor/nim-chronos/chronos/asyncloop.nim(279) poll
/home/jamie/Status/nimbus-eth1/vendor/nim-chronos/chronos/asyncmacro2.nim(74) colonanonymous
/home/jamie/Status/nimbus-eth1/vendor/nim-chronos/chronos/asyncmacro2.nim(252) rlpxAccept
]]
[[reraised from:
/home/jamie/Status/nimbus-eth1/nimbus/nimbus.nim(226) nimbus
/home/jamie/Status/nimbus-eth1/nimbus/nimbus.nim(188) process
/home/jamie/Status/nimbus-eth1/vendor/nim-chronos/chronos/asyncloop.nim(279) poll
/home/jamie/Status/nimbus-eth1/vendor/nim-chronos/chronos/asyncmacro2.nim(96) colonanonymous
]]
Error: unhandled exception: value out of range: -21 notin 0 .. 9223372036854775807 [RangeError]

jlokier avatar Jul 23 '21 18:07 jlokier

Can confirm thal on my Goerli block collection test. Reproducible event:

/home/jordan/nimbus-eth1/nimbus/nimbus.nim(264) nimbus /home/jordan/nimbus-eth1/nimbus/nimbus.nim(226) process /home/jordan/nimbus-eth1/vendor/nim-chronos/chronos/asyncloop.nim(279) poll /home/jordan/nimbus-eth1/vendor/nim-chronos/chronos/asyncmacro2.nim(74) colonanonymous /home/jordan/nimbus-eth1/vendor/nim-eth/eth/p2p/rlpx.nim(619) dispatchMessages /home/jordan/nimbus-eth1/vendor/nimbus-build-system/vendor/Nim/lib/system/chcks.nim(23) raiseIndexError2 /home/jordan/nimbus-eth1/vendor/nimbus-build-system/vendor/Nim/lib/system/fatal.nim(49) sysFatal [[reraised from: /home/jordan/nimbus-eth1/nimbus/nimbus.nim(264) nimbus /home/jordan/nimbus-eth1/nimbus/nimbus.nim(226) process /home/jordan/nimbus-eth1/vendor/nim-chronos/chronos/asyncloop.nim(279) poll /home/jordan/nimbus-eth1/vendor/nim-chronos/chronos/asyncmacro2.nim(96) colonanonymous ]] Error: unhandled exception: index 33 not in 0 .. 32 [IndexError]

with always the same index bat varying time until the error occurs (best run was to collect 500k blocks which was enough for my replay.)

mjfh avatar Jul 23 '21 21:07 mjfh

Can confirm

No, it's not the same issue. What you've reported there is a different, unrelated bug also in RLPx.

Look at the exception: It's an IndexError rather than a RangeError, and it's raised in a different area of the code, which is shown in the stacktrace: rlpx.nim(619) dispatchMessages versus rlpx.nim(1170) rlpxAccept.

These stacktrace may be confusing if you're not used to Chronos. It's good to ignore the [[reraised from:...]] section. It's just an artifact of how exceptions are processed in a proc with pragma {.async.}.

It's a useful report though, thanks, so I've filed it as issue #767.

jlokier avatar Jul 24 '21 02:07 jlokier

Looks like a check is needed on the input versus that eciesOverheadLength. This is used on some other locations too, and yes, in general the code will need systematically checking on input sizes and such. I don't think in the past much attention has gone to this in the eth1 code.

kdeme avatar Jul 26 '21 07:07 kdeme