eth.rb icon indicating copy to clipboard operation
eth.rb copied to clipboard

Should transaction decoding permit leading `\x00`?

Open RogerPodacter opened this issue 1 year ago • 2 comments

Currently Eth::Tx::Eip1559.decode has this: value = Util.deserialize_big_endian_to_int tx[6]

And deserialize_big_endian_to_int strips leading \x00 from the string before converting to an integer

def deserialize_big_endian_to_int(str)
  Rlp::Sedes.big_endian_int.deserialize str.sub(/\A(\x00)+/, "")
end

This behavior means that when we RLP-decode the transaction we end up interpreting \x00 as 0 because:

Eth::Util.deserialize_big_endian_to_int("\x00") == 0

But I don't think this is correct because the RLP-encoding for 0 is the empty string, which Eth::Util.serialize_int_to_big_endian handles correctly:

Eth::Util.serialize_int_to_big_endian(0) == ''

This means we can't "round trip":

Eth::Util.serialize_int_to_big_endian(Eth::Util.deserialize_big_endian_to_int(bytes)) != bytes.

I don't think this is necessarily an issue with deserialize_big_endian_to_int but rather that RLP decoding should use a different method for converting integers. Something like:

def deserialize_rlp_int(hex_string)
  bytes = Eth::Util.hex_to_bin(hex_string)
  
  if bytes.starts_with?("\x00")
    raise InvalidRlpInt, "Invalid RLP integer #{hex_string}"
  end
  
  Eth::Util.deserialize_big_endian_to_int(bytes)
end

What do you think?

RogerPodacter avatar Oct 30 '24 13:10 RogerPodacter

Can you give me an example that fails for you so that I can build a test case to fix this issue?

q9f avatar Dec 17 '24 13:12 q9f

An example: when decoding a 1559 transaction you extract the priority fee here:

priority_fee = Util.deserialize_big_endian_to_int tx[2]

Suppose tx[2] == "\x00" (Or "\x00\x01")

Expected: This should raise an exception because \x00 is not a valid RLP encoding for any integer and therefore the transaction has been improperly encoded.

Actual: The code interprets \x00 as 0 and no exception is raised.

This is a problem wherever RLP is decoded and so ideally there would be a method (like deserialize_rlp_int described above) to fix this across the board.

RogerPodacter avatar Dec 17 '24 22:12 RogerPodacter