hashes of strings are computed differently at compile time and runtime

Open ptrcarta opened this issue 3 years ago • 2 comments

Version Information

  • vyper Version (output of vyper --version): 0.3.6+commit.4a2124d0
  • OS: osx
  • Python Version (output of python --version): Python 3.9.13

Folded (compile-time) hashes are computed on UTF-8 encoded strings, whereas runtime hashes are computed on Latin-1 encoded strings.

# @version 0.3.6

@external
@view
def compile_hash() -> bytes32:
    return keccak256('è')

@external
@view
def runtime_hash() -> bytes32:
    str: String[1] = 'è'
    return keccak256(str)

Here runtime_hash() and compile_hash() should return the same bytes32 value, but they differ. runtime_hash() returns the hash of 0xe8 (ord('è'), the Latin-1 byte), while compile_hash() returns the hash of 0xc3a8 ('è'.encode(), the UTF-8 bytes).
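To make the byte-level difference concrete, here is a minimal Python sketch of the two encodings. It is only an illustration, not Vyper's code. Python's standard library has no keccak256 (hashlib.sha3_256 uses different padding and is not the same function), so hashlib.sha256 is used purely as a stand-in to show that different input bytes give different digests.

```python
import hashlib

s = "è"

# Latin-1 maps each code point below 256 to a single byte: ord('è') == 0xe8.
latin1_bytes = s.encode("latin-1")
# UTF-8 encodes the same character as two bytes: 0xc3 0xa8.
utf8_bytes = s.encode("utf-8")

print(latin1_bytes.hex())  # e8
print(utf8_bytes.hex())    # c3a8

# Stand-in hash (NOT keccak256): the inputs differ, so the digests differ too.
latin1_digest = hashlib.sha256(latin1_bytes).hexdigest()
utf8_digest = hashlib.sha256(utf8_bytes).hexdigest()
print(latin1_digest != utf8_digest)  # True
```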

How can it be fixed?

Encode strings in keccak256 folding the same way they are represented at runtime.

ptrcarta, Sep 12 '22 15:09

The culprit is string_to_bytes, which does not perform a correct UTF-8 encoding: https://github.com/vyperlang/vyper/blob/be2b7f427bf980a0baf52cdd010d83231824ad3a/vyper/utils.py#L125-L133
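As a rough sketch of the kind of encoding that would produce the runtime bytes seen in this issue, the hypothetical helper below emits one byte per code point. The name per_code_point_bytes and its body are assumptions made for illustration and are not copied from vyper/utils.py. Such an encoder agrees with UTF-8 for ASCII input but matches Latin-1, not UTF-8, for code points 128 to 255.

```python
def per_code_point_bytes(s: str) -> bytes:
    """Hypothetical encoder: one byte per code point (illustration only)."""
    out = bytearray()
    for ch in s:
        code_point = ord(ch)
        if code_point >= 256:
            # A single byte cannot hold this code point.
            raise ValueError(f"cannot encode {ch!r} in one byte")
        out.append(code_point)
    return bytes(out)


# ASCII: identical to UTF-8.
print(per_code_point_bytes("abc") == "abc".encode("utf-8"))  # True
# Non-ASCII below 256: matches Latin-1 and differs from UTF-8.
print(per_code_point_bytes("è") == "è".encode("latin-1"))    # True
print(per_code_point_bytes("è") == "è".encode("utf-8"))      # False
```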

charles-cooper, Sep 12 '22 15:09

Note: the same issue affects the builtin sha256.

trocher, Dec 06 '23 13:12