trezor-firmware icon indicating copy to clipboard operation
trezor-firmware copied to clipboard

SignTx should contain the signed definition of the used coin/token

Open brianddk opened this issue 5 years ago • 15 comments

If I'm reading this right, the CoinInfo object is being built from a static table defined a compile-time.

What are the thoughts of allowing the CoinInfo object to be passed in (marshalled) as a protobuf to SignTx. This would allow new bitcoin forks or those testing new coins to self-add trezor support so long as the deviation from bitcoin signing protocols could be contained in the prevailing coins.json structures.

Likely some gotcha's that I haven't thought of, but I'm curious as to your take.

brianddk avatar Dec 12 '18 07:12 brianddk

We were already thinking about this, there is already some tooling in the trezor-common repo (where I will move the issue).

prusnak avatar Dec 12 '18 10:12 prusnak

Some details:

this could apply to Bitcoin-like coin definitions (apps.common.coininfo), Ethereum network definitions (apps.ethereum.networks), ERC-20 tokens (apps.ethereum.tokens). Out of those, Bitcoin-like are the most complex structure, whereas networks and tokens are just a couple data points.

Definitions passed from the host must be signed by SatoshiLabs, otherwise malware could claim that you are signing transfer of low-value token, while in fact sending a high-value token.

Draft of an idea:

  • create a protobuf message that copies the structure of CoinInfo / NetworkInfo / TokenInfo. This structure must contain cross-validating information, so e.g. in case of an ERC20 token, the chain_id and token contract address must be included.
  • pass this message alongside the corresponding command (GetAddress, SignTx, EthereumSignTx) somehow.
    • use it as a submessage directly -- but then Trezor would need to re-serialize it in order to validate the signature
    • send it as a bytes field and the signature as another bytes field. Trezor verifies the signature before attempting to decode protobuf.
    • have Trezor ask for the definition + signature in something like SignedTokenInfoRequest
  • check that the provided definition is (a) well signed, (b) internally correct, (c) matches the current request

The "main" coins should still be hard-coded in firmware. We definitely want to keep definitions for Bitcoin and its testnets, Ethereum and its testnets. We might want to be able to keep some definitions internally, e.g., to make a list of top 10 ERC20 tokens or something. If a request matches an internal definition, it must never use the externally provided one.

matejcik avatar May 20 '22 11:05 matejcik

  • pass this message alongside the corresponding command [...]

Another option is to have a flow that is similar to DoPreauthorized workflow. The host starts by sending a SignedTokenInfo message. Trezor checks the signature and if it's valid, then it temporarily adds the token info to the internal list of tokes. Trezor then responds with a SignedTokenInfoResponse message indicating that the signature is valid. The host then sends the actual command that uses the token (SignTx, GetAddress, etc.). After the command finishes the token is removed from the internal list. I can see some problems with this solution, like what if we need to combine several of these workflow altering commands (DoPreauthorized + SignedTokenInfo or multiple SignedTokenInfo)? That might get messy, so I am not very confident about this solution.

An alternative to this is that SignedTokenInfo would add the token info to the internal list temporarily until the next power-cycle, so not only for the next command but for all subsequent commands. This solution would require preallocating some area of memory which would store the data between workflows, similar to the session cache.

andrewkozlik avatar Jun 15 '22 12:06 andrewkozlik

We also need to consider the problem of TokenInfo revocation. For example in the future we may decide to delist a particular token, to decrease the maxfee_kb significantly or we simply need to correct a mistake in the coin definition. The obvious solution is to generate a new signing key, do a firmware update and reissue new signatures for all the coin definitions. That means Suite (and 3rd party software) need to maintain a list of signed coin definitions for each firmware version or request the set of signed coin definitions from our server.

I can think of a key update mechanism which would work without the need to update firmware. That way Suite could just maintain a list of the latest coin definitions. Such a key update system would be useful to have in other places as well.

andrewkozlik avatar Jun 15 '22 13:06 andrewkozlik

Let's make sure this is something we absolutely need, so we don't generate the effort just for the sake of being able to generating the effort. So far I don't see a strong reason for this for apps.common.coininfo or apps.ethereum.networks but it might make sense for apps.ethereum.tokens.

prusnak avatar Jun 15 '22 13:06 prusnak

If we only need to revoke the definition (and not the signing key), we can add a version counter to the TokenInfo, and have firmware updates bump the required minimum version. That way it is possible for old firmwares to use new definitions (unless under attack) and for Suite to maintain only one set of signed definitions.

matejcik avatar Jun 15 '22 13:06 matejcik

Let's make sure this is something we absolutely need

This is mainly part of the effort to free up flash space. tokens take up 77 kB, networks and coin_info have ~10kB each, which is not insignificant. Also tokens and networks are updated relatively frequently so being able to update without updating firmware is nice-to-have.

We might leave out coin_info in the end, just to make the attack surface smaller. In any case coin_info should be signed offline and this mechanism would be only used to offload the storage.

matejcik avatar Jun 15 '22 13:06 matejcik

The plan so far looks like this:

Firmware side

  1. All Bitcoin (and Bitcoin-like) coin definitions are staying in FW without changes.
  2. Ethereum networks and tokens are going to be cut out. Only some of the basic ones will stay (needs further discussion which ones are staying).
  3. Ethereum protobuf messages, that contain network(chain_id) field, will be extended by one optional field, something like definition (I need to find a better name for it).
  4. definition will be an embedded message:
    • containing 2 fields:
      1. optional bytes network_definition
      2. optional bytes token_definition
    • each field composed of:
      1. prefix:
        1. format version of the definition (UTF-8 string trzd + version number, padded with zeroes if shorter, 5 bytes)
        2. type of data (unsigned integer, 1 byte)
        3. data version of the definition (unsigned integer, 4 bytes)
        4. length of the encoded protobuf message - payload length in bytes (unsigned integer, 2 bytes)
      2. payload: serialized form of protobuf message EthereumNetworkInfo or EthereumTokenInfo (N bytes)
      3. suffix:
        1. length of the Merkle tree proof - number of hashes in the proof (unsigned integer, 1 byte)
        2. proof - individual hashes used to compute Merkle tree root hash (plain bits, N*32 bytes)
        3. signed Merkle tree root hash (plain bits, 64 bytes)
  5. Trezor will first check the format version, type of the data and data version (prefix), then the signature (suffix) and at the end it will deserialize the definitions themselves.
  6. Tests will use some pre-generated file with definitions, commited in git repo and for testing purposes there will be "debug" keys included.
  7. We will use versioning of the coin definitions to be able to restrict FW from using some older definitions (in case that we need to modify definitions somehow and we want FW to use at least this modified version or newer). As a data version I would probably use unix timestamp for the sake of simplicity - no need to remember last version number when generating new definitions, plus we have an info of when it was generated. For format version we can simply use some integer counter.

Backend

  1. Signed definitions will be generated by our script.
  2. As a data source we will use coingecko API (we need to consult the best way how to do it with shopsys) and networks and tokens repositories (as before).
  3. Generated data will be stored as a binary data in the following file structure:
definitions-latest/
├── by_chain_id/
│   ├── "CHAIN_ID"/
│   │   ├── network.dat
│   │   ├── token_"TOKEN_ID".dat
│   │   ├── token_"TOKEN_ID".dat
│   │   ...
│   ├── "CHAIN_ID"/
│   │   ├── network.dat
│   │   ├── token_"TOKEN_ID".dat
│   │   ├── token_"TOKEN_ID".dat
│   │   ...
│   ...
├── by_slip44/
    ├── "SLIP44_ID"/
    │   └── network.dat
    ├── "SLIP44_ID"/
    │   └── network.dat
    ...

Clients

  1. Connect has to implement this functionality on their side to provide this functionality transparently.
  2. QA has to test this modified protobuf messages workflow with Ethereum wallets - especially those who don't use Connect.

marnova avatar Jun 21 '22 09:06 marnova

definitions_"VERSION"/

At the very least, we need definitions-latest. More generally, I'm not convinced that we want to be publishing old versions of the definitions at all. Old firmwares should be able to use new definitions -- the version is "version of data", not "version of format", which we should document explicitly somehow. And we do not want to be keeping old data versions around.

matejcik avatar Jun 21 '22 09:06 matejcik

I agree, we don't need to keep older versions, the "VERSION" suffix was just for reference of what version it is. But we don't need that either.

I will edit my comment to reflect your notes.

marnova avatar Jun 21 '22 09:06 marnova

required bytes network_definition

I think this should also be optional

The version number denotes the "version of data", not the "version of format".

Maybe we want both? So the prefix would be "magic" (version of the format), "type", "version" (version of the data)

prusnak avatar Jun 21 '22 10:06 prusnak

I think this should also be optional

Good point. I originally thought that it makes no sense to omit network_definition while providing token_definition -- but it does make sense if the network is Ethereum which will be built-in.

matejcik avatar Jun 21 '22 10:06 matejcik

Maybe we want both? So the prefix would be "magic" (version of the format), "type", "version" (version of the data)

Ok, could be useful if we would need to change the format somehow in the future.

marnova avatar Jun 21 '22 10:06 marnova

TODO: We also need to formulate a response plan in case the signing key gets compromised.

andrewkozlik avatar Jun 21 '22 11:06 andrewkozlik

  • [x] define format for definitions
  • [x] modify protobufs
  • [x] trezorctl works with signed definitions
  • [x] determine which defs will stay built-in
  • [x] update built-in defs
  • [x] update existing and add new tests

FW (TT)

  • [x] parse and verify received definitions
  • [x] modify FW code to use signed definitions
  • [x] modify release scripts
  • [x] update tests

FW (T1)

  • [x] parse and verify received definitions
  • [x] update handlers code for all Ethereum messages

Definitions

  • [x] discuss CoinGecko API with ShopSys
  • [x] find out which sources we can use to generate defs
  • [x] generate definitions
  • [x] check definitions for changes
  • [x] encode and sign definitions
  • [x] use Merkle tree
  • [x] generate binary definitions also for built-in definitions
  • [x] decrease the size of the encoded definition blobs
  • [x] definitions in zip
  • [x] zip tools, changes in tests
  • [ ] make the definitions available on data.trezor.io

Connect

  • [x] issue to make changes on their side - https://github.com/trezor/trezor-suite/issues/6442
  • [x] issue to MetaMask - https://community.metamask.io/t/trezor-breaking-change-ethereum-definitions-networks-and-tokens/23622

marnova avatar Sep 08 '22 12:09 marnova

We've found out, that we have a size issue with the generated definitions. Currently all the generated definitions together have around 61MB, zipped it's 13MB. We will try to downsize them.

marnova avatar Oct 19 '22 11:10 marnova