move icon indicating copy to clipboard operation
move copied to clipboard

[Accepted feature] native `u16`, `u32`, and `u256` types

Open sblackshear opened this issue 2 years ago • 2 comments

Rationale

  • The type u256 is important for interoperability with EVM chains. For example, NFTs are usually represented in 256 bits. The type is also important for Move on EVM. To have an idiomatic experience of using Move to create EVM contracts, u256 is essential. Moreover, u256 has the property to be large enough to be able to represent a safe hash for any representable value, which makes it a useful primitive for design patterns based on cryptographic concepts.
  • For the other types: Move has u8 and u64, but is missing u16 and u32. It makes sense to fill in these gaps, especially given that BCS uses u32's in some areas (e.g., vector indexes) where we also probably want u32's in Move

Design

Natively support the new integer types. All operations and notations available for the existing integer types (equality, comparison, casts, etc.) should be supported.

For u256 specifically, we will need to figure out how to represent the type in serialized form since it is not supported by BCS, Rust, or serde-reflection. Extending BCS is one option, but there may be easier choices.

Challenges

  1. We need to add support for u256 to BCS. Since the original bcs repo is under the original Diem organization, we may need to fork it.
  2. This will likely require a new file format version. We need to check whether we have other breaking changes coming and decide how to coordinate those.
  3. The addition of the new integer types will break many existing core and vm APIs. We need to assess the impacts of this on clients and come up with a recommended upgrade protocol for them.

Alternatives considered

  • Move frameworks that want u256 for EVM compatibility or other reasons can implement it in pure Move (with or without native function support), as Starcoin has done here. We could also consider adding this to the Move stdlib and transparently compiling +, - etc on u256 to use this module. This would keep the rather EVM-specific u256 out of the platform-agnostic bytecode language.

  • Support u256, but do not add u16 and u32 at this time

sblackshear avatar Sep 08 '22 02:09 sblackshear

Perhaps we should add some section about "Challenges" to the template. Here it appears to be that a local change in the Move repo may work out fine, but then break many downstream projects. We need to find some model to deal with this...

wrwg avatar Sep 08 '22 05:09 wrwg

@wrwg @sblackshear I added a challenge section for this proposal. Feel free to add more if you spot anything something

vgao1996 avatar Sep 12 '22 18:09 vgao1996

Spoke to @wrwg who said increasing the size of ValueImpl enum is a concern.

oxade avatar Sep 29 '22 15:09 oxade

We need to add support for u256 to BCS. Since the original bcs repo is under the original Diem organization, we may need to fork it.

I don't think we need to fork here, and the reasoning is the same for why we did not need to fork BCS to add the address type (which is a native Move type that is very much first-class, but is not a BCS type).

Conveying some additional BCS knowledge from @bmwill

Serde works by mapping types that you want to ser/de into the "serde object model" and then having Serializers and Deserializers that use that object model and translate it into a particular format (toml, bcs, bincode, json, ect)

  • primitive integers (x8, x16, x32, x64, x128)
  • bool
  • structs
  • tuples (sequences of non-uniform types)
  • Sequences of uniform types
  • Maps

with serde, a particular type T impls Serialize and Deserialize by mapping itself to that object model

when implementing Serialize and Deserialize you do not know which format is being used. This lets a type "describe itself" and allow for it to be ser/de to a variety of data formats

u256 is not a primitive data type in Rust, and is not a type in the serde object model. So you cannot just "add" u256 to the bcs Serializer and Deserializer without ceasing to also use the serde framework as a whole.

Since u256 is not a primitive type we'll have to define a type for that, call it U256 . that type will need to be defined somewhere and have various functions and traits implemented on it (eg Add, Sub, ect). From there all we need to do is to implement Serialize and Deserialize for U256 and decide how to map it into the Serde Object model.

</end @bmwill knowledge, back to Sam speaking>

This approach is (I believe) exactly what we've done for address, and it worked quite well there.

sblackshear avatar Oct 01 '22 16:10 sblackshear

Thanks @sblackshear ++ On the topic of increasing the size of ValueImpl enum, this is low concern now as u256 is ~ 32bytes, while ValueImpl is about 40bytes (due to RCs?). So size shouldn't change much.

oxade avatar Oct 03 '22 14:10 oxade

Yeah let's probably not worry about the size of the ValueImpl right now. u256 won't make it bigger.

vgao1996 avatar Oct 03 '22 16:10 vgao1996

Wow. Why the heck is the size so large? It's in fact 48 bytes!

#[test]
fn sizes() {
    let size = mem::size_of::<ValueImpl>();
    assert_eq!(size, 48);
    let size = mem::size_of::<Container>();
    assert_eq!(size, 16);
    let size = mem::size_of::<IndexedRef>();
    assert_eq!(size, 40);
    let size = mem::size_of::<ContainerRef>();
    assert_eq!(size, 32);
}

The largest single item is currently an IndexedRef with 40 bytes. I'm pretty sure this can be reduced down to 32, so the enum ends with 40. But I generally worry that our value size is too large for modern processor architectures. 128 bit is probably fine because it can be mapped to two registers, but 256 is out of scope.

wrwg avatar Oct 04 '22 00:10 wrwg

Is this a good beginner issue?

AlbertSu123 avatar Oct 04 '22 21:10 AlbertSu123

@AlbertSu123 I believe that @oxade is currently working on implementing this feature.

bmwill avatar Oct 04 '22 23:10 bmwill

PR is ready here: https://github.com/move-language/move/pull/547

oxade avatar Oct 18 '22 01:10 oxade

Merged Additional minor item is syntax highlighting for the analyzer extension but this is not a blocker https://github.com/damirka/vscode-move-syntax/pull/4

oxade avatar Oct 27 '22 01:10 oxade