move
move copied to clipboard
[Accepted feature] native `u16`, `u32`, and `u256` types
Rationale
- The type u256 is important for interoperability with EVM chains. For example, NFTs are usually represented in 256 bits. The type is also important for Move on EVM. To have an idiomatic experience of using Move to create EVM contracts, u256 is essential. Moreover, u256 has the property to be large enough to be able to represent a safe hash for any representable value, which makes it a useful primitive for design patterns based on cryptographic concepts.
- For the other types: Move has
u8
andu64
, but is missingu16
andu32
. It makes sense to fill in these gaps, especially given that BCS usesu32
's in some areas (e.g., vector indexes) where we also probably wantu32
's in Move
Design
Natively support the new integer types. All operations and notations available for the existing integer types (equality, comparison, casts, etc.) should be supported.
For u256
specifically, we will need to figure out how to represent the type in serialized form since it is not supported by BCS, Rust, or serde-reflection
. Extending BCS is one option, but there may be easier choices.
Challenges
- We need to add support for u256 to BCS. Since the original bcs repo is under the original Diem organization, we may need to fork it.
- This will likely require a new file format version. We need to check whether we have other breaking changes coming and decide how to coordinate those.
- The addition of the new integer types will break many existing core and vm APIs. We need to assess the impacts of this on clients and come up with a recommended upgrade protocol for them.
Alternatives considered
-
Move frameworks that want u256 for EVM compatibility or other reasons can implement it in pure Move (with or without native function support), as Starcoin has done here. We could also consider adding this to the Move stdlib and transparently compiling +, - etc on u256 to use this module. This would keep the rather EVM-specific u256 out of the platform-agnostic bytecode language.
-
Support
u256
, but do not addu16
andu32
at this time
Perhaps we should add some section about "Challenges" to the template. Here it appears to be that a local change in the Move repo may work out fine, but then break many downstream projects. We need to find some model to deal with this...
@wrwg @sblackshear I added a challenge section for this proposal. Feel free to add more if you spot anything something
Spoke to @wrwg who said increasing the size of ValueImpl enum is a concern.
We need to add support for u256 to BCS. Since the original bcs repo is under the original Diem organization, we may need to fork it.
I don't think we need to fork here, and the reasoning is the same for why we did not need to fork BCS to add the address
type (which is a native Move type that is very much first-class, but is not a BCS type).
Conveying some additional BCS knowledge from @bmwill
Serde works by mapping types that you want to ser/de into the "serde object model" and then having Serializers and Deserializers that use that object model and translate it into a particular format (toml, bcs, bincode, json, ect)
- primitive integers (x8, x16, x32, x64, x128)
- bool
- structs
- tuples (sequences of non-uniform types)
- Sequences of uniform types
- Maps
with serde, a particular type T impls Serialize and Deserialize by mapping itself to that object model
when implementing Serialize and Deserialize you do not know which format is being used. This lets a type "describe itself" and allow for it to be ser/de to a variety of data formats
u256 is not a primitive data type in Rust, and is not a type in the serde object model. So you cannot just "add" u256 to the bcs Serializer and Deserializer without ceasing to also use the serde framework as a whole.
Since u256 is not a primitive type we'll have to define a type for that, call it U256
. that type will need to be defined somewhere and have various functions and traits implemented on it (eg Add, Sub, ect). From there all we need to do is to implement Serialize and Deserialize for U256 and decide how to map it into the Serde Object model.
</end @bmwill knowledge, back to Sam speaking>
This approach is (I believe) exactly what we've done for address
, and it worked quite well there.
Thanks @sblackshear ++ On the topic of increasing the size of ValueImpl enum, this is low concern now as u256 is ~ 32bytes, while ValueImpl is about 40bytes (due to RCs?). So size shouldn't change much.
Yeah let's probably not worry about the size of the ValueImpl right now. u256 won't make it bigger.
Wow. Why the heck is the size so large? It's in fact 48 bytes!
#[test]
fn sizes() {
let size = mem::size_of::<ValueImpl>();
assert_eq!(size, 48);
let size = mem::size_of::<Container>();
assert_eq!(size, 16);
let size = mem::size_of::<IndexedRef>();
assert_eq!(size, 40);
let size = mem::size_of::<ContainerRef>();
assert_eq!(size, 32);
}
The largest single item is currently an IndexedRef with 40 bytes. I'm pretty sure this can be reduced down to 32, so the enum ends with 40. But I generally worry that our value size is too large for modern processor architectures. 128 bit is probably fine because it can be mapped to two registers, but 256 is out of scope.
Is this a good beginner issue?
@AlbertSu123 I believe that @oxade is currently working on implementing this feature.
PR is ready here: https://github.com/move-language/move/pull/547
Merged Additional minor item is syntax highlighting for the analyzer extension but this is not a blocker https://github.com/damirka/vscode-move-syntax/pull/4