Dojo storage layout
Currently, the Dojo storage layout can be seen as serial. The fields of a model struct are stored in consecutive order, with data being packed at the bit level.
struct MyModel {
a: u32,
b: u32,
c: felt252,
}
// Serialized:
[a, b, c]
// Packed
[packed(ab), packed(c)]
This layout (let's call it serial) has the benefit of being very compact, and can easily support the "one" felt storage model.
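To make the packing idea concrete, here is a minimal Python model of bit-level packing. It is a sketch, not Dojo's actual implementation: it assumes each field has a known bit width (32 for u32, 251 for felt252) and that a storage word holds at most 251 bits. The names `pack`, `unpack` and `WORD_BITS` are hypothetical.

```python
WORD_BITS = 251  # assumption: usable bits per storage felt


def pack(values, widths):
    """Pack values into as few words as possible, strictly in field order."""
    words, current, used = [], 0, 0
    for value, width in zip(values, widths):
        if value < 0 or value >= (1 << width):
            raise ValueError(f"{value} does not fit in {width} bits")
        # Start a new word when the next field would overflow the current one.
        if used and used + width > WORD_BITS:
            words.append(current)
            current, used = 0, 0
        current |= value << used
        used += width
    if used:
        words.append(current)
    return words


def unpack(words, widths):
    """Inverse of pack: walks the same widths in the same order."""
    values, index, used = [], 0, 0
    for width in widths:
        if used and used + width > WORD_BITS:
            index += 1
            used = 0
        values.append((words[index] >> used) & ((1 << width) - 1))
        used += width
    return values


# MyModel { a: u32, b: u32, c: felt252 }
widths = [32, 32, 251]
values = [7, 9, 12345]
words = pack(values, widths)  # a and b share one word, c gets its own
```

Decoding depends entirely on the order and widths of the fields, which is why reordering or removing a field breaks data that was already written.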
However, the downsides are:
- Not resistant to upgrades (field reordering, field removal): it's an append-only model.
- Dynamic types like arrays are forced to be "fixed", which induces other constraints.
On Starknet, the storage uses the selector of each field name to ensure each field starts at a different slot in the storage.
The proposal here is to add another layout to the Dojo database: hash.
By leveraging the hashing function, we could replicate how Starknet storage works, without compromising the composability of Dojo database.
#[dojo::model(layout: 'hash')]
struct MyModel {
a: u32,
b: ByteArray,
c: Array<felt252>,
}
// Serialized:
[a, b, c]
// Packed, but each one is packed individually.
[packed(a), packed(b), packed(c)]
Doing so, we gain two major things:
- Upgradeability resiliency (fields can be removed, re-ordered).
- Dynamic types are now supported.
The downside of this approach is that we pay extra hashing for every single field, regardless of its size.
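The per-field addressing can be modelled in a few lines of Python. This is only a sketch under assumptions: SHA-256 stands in for the Starknet hash functions, a dict stands in for contract storage, and `selector`, `field_slot`, `write_model` and `read_fields` are hypothetical names, not the dojo-core API.

```python
import hashlib


def selector(name: str) -> int:
    """Stand-in for a field-name selector (SHA-256 here, not the Starknet hash)."""
    return int.from_bytes(hashlib.sha256(name.encode()).digest(), "big")


def field_slot(model_base: int, field: str) -> int:
    """Derive a per-field slot from the model base address and the field selector."""
    data = model_base.to_bytes(32, "big") + selector(field).to_bytes(32, "big")
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def write_model(storage: dict, model_base: int, fields: dict) -> None:
    # One hash per field: this is the extra cost of the hash layout.
    for name, value in fields.items():
        storage[field_slot(model_base, name)] = value


def read_fields(storage: dict, model_base: int, names: list) -> dict:
    # Fields are looked up by name, so order and missing fields do not matter.
    return {name: storage.get(field_slot(model_base, name)) for name in names}


storage = {}
base = selector("MyModel")  # hypothetical model base address
write_model(storage, base, {"a": 1, "b": "hello", "c": [10, 20]})
reordered = read_fields(storage, base, ["c", "a"])  # subset, different order
```

Because each field is located by its own name, reading a subset of fields in a different order still returns the right values, and a field that was never written simply comes back empty.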
The proposal here is to have both layouts available in Dojo. The default layout would be switched to hash, to ensure a smooth experience with all dynamic types. The user can then optimize the storage by switching to the serial layout.
Implications:
This change will mostly impact dojo-core, dojo-lang and torii.
- dojo-core: The world API will be updated to support those two layouts. In the case of the hash layout, the selector of each field will be passed to indicate the order in which the fields must be gathered. Packing is also something we can still support.
- dojo-lang: The macros for get! and set! must now distinguish the model layout. Based on this, we can generate the appropriate implementation to retrieve the layout and values from the model data.
- torii: When the hash layout is used, I'm not sure if Torii needs to be modified to support it, as we already rely on serialization. However, ByteArray (string) and Array<T> must be supported for indexing. We may start with only arrays of primitive types though.
As a reminder of the longer-term vision, we also want to enable a user to update/retrieve only certain fields of a model. Both layouts should support that.
Questions
- Should we use the hash layout by default?
- Should we allow packing on / packing off (independently of the layout)?
- Is there anything in the storage that you guys at a higher level / on the client side feel is missing?
Happy to have any feedback on that, to ensure we can provide users with a very smooth and configurable experience when it comes to storage and upgradeability.
I don't think I understand this:
Dynamic types are now supported.
But I think my question might be useful for clarification anyway.
Let's say we have an array of this type
#[dojo::model(layout: 'hash')]
struct SimpleModel {
a: u32,
b: u8
}
let arr: Array<SimpleModel> = [ (10,11), (20,21), (30, 31) ] //Simplified notation but u get the point
How is the arr stored using the hash layout?
I meant dynamic types inside the model, because what we store in the database is the model. So it would rather be:
#[dojo::model(layout: 'hash')]
struct SimpleModel {
a: u32,
b: Array<u8>,
}
And in the storage with hash layout you will have:
- selector("a") address (derived from model base address) -> [u32] (1 felt)
- selector("b") address (derived from model base address) -> [size of array, elts...] (at least 1 felt, but can be more).
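The `[size of array, elts...]` encoding can be sketched as follows. This is an illustrative Python model: placing the values in consecutive slots starting at the field's address is an assumption made for the sketch, and `write_array`, `read_array` and `B_ADDRESS` are hypothetical.

```python
def write_array(storage: dict, address: int, items: list) -> None:
    """Store the length first, then one element per following slot."""
    storage[address] = len(items)
    for offset, item in enumerate(items, start=1):
        storage[address + offset] = item


def read_array(storage: dict, address: int) -> list:
    """Read the length prefix, then exactly that many elements."""
    size = storage.get(address, 0)
    return [storage[address + offset] for offset in range(1, size + 1)]


storage = {}
B_ADDRESS = 1000  # hypothetical address derived for field "b"
write_array(storage, B_ADDRESS, [3, 5, 8])
```

An empty array still occupies one slot (the length), which matches the "at least 1 felt" remark.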
For the first iteration, only arrays of primitive types will be supported, so we will not support the following:
#[dojo::model(layout: 'hash')]
struct NestedModel {
a: u32,
b: Array<SimpleModel>,
}
+1 for hash being the default, since arrays / bytearrays would be common primitives. But I wonder how important it is to still support the serial layout? If developers are looking to optimize for cost, they could implement their own pack/unpack on single fields within the model? Much like what @notV4l is doing for RYO
Good point. But isn't then your model like:
struct MyOneFelt {
f: felt252,
}
And you have to do the pack/unpack yourself + the access to the fields?
Whereas with the serial layout you can:
#[dojo::model(layout: 'serial')]
struct MyData {
hp: u8,
potions: u8,
...
}
// --> 1 felt in storage.
and this will be packed automatically by Dojo, which allows you to keep your game logic using the model naturally?
But as you mentioned, is it really an added value for Dojo to support that? We have to check how much it impacts the stack to have both. :+1:
If you decide you want to support both layouts, wouldn't it be better to abstract the model serialization and deserialization?
The benefit would be that a particular mechanic could be derived or custom implemented, much like serde.
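One way to picture this abstraction, as a hedged Python sketch rather than a proposal for the actual Dojo interface: the model keeps its fields, and a pluggable layout object decides how they map to storage. `Layout`, `SerialLayout`, `KeyedLayout` and `roundtrip` are all hypothetical names.

```python
from typing import Protocol


class Layout(Protocol):
    def serialize(self, fields: dict) -> dict: ...
    def deserialize(self, stored: dict, names: list) -> dict: ...


class SerialLayout:
    """Stores values by position, so field order is part of the format."""

    def serialize(self, fields):
        return dict(enumerate(fields.values()))

    def deserialize(self, stored, names):
        return {name: stored[i] for i, name in enumerate(names)}


class KeyedLayout:
    """Stores values under a per-field key, so field order does not matter."""

    def serialize(self, fields):
        return {f"field:{name}": value for name, value in fields.items()}

    def deserialize(self, stored, names):
        return {name: stored[f"field:{name}"] for name in names}


def roundtrip(layout: Layout, fields: dict) -> dict:
    """Write then read the model through the given layout."""
    return layout.deserialize(layout.serialize(fields), list(fields))


player = {"hp": 100, "potions": 3}
```

Game logic would only ever see `player`; swapping the layout changes what ends up in storage, not the model's fields.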
Good point. But isn't then your model like:
struct MyOneFelt {
f: felt252,
}
And you have to do the pack/unpack yourself + the access to the fields?
If this serde-like way is doable (I'm not familiar enough with Dojo's low-level internals to say), then you would only need pack/unpack. The model interface (fields) would remain the same.
It might be a double-edged sword though, since when the implementation is written by someone else, the burden of keeping the model upgradable is on the implementer. So added flexibility, but also more room for errors.
I don't know if it makes sense but just a point to consider :)
I'm on it !
Reworked for 0.7.0 and new type support.