Currently, the Dojo storage layout can be seen as serial. The fields of a model struct are stored in consecutive order, with data being packed at the bit level.

struct MyModel {
    a: u32,
    b: u32,
    c: felt252,
}

// Serialized:
[a, b, c]
// Packed
[packed(ab), packed(c)]
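
To make the packing concrete, here is a rough Python sketch (a model of the idea, not Dojo's actual packing code). The names `SLOT_BITS`, `pack` and `unpack` are made up for illustration, and treating a felt252 field as 251 usable bits is an assumption.

```python
# Hypothetical model of bit-level packing; not Dojo's implementation.
SLOT_BITS = 251  # assumption: usable bits in one storage felt


def pack(values, sizes):
    """Pack values (with their bit sizes) in order into as few slots as possible."""
    slots, current, used = [], 0, 0
    for value, size in zip(values, sizes):
        assert 0 <= value < (1 << size), "value does not fit its declared size"
        if used + size > SLOT_BITS:
            # The current slot is full: flush it and start a new one.
            slots.append(current)
            current, used = 0, 0
        current |= value << used
        used += size
    if used:
        slots.append(current)
    return slots


def unpack(slots, sizes):
    """Inverse of pack: walk the slots in the same order with the same sizes."""
    values, index, used = [], 0, 0
    for size in sizes:
        if used + size > SLOT_BITS:
            index, used = index + 1, 0
        values.append((slots[index] >> used) & ((1 << size) - 1))
        used += size
    return values


# MyModel { a: u32, b: u32, c: felt252 } -> [packed(ab), packed(c)]
sizes = [32, 32, 251]
slots = pack([1, 2, 3], sizes)
```

Because `unpack` relies purely on field order and sizes, reordering or removing a field changes the meaning of the stored bits, which is the upgrade problem with this layout.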

This layout (let's call it serial) has the benefit of being very compact, and it can easily support the "one felt" storage model. However, the downsides are:

Not resistant to upgrades (field reordering, field removal); it's an append-only model.
Dynamic types like arrays are forced to be "fixed", which induces other constraints.

On Starknet, the storage uses the selector of each field name to ensure each field starts at a different slot in the storage. The proposal here is to add another layout to the Dojo database: hash. By leveraging the hashing function, we could replicate how Starknet storage works, without compromising the composability of the Dojo database.

#[dojo::model(layout: 'hash')]
struct MyModel {
    a: u32,
    b: ByteArray,
    c: Array<felt252>,
}

// Serialized:
[a, b, c]
// Packed, but each one is packed individually.
[packed(a), packed(b), packed(c)]

Doing so, we gain two major things:

Upgrade resilience (fields can be removed or re-ordered).
Dynamic types are now supported.

The downside of this approach is that we pay for extra hashing on every single field, regardless of its size.
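
A rough Python sketch of why per-field hashing gives this resilience. It is only a model: SHA-256 stands in for the Starknet hash function, and `field_slot`, `layout` and the `b"MyModel#1"` base address are hypothetical names, not the world API.

```python
import hashlib


def stand_in_hash(*parts: bytes) -> int:
    """Stand-in for the on-chain hash (assumption: SHA-256 used for illustration only)."""
    return int.from_bytes(hashlib.sha256(b"|".join(parts)).digest(), "big")


def field_slot(model_base: bytes, field_name: str) -> int:
    """Slot of a field, derived from the model base address and the field name."""
    return stand_in_hash(model_base, field_name.encode())


def layout(model_base: bytes, field_names):
    """Map each field name to its storage slot."""
    return {name: field_slot(model_base, name) for name in field_names}


base = b"MyModel#1"  # hypothetical model base address
original = layout(base, ["a", "b", "c"])
reordered = layout(base, ["c", "a", "b"])
without_b = layout(base, ["a", "c"])
```

Each field's slot depends only on the base address and the field name, so reordering or removing fields leaves the remaining slots untouched. The cost is one hash per field on every access.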

The proposal here is to have both layouts available in Dojo. The default layout would be switched to hash, to ensure a smooth experience with all dynamic types. The user can then optimize the storage by switching to the serial layout.

Implications:

This change will impact mostly dojo-core, dojo-lang and torii.

dojo-core: The world API will be updated to support those two layouts. In the case of the hash layout, the selector of each field will be passed to indicate the order in which the fields must be gathered. Packing is also something we can still support.
dojo-lang: The get! and set! macros must now distinguish the model layout. Based on this, we can generate the appropriate implementation to retrieve the layout and values from the model data.
torii: When the hash layout is used, I'm not sure if Torii needs to be modified to support it, as we already rely on serialization. However, ByteArray (string) and Array<T> must be supported for indexing. We may start with only arrays of primitive types though.

As a reminder of the longer-term vision, we also want to enable a user to update/retrieve only certain fields of a model. Both layouts should support that.

Questions

Should we use hash layout by default?
Should we allow packing on / packing off (independently of the layout)?
Is there something in the storage that you, at a higher level / on the client side, feel we're missing?

Happy to have any feedback on that, to ensure we can provide users with a very smooth and configurable experience when it comes to storage and upgradeability.

Mar 14 '24 15:03 glihm

I don't think I understand this:

Dynamic types are now supported.

But I think my question might be useful for clarification anyway.

Let's say we have an array of this type

#[dojo::model(layout: 'hash')]
struct SimpleModel {
    a: u32,
    b: u8
}
let arr: Array<SimpleModel> = [ (10,11), (20,21), (30, 31) ] //Simplified notation but u get the point

How is the arr stored using the hash layout?

Mar 14 '24 16:03 piniom

Dynamic types inside the model, because we store the model in the database. So it would rather be:

#[dojo::model(layout: 'hash')]
struct SimpleModel {
    a: u32,
    b: Array<u8>,
}

And in the storage with hash layout you will have:

selector("a") address (derived from model base address) -> [u32] (1 felt)
selector("b") address (derived from model base address) -> [size of array, elts...] (at least 1 felt, but can be more).

For the first iteration, we will only support arrays of primitive types. Arrays of nested models like the following will not be supported:

#[dojo::model(layout: 'hash')]
struct NestedModel {
    a: u32,
    b: Array<SimpleModel>,
}
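
The `[size of array, elts...]` storage of field `b` described above can be sketched in Python as follows. This is an illustration only: the helper names are hypothetical, and writing the felts to consecutive slots starting at the field's address is an assumption.

```python
def serialize_array(elts):
    """Serialize an array as [size of array, elts...]."""
    return [len(elts), *elts]


def write_field(storage, base_slot, felts):
    """Write serialized felts starting at the field's slot (assumed consecutive)."""
    for offset, felt in enumerate(felts):
        storage[base_slot + offset] = felt


def read_array(storage, base_slot):
    """Read the size first, then exactly that many elements."""
    size = storage[base_slot]
    return [storage[base_slot + 1 + i] for i in range(size)]


storage = {}
slot_b = 1000  # hypothetical slot derived from selector("b")
write_field(storage, slot_b, serialize_array([7, 8, 9]))
```

An empty array still takes one felt (its size, 0), which matches the "at least 1 felt" note above.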

Mar 14 '24 16:03 glihm

+1 for hash being the default, since arrays / ByteArrays would be common primitives. But I wonder how important it is to still support the serial layout? If developers are looking to optimize for cost, they could implement their own pack/unpack on single fields within the model, much like what @notV4l is doing for RYO.

Mar 14 '24 17:03 broody

Good point. But isn't then your model like:

struct MyOneFelt {
    f: felt252,
}

And you have to do the pack/unpack yourself + the access to the fields?

Whereas with the serial layout you can:

#[dojo::model(layout: 'serial')]
struct MyData {
    hp: u8,
    potions: u8,
    ...
}

// --> 1 felt in storage.

and this will be packed automatically by Dojo, which allows you to keep your game logic using the model naturally?

But as you mentioned, is it really an added value for Dojo to support that? We have to check how much having both impacts the stack. :+1:

Mar 14 '24 18:03 glihm

If you decide you want to support both layouts, wouldn't it be better to abstract the model serialization and deserialization?

The benefit would be that a particular mechanic could be derived or custom implemented, much like serde.

Good point. But isn't then your model like:
struct MyOneFelt {
   f: felt252,
}
And you have to do the pack/unpack yourself + the access to the fields?

If this serde-like approach is doable (I'm not familiar enough with Dojo's low-level internals), then you would only need pack/unpack. The model interface (fields) would remain the same.

It might be a double-edged sword though: when the implementation is written by someone else, the burden of keeping the model upgradable is on the implementer. So there is added flexibility, but also more room for errors.

I don't know if it makes sense but just a point to consider :)

Mar 15 '24 10:03 piniom

I'm on it!

Apr 05 '24 23:04 remybar

Reworked for 0.7.0 and new type support.

Jun 19 '24 21:06 glihm