substrate Metadata V15 breaking change: Features to include in V15

We would like to include the Runtime API info in the metadata and create a V15 release. The metadata has been stable for a while and would like to minimize the number of breaking changes from the user's perspective.

Considering this is a breaking change for the metadata, what other features would we like to include in the metadata V15 before making a release?

Metadata with Runtime API

To help client-side applications make runtime API calls, we would like to include the runtime API information in the metadata (for context: https://github.com/paritytech/substrate/issues/11648).

This feature would help tools like subxt, polkadot-js, and capi provide a uniform API into the runtime.

Furthermore, after merging the RPC Spec V2 API into Substrate, the current RPC methods calling into the runtime will be replaced by chainHead_unstable_call (the equivalent of the current state_call method). One example is fetching the metadata via state_getMetadata that would be replaced by a runtime API call to Metadata_metadata without parameters.

Discovering which runtime function is available, providing the parameters, and interpreting the result into the proper shape are tedious and prone to error steps. Including the runtime API info in the metadata would improve the user experience.

The decl_runtime_apis from Substrate will be modified to include the runtime API information in the metadata.

// CC: @jsdw @niklasad1 @bkchr @ascjones @athei @harrysolovay

Dec 15 '22 12:12 lexnv

It will be great if we can avoid breaking changes. It doesn't have to be breaking changes to expose additional runtime API information. Just add a new API for that.

Dec 15 '22 20:12 xlc

Hi,

Thanks @xlc for providing input on this issue.

Regarding exposing multiple flavors of the metadata, we have discussed something similar in issue: https://github.com/paritytech/substrate/issues/12370. As Pierre @tomaka mentioned in https://github.com/paritytech/substrate/issues/12370#issuecomment-1268215969:

We don't want to introduce multiple flavors of metadata. This makes everything more complex. Our number one priority in everything we design nowadays should be to reduce the complexity as much as possible.

Although it would be a benefit from the user perspective to not have a breaking change, adding separate layers/flavors of the metadata increases the complexity and would make the implementation of https://github.com/paritytech/polkadot-sdk/issues/291 and https://github.com/paritytech/substrate/issues/10057 more difficult. Considering we had 14 metadata changes in a relatively short span of time, I would incline towards a V15 release. Also the next v15 metadata would reflect an ergonomic boost for the end-user.

@bkchr what do you think?

Dec 19 '22 12:12 lexnv

There is no filter required. I am asking to add a new API runtimeApiMetadata in additional to the current metadata Old SDK simply ignores it. New SDK can use it to ensure the correct type.

I am really against breaking change of metadata because it will break EVERYTHING. Every exchanges, bots, hardware wallet, wallets, dApps, blockexplorer, scripts etc have to upgrade. Many of them are running perfectly otherwise. This is going to cost a lot of dev resources & money.

And what do we gain from this breaking change? Yeah runtime API metadata is super useful. But 99% of the current code doesn't need it and it is just a bad idea to force those code to upgrade for no benefits.

Dec 19 '22 22:12 xlc

I have to agree with @xlc here. Breaking will be too bad for the ecosystem and we should not do it unless absolutely necessary. Additionally, I wouldn't call it metadata "flavors" but versioned metadata. IMHO we should move to version 15 but still provide v14 under its original location.

We should a an additional fallible Metadata_metadata_at_version(Option<u32>) which returns the metadata at the requested version. It can of course fail since we don't support old versions forever. But supporting two versions at the same time is the absolute minimum to allow for a migration.

In this case supporting the old version is not hard as it is just the new version minus the runtime API section.

Dec 22 '22 15:12 athei

We should a an additional fallible Metadata_metadata_at_version(Option<u32>) which returns the metadata at the requested version. It can of course fail since we don't support old versions forever. But supporting two versions at the same time is the absolute minimum to allow for a migration.

Then we can also just introduce metadata v15 format "today" and then wait 3 months until we start returning it from any runtime.

The entire metadata is versioned from almost the beginning and the argument "any change will take quite a lot of dev resources" is a bad argument. This means we could never change anything and this is really bad. We are still soo early and people need to be able to adapt. If we give people 3 months or maybe more to update, it should be possible to adapt all your stuff to the new metadata and have all software updated. I mean we could maybe come up with something like, only one metadata version per year and always collect stuff, but completely stopping is a really bad idea. This would prevent any kind of innovation and I will not support this.

Dec 22 '22 23:12 bkchr

To what we should add:

https://github.com/paritytech/polkadot-sdk/issues/236
https://github.com/paritytech/frame-metadata/issues/43 https://github.com/paritytech/substrate/issues/12929 Both are going into the same direction. Now with scale-info, we could probably change the format completely. However, while thinking about this, it would maybe prevent any kind of "metadata merkalization".

Dec 22 '22 23:12 bkchr

Then we can also just introduce metadata v15 format "today" and then wait 3 months until we start returning it from any runtime.

Why? The old runtime API will still exist. Downstream tools will start using the new runtime API at their own pace. We will remove the old version of the metadata and runtime API once we are convinced that everybody who is willing to keep up has switched.

I am not saying we need to support old stuff forever. I agree that it is too early for that. It is substrate's big advantage that it actually can change. I am merely advocating to allow for a transition period where we have the old and the new one supported at the same time. That will make it much much less of a world breaking event.

Dec 27 '22 07:12 athei

Metadata with Runtime API

Developers add runtime API in the following manner:

Define a trait using the decl_runtime_apis macro
Implement the API using the impl_runtime_apis macro

The first step contains helpful information as the documentation. At the same time, the second step is where we know which traits (runtime APIs) are implemented for which runtime.

This example declares a Metadata runtime API: https://github.com/paritytech/substrate/blob/ed3f055907116e7c194c1e5d0f884b8329ba70cb/primitives/api/src/lib.rs#L733-L737

Users that want to call the metadata function need to make an RPC call (state_call) to the Matadata_metadata function name and provide hex-encoded scale-encoded argument bytes (in this case 0x). The user is expected to interpret the resulting bytes as OpaqueMetadata.

The function name is obtained by concatenating the trait name with the method name (note: there is also an API versioning that we should take into account).

Frame-Metadata

The frame-metadata crate stores the type of metadata and the latest version is v14.

The v15 metadata extends the v14 with the runtime field. This field (similarly to the pallets field) contains all the information needed to generate an API in higher-level projects (subxt, capi etc).

The runtime field is a collection of TraitMetadata objects that contain:

trait name
optionally a trait version
methods available:
- method name
- method inputs (name and meta::Type - ie bytes: &[u8])
- method output (meta::Type)
- documentation
documentation (that is only visible at decl_runtime_apis and valuable for UX)

Substrate

The decl_runtime_apis! and impl_runtime_apis! macros are defined in primitives/api/proc-macro. While the metadata V14 is constructed by frame/support/procedural via the construct_runtime! macro.

All three macros generate crate access: construct_runtime! has access to the frame-support crate (including things like scale-info and frame-metadata) while decl_runtime_apis and impl_runtime_apis have access to the sp-api crate.

The decl_runtime_apis is extended to generate the TraitMetadata for each runtime API. Although, this could happen in the impl_runtime_apis macro the documentation is no longer available at that point.

For populating the metadata, the decl_runtime_apis needs access to:

scale_info::TypeInfo: describe runtime types
frame_metadata::v15::Metadata: to construct the metadata info of the runtime types

Those crates are not available in the sp_api, therefore the crate needs to export scale-info and frame_metadata too. All the substrate types that are part of a runtime declaration must also implement the scale_info::TypeInfo trait.

The impl_runtime_apis macro implements a method on the runtime to expose the TraitMetadata information collected for each runtime trait implemented. This is similar to the Runtime::metadata() method exposed by the construct_runtime! macro.

To allow a smooth transition for developers and avoid a sudden breaking change in the ecosystem, the Metadata trait is extended with a metadata_v15. The Metadata_metadata runtime function will still return the V14 metadata, while Metadata_metadata_v15 will return the new enriched metadata.

After a few months, the Metadata_metadata will return the V15 and the Metadata_metadata_v15 is deprecated.

Summary

Frame-metadata:

Implement Frame-Metadata V15

Substrate:

Export scale-info and frame-metadata by sp-api
Derive scale_info::TypeInfo for runtime types
decl_runtime_apis: generate TraitMetadata for each trait
impl_runtime_apis: collect TraitMetadata and expose Runtime::runtime_metadata() method
Extend Metadata trait with Metadata::metadata_v15 method give the community time for transition
Metadata_metadata method to return: frame-metadata::v15 including runtime API metadata

PoC

frame-metadata: metadata containing all details about the runtime
substrate: generate runtime metadata at impl_runtime_apis without documentation
subxt: use v15 and slim code-gen for the API

// CC: @jsdw @niklasad1 @bkchr @ascjones @athei

Jan 23 '23 12:01 lexnv

Why? The old runtime API will still exist. Downstream tools will start using the new runtime API at their own pace. We will remove the old version of the metadata and runtime API once we are convinced that everybody who is willing to keep up has switched.

One problem of doing this is that it would prevent of "merklized metadata" where we include the metadata storage root in the signing process to ensure that no one can fake the metadata. If we would have different flavours of the metadata.. While writing this I realized that we could just include both versions of the metadata storage roots in the signing process.

We should a an additional fallible Metadata_metadata_at_version(Option<u32>) which returns the metadata at the requested version. It can of course fail since we don't support old versions forever. But supporting two versions at the same time is the absolute minimum to allow for a migration.

Okay, I'm releasing my blocking on this :P I'm fine with doing this now :P

Jan 24 '23 12:01 bkchr

To allow a smooth transition for developers and avoid a sudden breaking change in the ecosystem, the Metadata trait is extended with a metadata_v15. The Metadata_metadata runtime function will still return the V14 metadata, while Metadata_metadata_v15 will return the new enriched metadata.

Why do not use the more generalized API I proposed? Adding Metadata_metadata_v15 will require new functions every time we update the metadata.

While writing this I realized that we could just include both versions of the metadata storage roots in the signing process.

Not updated clients only know about about the old storage root. So they cannot include both (or all currently valid metadata versions). Instead, a runtime needs to check the signature against all currently supported runtime versions (usually 2).

Okay, I'm releasing my blocking on this :P I'm fine with doing this now :P

Nice

Jan 24 '23 19:01 athei

Not updated clients only know about about the old storage root. So they cannot include both (or all currently valid metadata versions). Instead, a runtime needs to check the signature against all currently supported runtime versions (usually 2).

We could return the metadata storage roots of all supported metadata versions. The client only needs to ensure that one of the storage roots matches the metadata it supports.

Jan 24 '23 21:01 bkchr

Why do not use the more generalized API I proposed? Adding Metadata_metadata_v15 will require new functions every time we update the metadata.

I'm ok with either, but the pros and cons of adding methods like Metadata_metadata_v15 etc going forwards as I see it:

Pros:

We'll be able to see in the runtime API info in the metadata which previous versions of metadata are supported going forwards ("it'll show Metadata_metadata_v15, Metadata_metadata_v16 etc).
We can start by adding something like Metadata_unstable_metadata_v15 while this feature is in development. That'll let us incrementally add things (eg Alex can work on adding Runtime APIs and others can add this sort of stuff if they think it important https://github.com/paritytech/substrate/issues/12939#issuecomment-1363434115). When we're happy we can rename it to Metadata_metadata_v15 to mark it as stable.
We have the option to return the more specific metadata type if we want (eg v14::RuntimeMetadataV14 rather than the RuntimeMetadata enum). Maybe we don't want to do this though because of hashing for signing tx's or whatever though.
We could add a Metadata_metadata_latest if we want the ability to always get the latest (stable) metadata back anyway.

Cons:

New runtime call needs adding for each metadata bump (is this annoying to do?)
Users will have to tweak the call they make when "updating" to use new metadata versions (but since they also likely need to add support for working with the new metadata version, I think that this is trivial and explicit, and they would need to bump the version they request anyway).

So on balance I think I prefer adding an explicit method per new version unless there is a reason I've overlooked that makes it a horrible idea :)

Particularly It would be great if Alex could work on adding and merging Runtime API support behind an unstable method independently from any other features people would like to get into this new version.

Jan 25 '23 11:01 jsdw

New runtime call needs adding for each metadata bump (is this annoying to do?)

It is. Because every runtime needs to implement them. While with a general API they do that once and get automatically upgraded when FRAME adds this version. It is an additional layer of friction. We need to make the transition as easy as possible or we will lock ourselves into a corner. We will not be able to evolve the format because we get to afraid of the backlash.

We'll be able to see in the runtime API info in the metadata which previous versions of metadata are supported going forwards ("it'll show Metadata_metadata_v15, Metadata_metadata_v16 etc).

Add another function which returns the list of supported versions then.

We can start by adding something like Metadata_unstable_metadata_v15 while this feature is in development. That'll let us incrementally add things (eg Alex can work on adding Runtime APIs and others can add this sort of stuff if they think it important https://github.com/paritytech/substrate/issues/12939#issuecomment-1363434115). When we're happy we can rename it to Metadata_metadata_v15 to mark it as stable.

I don't see that as a strong argument. Just don't call Metadata_metadata_at_version(15) while it is still unstable. While encoding it into the name might be a bit stronger of a signal than in a doc comment I don't think it justifies adding a new runtime API. Please keep in mind users would still need to go out of their way to bump this number. They will only do that when they need a feature from that metadata. And if they learned about this feature they need this is also where they learned about this instability.

So on balance I think I prefer adding an explicit method per new version unless there is a reason I've overlooked that makes it a horrible idea :)

The friction of copy pasting those runtime impls into every single runtime. Boilerplate will be the end of us.

Jan 25 '23 12:01 athei

Would we need to maintain all the runtime functions that we have in place? Eventually, we could remove the _v15 _v16 methods and always keep the next version for migration purposes. This way we'll just rename the _v15 -> _v16 and make any macro modifications to achieve the result. The user transition will become as seamless as one rename. We could go even one step forward and name this: Metadata::get_next_unstable_metadata().

Presume we have:

/// Substrate perspective
impl_runtime macro generates
Runtime::get_next_metadata() // we modify this as we see fit for the V15, V16 and onwards format

/// User/Runtime perspective
impl_runtime {
Metadata {
    fn metadata() ... // Same as before

    /// Upon updates, the users will modify s/v15/v16
    /// Or name this `get_next_unstable_metadata` and never have to modify things
    fn metadata_v15() -> Metadata {
        Runtime::get_next_metadata()
    }
}

Just out of curiosity, would removing functions cause any issues with substrate?

Versioning

Another thing that I wanted to point out is the trait version generated via #[api_version(X)]. Do we have a strong enough reason to generate the runtime API for older versions? If we don't, one advantage is that we could keep the metadata size as slim as possible: add runtime details only for the latest trait version, as opposed to every version. What do you think?

Jan 25 '23 12:01 lexnv

Considering this is a breaking change for the metadata, what other features would we like to include in the metadata V15 before making a release?

On the topic of what to add to the metadata, pallet documentation (as discussed in paritytech/frame-metadata#47) would be very useful for building documentation sites (as we're planning to for capi).

Jan 25 '23 13:01 tjjfvi

I don't get why you are so keen on encoding the version in the name of the runtime API. Can you explain? This will me a massive headache because the runtime authors need to implement those APIs based on whether FRAME exposes those APIs. This is just friction. For what?

It doesn't change anything about the amount of versions you need to maintain. It is the same thing. Just without all the boilerplate of adding and removing runtime APIs.

Jan 25 '23 13:01 athei

I was exploring maybe removing the Option<u64>, such that we don't need to modify anything in the runtime. And to also avoid the need to expose another function that returns the list of supported versions.

I don't quite have the whole context into the runtime, yet so I might be wrong: Would a method that returns just the latest next metadata suffice for our use cases?

fn get_next_unstable_metadata() -> Option<OpaqueMetadata>

Tho, I have no strong feelings about this, and the version_at() would work just as well, but that feels a bit more involved from my perspective.

Jan 25 '23 13:01 lexnv

If there are two methods for getting the metadata along the lines of get_metadata and get_next_unstable_metadata, I suspect that applications would have logic along the lines of (pseudo-code):

let metadata1 = get_metadata();
if (is_supported_version(metadata1)) {
  return metadata1;
}
let metadata2 = get_next_unstable_metadata();
if(is_supported_version(metadata2)) {
  return metadata2;
}
// error, unsupported version

Which could end up sending two copies of the metadata over the network.

In contrast, version_at would only ever need to send one copy of the metadata.

Jan 25 '23 14:01 tjjfvi

A side note: applications may very well support multiple versions of the metadata, so having a method which specifies a version range (at which point the chain would send the highest version it supports in range) could be beneficial.

Jan 25 '23 14:01 tjjfvi

I agree we should not use Option<u64> for version_at(). I propose the following API:

fn metadata_at_version(version: u32) -> Option<Metadata>;
fn metadata_versions() -> Vec<u32>;

The original metadata version function will stick around as long as we support version 14.

We need to think about how these functions will be consumed. I see two different users:

A user which can only understand one version and is hardcoded to it. This will just use:

metadata_at_version(my_version) and just fail gracefully if None is returned. Compare this to trying to call a non existent function. You will need some kind of reflection or manually parse the metadata to discover existing functions.

A user which supports more than one function:

Will call metadata_versions() as checks of one of those versions can be supported. Again, adding separate functions will lead us to reflection or other weird hacks.

Tho, I have no strong feelings about this, and the version_at() would work just as well, but that feels a bit more involved from my perspective.

I don't think it is much harder than having separate functions. Just a simple dispatcher based on the supplied version.

A side note: applications may very well support multiple versions of the metadata, so having a method which specifies a version range (at which point the chain would send the highest version it supports in range) could be beneficial.

See above. Node returns supported versions and the client selects. Let's not go all TLS on that and end up in a complicated negotiation protocol.

Jan 25 '23 14:01 athei

I agree we should not use Option<u64> for version_at(). I propose the following API:
fn metadata_at_version(version: u32) -> Option<Metadata>;
fn metadata_versions() -> Vec<u32>;

That is also what I would have proposed and I think this is the best way forward. A new method per metadata version isn't smart. As @athei already said, with these two functions we can hide the rest in FRAME and the user doesn't care. They only need to enable the features in frame-metadata for the metadata versions they want to support.

Jan 25 '23 14:01 bkchr

Cool! That makes a lot of sense! Thanks for the clarifications!

Another thing that I wanted to point out is the trait version generated via #[api_version(X)]. Do we have a strong enough reason to generate the runtime API for older versions? If we don't, one advantage is that we could keep the metadata size as slim as possible: add runtime details only for the latest trait version, as opposed to every version. What do you think?

Regarding the versioning of the runtime APIs themself, do you believe it may be of use to call an older version? (ie BabeApi::fn configuration() -> BabeConfigurationV1

https://github.com/paritytech/substrate/blob/901506114ec3102d245049ab0f89d89190e30c21/primitives/consensus/babe/src/lib.rs#L384-L391

Jan 25 '23 14:01 lexnv

Regarding the versioning of the runtime APIs themself, do you believe it may be of use to call an older version? (ie BabeApi::fn configuration() -> BabeConfigurationV1

Or more specifically; should the metadata include information about older versions of calls?

I think for Subxt it's sufficient that we can generate an interface capable of calling the latest versions of things, and expect users to make more "manual" calls for prior versions, but there may be cases out there for knowing about old versions too.

I agree we should not use Option for version_at(). I propose the following API:
fn metadata_at_version(version: u32) -> Option<Metadata>;
fn metadata_versions() -> Vec<u32>;

Ok; the arguments against a new method per version make sense (and I'm happy with this suggestion)! And I'd assume that metadata_versions() won't return any "unstable/in-progress" versions that we're working on, so that we can "hide" the V15 impl until it's ready.

Jan 25 '23 14:01 jsdw

Or more specifically; should the metadata include information about older versions of calls?

No. Each runtime always only supports one version of the runtime function. The earlier versions are present in earlier runtimes and can not be used anymore in runtimes which implement the newest version.

Jan 25 '23 14:01 bkchr

Or more specifically; should the metadata include information about older versions of calls?

No. Each runtime always only supports one version of the runtime function. The earlier versions are present in earlier runtimes and can not be used anymore in runtimes which implement the newest version.

In the Babe example above: Aren't both versions of the configuration function available?

Jan 25 '23 14:01 athei

In the Babe example above: Aren't both versions of the configuration function available?

To the node side, yes. But on the runtime side you only have one. For the node side this is important to be able to call the old version and have some way of versioning the calls. But the metadata is always from the view of the current runtime and the current runtime only supports one version of the call.

Ok; the arguments against a new method per version make sense (and I'm happy with this suggestion)! And I'd assume that metadata_versions() won't return any "unstable/in-progress" versions that we're working on, so that we can "hide" the V15 impl until it's ready.

metadata_versions it will return all available metadata versions. These versions are determined based on the features activated in for the frame-metadata crate. This means, that you should not enable unstable versions on main net or people should be aware that they are unstable.

Jan 25 '23 14:01 bkchr

metadata_versions it will return all available metadata versions. These versions are determined based on the features activated in for the frame-metadata crate. This means, that you should not enable unstable versions on main net or people should be aware that they are unstable.

I guess I had in mind that, like the chainHead_unstable_x rpc methods, the V15 metadata could by default be available in live nodes (but not "exposed" via the version list) to give us the opportunity to try it out in tools like Subxt before it is stabilised and exposed.

How would you see the new metadata work (adding the new runtime APIs and the macro work to build the metadata up etc) being done in Substrate? Would it be hidden behind some unstable feature flag (that when enabled would also enable some frame-metadata unstable feature flag)?

Jan 25 '23 15:01 jsdw

To the node side, yes. But on the runtime side you only have one. For the node side this is important to be able to call the old version and have some way of versioning the calls. But the metadata is always from the view of the current runtime and the current runtime only supports one version of the call.

I see. It is just to import historical blocks. Agreed then we only include the most recent version in the metadata.

metadata_versions it will return all available metadata versions. These versions are determined based on the features activated in for the frame-metadata crate. This means, that you should not enable unstable versions on main net or people should be aware that they are unstable.

I agree. We should return even unstable versions there. Because code that automatically uses the newest version doesn't make sense even without the existence of unstable function: If a new stable version appears it will still break their code. The discoverability is only for UIs to support older versions not newer ones.

Jan 25 '23 15:01 athei

How would you see the new metadata work (adding the new runtime APIs and the macro work to build the metadata up etc) being done in Substrate? Would it be hidden behind some unstable feature flag (that when enabled would also enable some frame-metadata unstable feature flag)?

Will we really need to have this on master to test the new metadata with subxt and whatever? I mean when the new metadata version is added to the crate, we will need to update in Substrate and start using it. In your runtime you can then enable the feature the frame-metadata crate.

Jan 25 '23 15:01 bkchr

I guess I had in mind that, like the chainHead_unstable_x rpc methods, the V15 metadata could by default be available in live nodes (but not "exposed" via the version list) to give us the opportunity to try it out in tools like Subxt before it is stabilised and exposed.

I agree. And I don't think anything contradicting this was written here. Mainly that any production tool would not use the latest unstable version. When you are testing against the new version you would of course supply the unstable version. It is available but any sensible client would not use it for anything but testing.

How would you see the new metadata work (adding the new runtime APIs and the macro work to build the metadata up etc) being done in Substrate? Would it be hidden behind some unstable feature flag (that when enabled would also enable some frame-metadata unstable feature flag)?

Please don't. No more feature flags. We just document that the version is unstable. Or maybe even just give it a temporary number like u32::MAX while it is unstable.

Jan 25 '23 15:01 athei