Create some compile-time notion of "runtime mode"

Open sam0x17 opened this issue 1 year ago • 9 comments

This came out of a conversation with @kianenigma, @ggwpez and other members of the FRAME team, so I'll do my best to reproduce the highlights here. The short version: in a substrate-based project we currently have no well-defined, standardized notion of what the build "intent" or "mode" is at runtime, nor is this information available in an easily consumable form at compile-time. This is something we could improve upon. I'll explain what I mean:

Some background:

Before I worked in Rust I was a long-time rubyist. Ruby on Rails (web framework used at a lot of startups) is known for having a very clear notion of what in the Rails world is called an "environment". When you run a Rails app, there are three default "environments" that come built-in: "test", "development" and "production". An app's environment is determined when it launches and cannot change while it is running, and this information can be accessed via globally accessible methods such as Rails.env.test?, Rails.env.production? etc., and is set by the environment variable RAILS_ENV.

When you run unit and integration tests, the app is automatically in test mode (RAILS_ENV=test), and configuration can then be loaded on an environment-specific basis (much like cfg(test) in Rust). It is quite common, for example, to have a switch statement that does something different for each of the defined environments.

The "development" environment is the default and is typically used for local development/tinkering. The "production" environment is used on deployed servers, and people also often add an additional "staging" environment that closely matches "production" but with subtle differences like different db credentials, etc. Secret management is also typically set up at the environment level.

Running in development mode uses a lazy asset pipeline suitable for local development, while running in production mode gives you performance-optimized compiled assets, tighter security settings (CORS, etc.), production-specific API keys and secrets, and so on.

By standardizing these "environments" at the framework level, all tooling and plugins that integrate with Rails can make certain assumptions based on what environment we are in, and this is healthy for the ecosystem. Crucially, this allows gem (crate) authors to specify that certain features can or cannot be used in certain environments, and since all Rails apps are structured roughly the same way, this will just work out of the box for arbitrary Rails apps.

It would be great if we could rely on something conceptually similar to this when building with substrate!

My Suggestion

So given my background you can understand why, when I started working on substrate, I was surprised we don't have a similar notion. For example, at build time the only thing we really know is whether we are in test mode or not, which doesn't tell us much. We don't know (at least not in a way that is standardized and easily accessible at compile-time everywhere in substrate, for example in individual pallets) whether we are polkadot itself or a parachain, and we don't know whether we are in production or just some random dev playing around with things locally.

As far as I can tell there are at least two notions of "environment" that would be useful for us in substrate:

  1. What this build is (e.g. polkadot, a parachain, etc.). I would tentatively call this something like RuntimeType. This already exists in a limited fashion in some of the structs and traits generated by construct_runtime!, but right now this information is not easily consumed at compile-time in other contexts, such as inside a pallet.
  2. Why this build is running (e.g. we are running tests, running benchmarks, doing local development/tinkering, running some sort of staging server, running in production, etc.). I would tentatively call this something like BuildIntent, BuildMode or BuildEnvironment.
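
To make the two notions above a bit more concrete, here is a minimal sketch of how they could be exposed as plain enums plus associated constants. Every name in it (the enum variants, the BuildInfo trait, ExampleRuntime, dev_mode_allowed) is a placeholder invented for illustration and not an existing substrate API; the real thing would presumably be generated by something like construct_runtime!.

```rust
/// What this build is. The variants are illustrative assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RuntimeType {
    RelayChain,
    Parachain,
}

/// Why this build is running. The variants mirror the examples in the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BuildIntent {
    Test,
    Benchmark,
    Development,
    Staging,
    Production,
}

impl BuildIntent {
    /// True only for the production intent.
    pub const fn is_production(self) -> bool {
        matches!(self, BuildIntent::Production)
    }
}

/// Hypothetical trait a runtime could implement so that generic code
/// (for example a pallet) can read both notions as constants.
pub trait BuildInfo {
    const RUNTIME_TYPE: RuntimeType;
    const BUILD_INTENT: BuildIntent;
}

/// Stand-in for a concrete runtime.
pub struct ExampleRuntime;

impl BuildInfo for ExampleRuntime {
    const RUNTIME_TYPE: RuntimeType = RuntimeType::Parachain;
    const BUILD_INTENT: BuildIntent = BuildIntent::Development;
}

/// Generic code can branch on the constants of whatever runtime it is built into.
pub fn dev_mode_allowed<T: BuildInfo>() -> bool {
    !T::BUILD_INTENT.is_production()
}
```

Because the values are associated constants, a check like dev_mode_allowed::<ExampleRuntime>() can be resolved by the compiler, which is the property the rest of this proposal relies on.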

It would make a lot of sense to formulate these as mutually exclusive cfg flags, though cfg flags have their own pitfalls and we could just as easily implement something like this entirely without cfg flags.

If we did go the cfg flag route, it would be quite easy to gate things, for example:

#[cfg(not(production))]
mod insecure_stuff { // hypothetical module name, for illustration only
    // insecure stuff we don't ever want used in secure contexts here
}

For our purposes, it also might make sense to replace the word "production" with "secure", so we would make the "test", "development", and "secure" features mutually exclusive and require exactly one of them to be set at all times.
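
The "exactly one of them" requirement can itself be checked at compile time. The following is only a sketch under assumptions: the flag values are hard-coded so the snippet builds on its own, whereas a real crate would fill the array from cfg!(...) expressions for whatever flag names we end up choosing. The helper names are made up for this example.

```rust
/// Counts how many of the given mode flags are enabled.
const fn count_enabled(flags: &[bool]) -> usize {
    let mut count = 0;
    let mut i = 0;
    // `for` loops are not allowed in `const fn`, so iterate with `while`.
    while i < flags.len() {
        if flags[i] {
            count += 1;
        }
        i += 1;
    }
    count
}

/// True when exactly one mode flag is enabled.
const fn exactly_one(flags: &[bool]) -> bool {
    count_enabled(flags) == 1
}

// Assumption: in a real crate this would read something like
// `[cfg!(test), cfg!(development), cfg!(secure)]`.
// It is hard-coded here as if only "development" had been selected.
const MODES: [bool; 3] = [false, true, false];

// Evaluated by the compiler: the build fails if zero or several modes are set.
const _: () = assert!(exactly_one(&MODES), "exactly one build mode must be set");
```

Changing MODES to [true, true, false] or to all false turns the last line into a compile error, which is the behavior we would want from mutually exclusive modes.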

We technically already have the test environment because Rust has a built-in test mode and we use cfg(test) in many places in the code.

It doesn't have to be cfg flags, again, but regardless, if we were to formalize automatically generated enums at compile-time that provide information like this, or produce cfg flags that accomplish the same thing, it would be much, much easier to do things like gate certain features from being used in "production" or in certain contexts, for example:

  • Our "pallet dev mode" (#12536) should not be used other than for local tinkering and possibly in some exotic test situations, but right now this is unenforceable
  • Recently (#13301) we had to rename the randomness-collective-flip pallet to insecure-randomness-collective-flip to discourage its usage in production contexts because it is inherently insecure
  • There are many other scenarios similar to this where we wish we could raise a compiler error if someone tries to use X in Y scenario, but we have no way of systematically detecting that scenario because we simply don't have an abstraction for it.

While renaming knowingly insecure pallets to be prefixed with "insecure" is an effective counter-measure to some of these scenarios, situations like this could be made even safer if we had the granularity at compile-time to actually know whether this is a context where security is required and issue a compile error accordingly if the programmer is trying to use something inherently insecure in a secure context.
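
As one possible illustration of "issue a compile error in a secure context" without using cfg flags at all, the mode can be encoded as a marker type and each component can declare which modes it is allowed in. This is a sketch only: the marker types, the AllowedIn and Component traits, and the two example components are hypothetical stand-ins, not real pallets or FRAME traits.

```rust
/// Marker types for two hypothetical build modes.
pub struct Development;
pub struct Secure;

/// Implemented by a component once for every mode it may be used in.
pub trait AllowedIn<Mode> {}

/// Minimal stand-in for something pallet-like.
pub trait Component {
    const NAME: &'static str;
}

/// An inherently insecure component: allowed in development only.
pub struct InsecureRandomness;
impl Component for InsecureRandomness {
    const NAME: &'static str = "insecure-randomness";
}
impl AllowedIn<Development> for InsecureRandomness {}
// Deliberately no `impl AllowedIn<Secure> for InsecureRandomness`.

/// A harmless component: allowed in both modes.
pub struct Timestamp;
impl Component for Timestamp {
    const NAME: &'static str = "timestamp";
}
impl AllowedIn<Development> for Timestamp {}
impl AllowedIn<Secure> for Timestamp {}

/// Installing a component only type-checks if it is allowed in `Mode`.
pub fn install<Mode, C: Component + AllowedIn<Mode>>() -> &'static str {
    C::NAME
}
```

With these definitions install::<Development, InsecureRandomness>() and install::<Secure, Timestamp>() both compile, while install::<Secure, InsecureRandomness>() is rejected by the compiler because the required AllowedIn<Secure> implementation does not exist.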

Another advantage of the cfg route, by the way, is Rust already has a strong notion of the test cfg, so we'd just be adding more and making them mutually exclusive (which is something I've done before).

Thoughts?

sam0x17 • Feb 07 '23 05:02