
Discussion: consider backward compatibility

Open xxchan opened this issue 2 years ago • 15 comments

I think we should start thinking about backward compatibility during development, given that we have users. https://risingwave-labs.slack.com/archives/C03L8DNMDDW/p1674026516400229?thread_ts=1674020631.181329&cid=C03L8DNMDDW

previous breaking-change https://github.com/risingwavelabs/risingwave/issues?q=label%3Abreaking-change

There are many aspects of compatibility to consider:

  • SQL syntax
  • data format in storage/meta
  • protobuf
    • #43
  • CLI args/config file (find a reference from databend)
  • ...

However,

Maintaining compatibility is hard and may not be worth it. If there is a need to upgrade the version for already deployed cases, we can write a simple tool to migrate old metadata to the new format on demand. https://github.com/risingwavelabs/risingwave/pull/7281#issuecomment-1378290900

So I'm just creating an issue first to collect ideas :)

xxchan avatar Jan 27 '23 13:01 xxchan

That's a good question, and I've thought a lot about compatibility, so let's take it case by case.

SQL syntax

After our first stable release, it's better to always keep SQL compatibility. Generally speaking, this won't be too hard, since most of our SQL just follows how Postgres behaves.

We can mark a stabilized syntax as deprecated, but never have to remove it. I guess it doesn't take much effort.

For very rare design flaws that are used by few users, breaking changes are allowed. We should discuss them case by case.

Example: https://github.com/risingwavelabs/risingwave/pull/4503 doesn't break anything, and we may still find a better property naming convention in the future.

Data format in storage

SST format

All objects stored in our storage backend are in our private SST format. There is an SstableMeta in the header of each SST file, with 4 bytes of version stored in it. I think it's flexible enough for future breaking changes.
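To make the idea concrete, here is a minimal sketch of dispatching on a fixed-width version field before decoding the rest of the header. The field names and layout are hypothetical, not the actual SstableMeta definition:

#[derive(Debug)]
struct SstableMeta {
    version: u32,
    // block metas, key range, bloom filter, etc. elided
}

#[derive(Debug)]
enum DecodeError {
    TooShort,
    UnknownVersion(u32),
}

impl SstableMeta {
    fn decode(buf: &[u8]) -> Result<Self, DecodeError> {
        // the 4-byte version is read first, before any other field
        let bytes: [u8; 4] = buf
            .get(0..4)
            .ok_or(DecodeError::TooShort)?
            .try_into()
            .unwrap();
        match u32::from_le_bytes(bytes) {
            1 => Ok(Self { version: 1 }), // decode the v1 layout from &buf[4..]
            // a future breaking change only needs a new arm here
            v => Err(DecodeError::UnknownVersion(v)),
        }
    }
}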

Row Encoding

The SST data blocks are organized as rows of key/value pairs.

We use memcomparable encoding for the key and value encoding for the value. There's a lot of design space, and we've already made several breaking changes. After we stabilize, we need to ensure that any old-version key/value pairs can be decoded correctly.

Fortunately, we can ensure that all key/value pairs in the same SST file have the same encoding version, so we don't need to store the version per row.

IMHO, a good practice for maintaining encoding compatibility is to use separate crates and versions for the row encodings. Thanks to Cargo, we can use different versions of the same crate in one project.

For example:

memcomparable_0_1 = { package = "memcomparable", version = "0.1.2" }
memcomparable_0_2 = { package = "memcomparable", version = "0.2.3" }
memcomparable_1 = { package = "memcomparable", version = "1.4" }
value_encoding_0_1 = { package = "value_encoding", version = "0.1" }
value_encoding_1 = { package = "value_encoding", version = "1.2" }
value_encoding_2 = { package = "value_encoding", version = "2.1" }

And dispatch the implementation statically while decoding the SSTs.

Pseudocode:

fn decode<const SST_VERSION: usize>(buf: &Bytes) -> (Key, Value) {
  match SST_VERSION {
    1 => (memcomparable_0_1::decode(buf), value_encoding_0_1::decode(buf)),
    2 => (memcomparable_0_2::decode(buf), value_encoding_1::decode(buf)),
    3 => (memcomparable_0_2::decode(buf), value_encoding_2::decode(buf)),
    4 => (memcomparable_1::decode(buf), value_encoding_2::decode(buf)),
    _ => unreachable!("corrupted data"),
  }
}

The encoding crate versions should follow semver rules, so that we can release patches (e.g., bumping deps) to the older versions.

Upgrade During Compacting

When a new storage version (covering both the SST format and the row encoding) is released, no data is changed until the SST is compacted. Compaction rebuilds the SST file, so we can upgrade the version in the meantime.

Compaction always reconstructs an entire SST file, so we can ensure that all key/value pairs in the same SST file have the same encoding version.
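A rough sketch of what the compactor's write path could look like under this scheme (all names here are hypothetical, not an existing API): inputs may carry any supported version, but the rebuilt output always uses the latest one, so old versions die out as compaction proceeds.

/// Hypothetical sketch of compaction-time format upgrades.
const LATEST_SST_VERSION: u32 = 4;

fn compact(input_ssts: Vec<(u32, Vec<u8>)>) -> (u32, Vec<u8>) {
    let mut merged_rows: Vec<(Vec<u8>, Vec<u8>)> = Vec::new();
    for (version, data) in input_ssts {
        // decode with the version recorded in each input SST's meta
        merged_rows.extend(decode_rows(version, &data));
    }
    merged_rows.sort();
    // re-encode everything with the newest row encoding
    (LATEST_SST_VERSION, encode_rows(LATEST_SST_VERSION, &merged_rows))
}

// stubs standing in for the versioned codecs discussed above
fn decode_rows(_version: u32, _data: &[u8]) -> Vec<(Vec<u8>, Vec<u8>)> { Vec::new() }
fn encode_rows(_version: u32, _rows: &[(Vec<u8>, Vec<u8>)]) -> Vec<u8> { Vec::new() }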

Data Format in Meta

Data in the meta store is organized as a tree structure, and currently the major meta backend is etcd. The meta values are encoded with protobuf.

If a change only adds or removes fields, then protobuf can ensure that nothing is broken.

If a change must reorganize the tree structure itself, then a breaking change must be made.

Since the meta-store data is never too large, a migration script is preferred for such a breaking change.

If we implement a SQL-based meta store later, we can use SQL migrations like migrations/5_to_6.sql to do the migration, but for now a Rust snippet is recommended.
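For instance, a one-off migration step could look roughly like the following sketch, which reorganizes the key/value tree between two meta versions. Everything here, including the key prefixes and function name, is made up for illustration; the real meta schema is defined elsewhere.

use std::collections::BTreeMap;

/// Hypothetical in-memory stand-in for the etcd-backed meta store.
type MetaStore = BTreeMap<String, Vec<u8>>;

fn migrate_5_to_6(store: &mut MetaStore) {
    // collect entries under the old prefix, then rewrite them under the new one
    let old_keys: Vec<String> = store
        .keys()
        .filter(|k| k.starts_with("cf/table/"))
        .cloned()
        .collect();
    for key in old_keys {
        let value = store.remove(&key).unwrap();
        // e.g. move `cf/table/<id>` to `cf/catalog/table/<id>`; if the protobuf
        // message itself changed shape, re-encode it here as well
        let new_key = key.replace("cf/table/", "cf/catalog/table/");
        store.insert(new_key, value);
    }
}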

Protocols between nodes

In general, the protobuf can handle the situation well enough.

To make things easier, we can add a version field to each request, so that an older compute node can always reject some DDL operations from a newer frontend.
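A possible shape for that check, sketched in Rust (the request type, field, and version constant are hypothetical, not an existing RisingWave API):

/// Hypothetical sketch of rejecting DDL requests from incompatible peers
/// based on a version field carried in each request.
const MY_PROTOCOL_VERSION: u32 = 7;

struct CreateMaterializedViewRequest {
    protocol_version: u32,
    // plan fragments etc. elided
}

fn handle_ddl(req: &CreateMaterializedViewRequest) -> Result<(), String> {
    if req.protocol_version > MY_PROTOCOL_VERSION {
        // an older compute node refuses work it may not understand,
        // instead of silently misinterpreting a newer plan
        return Err(format!(
            "frontend protocol version {} is newer than compute node version {}",
            req.protocol_version, MY_PROTOCOL_VERSION
        ));
    }
    Ok(())
}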

For CN-CN communication (mostly exchange), protobuf compatibility should be handled carefully by developers.

CLI args/config file

No ideas yet; we should create a good enough configuration framework before the first release.

TennyZhuang avatar Jan 28 '23 06:01 TennyZhuang

In production, there will always be cases in which we upgrade a RisingWave cluster. The problem is that upgrading isn't an atomic process, meaning multiple node versions can exist at the same time. One simple solution is to fully stop the cluster and start the newer version afterward, but the connections from our users will be dropped. Obviously, that hurts the user experience and thus shouldn't be the first choice in most cases.

Compatibility of the communication protocols should be carefully kept and tested for those promised compatible versions, i.e., versions with the same major (and minor) numbers in semantic versioning.

For the CLI args and config parameters, I'd like to propose that unrecognized parameters should trigger warnings instead of panics. This makes it easier for the cloud to manage config templates. Also, critical args should be handled carefully, because the operator cannot distinguish between versions. If the operator also applies the arg changes (e.g., supports a new argument or deprecates an old one), the older RisingWave versions won't be deployable anymore. That has already happened several times.
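One way to turn unknown keys into warnings instead of hard failures, sketched with serde and toml under the assumption of a TOML config file (the config struct and field names are hypothetical): do a first pass over the raw keys, warn about anything unrecognized, then deserialize only the fields the binary understands.

use std::collections::BTreeSet;

use serde::Deserialize;

/// Hypothetical config struct; the real RisingWave config is much larger.
#[derive(Deserialize)]
#[serde(default)]
struct ServerConfig {
    listen_addr: String,
    max_connections: u32,
}

impl Default for ServerConfig {
    fn default() -> Self {
        Self { listen_addr: "0.0.0.0:5688".into(), max_connections: 256 }
    }
}

fn load_config(text: &str) -> ServerConfig {
    // first pass: inspect the raw keys and warn about anything unknown,
    // instead of failing the whole startup with `deny_unknown_fields`
    let known: BTreeSet<&str> = ["listen_addr", "max_connections"].into();
    let raw: toml::value::Table = toml::from_str(text).expect("malformed config file");
    for key in raw.keys() {
        if !known.contains(key.as_str()) {
            eprintln!("warning: unrecognized config entry `{key}`, ignoring");
        }
    }
    // second pass: deserialize only the fields this binary understands
    toml::from_str(text).expect("invalid config values")
}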

arkbriar avatar Jan 28 '23 07:01 arkbriar

What about testing? Is it possible to set up some kind of compatibility test in CI? 👀

wangrunji0408 avatar Jan 28 '23 07:01 wangrunji0408

In production, there will always be cases in which we upgrade a RisingWave cluster. The problem is that upgrading isn't an atomic process, meaning multiple node versions can exist at the same time. One simple solution is to fully stop the cluster and start the newer version afterward, but the connections from our users will be dropped. Obviously, that hurts the user experience and thus shouldn't be the first choice in most cases.

I'd prefer to always rolling-update the whole cluster, in a specified order. Two versions of nodes should only co-exist for several seconds to minutes, so we can reject some older requests directly.

TennyZhuang avatar Jan 28 '23 07:01 TennyZhuang

What about testing? Is it possible to set up some kind of compatibility test in CI? 👀

Not ready; we haven't implemented all the mechanisms yet, these are just some thoughts.

TennyZhuang avatar Jan 28 '23 07:01 TennyZhuang

I'd prefer to always rolling-update the whole cluster, in a specified order. Two versions of nodes should only co-exist for several seconds to minutes, so we can reject some older requests directly.

I agree, but it could take a long time to upgrade from a really old version.

arkbriar avatar Jan 28 '23 09:01 arkbriar

Protocols between nodes In general, the protobuf can handle the situation well enough. To make things easier, we can add a version field to each request, so that an older compute node can always reject some DDL operations from a newer frontend. For CN-CN communication (mostly exchange), protobuf compatibility should be handled carefully by developers.

Although this section takes up very little space, I think it is really difficult. We often change or remove fields in the plan's protobuf, because those fields are not really suitable at first. Some executors even change their meaning and behavior.

If we really want to provide backward compatibility, maybe we need to mark some released features as nightly and give up upgrade support for them. On the kernel side, we should do more detailed checks. E.g., we need to check all protos and add comments marking them stable or nightly, so that we know a feature can be stable only when all its related protos are stable, and developers know which protos need long-term maintenance.

st1page avatar Jan 30 '23 08:01 st1page

https://github.com/risingwavelabs/rfcs/issues/41#issuecomment-1407307168

What @TennyZhuang mentioned is absolutely essential: we should ensure the extensibility of our major components. But the reality is that even some trivial configuration changes could break compatibility: https://github.com/risingwavelabs/risingwave/pull/7527. Without actually testing an upgrade, we'll never know what could break. @arkbriar @sumittal Let's include upgrade testing in our release process.

neverchanje avatar Jan 30 '23 08:01 neverchanje

@neverchanje I've commented under https://github.com/risingwavelabs/risingwave/pull/7530#issuecomment-1408237650 and proposed keeping backward compatibility for at least one patch version, so that a change won't block others at the exact time it is made, to make others' lives easier.

arkbriar avatar Jan 30 '23 09:01 arkbriar

Let's include upgrade testing in our release process.

Of course we can add an upgrade test to the release process, but what if the test fails? Whether we roll back the kernel or upgrade the cloud, such breaking changes will cause a lot of trouble and even affect customers, because the cloud operator is public in EKS and not all problems can be solved by upgrading the operator. We should formulate more detailed compatibility rules.

huangjw806 avatar Jan 30 '23 09:01 huangjw806

As pointed out by @st1page, the plan node proto (actually the fragment node) needs to be kept compatible, otherwise the existing streaming jobs will fail.

fuyufjh avatar Jan 30 '23 09:01 fuyufjh

but what if the test fails?

@huangjw806 We shouldn't release any stable versions with upgradability issues, whether it's a patch release or a major release. If the migration tools are not ready, we should revert the changes that cause the issues.

neverchanje avatar Jan 30 '23 10:01 neverchanje

We often change or remove fields in the plan's protobuf, because those fields are not really suitable at first. Some executors even change their meaning and behavior.

Many design decisions behind Protobuf involve compatibility considerations, and tbh I don't know much about them and need to be educated 🥹. BTW, buf supports breaking-change detection: https://docs.buf.build/breaking/overview

xxchan avatar Jan 30 '23 13:01 xxchan

Drafted an RFC to discuss how to maintain and manage the backward compatibility of the stream plan and SQL features. Needs more discussion: https://github.com/risingwavelabs/rfcs/pull/43

st1page avatar Jan 31 '23 10:01 st1page

#43 also contains valuable discussions about the requirements for rolling upgrade.

xxchan avatar Feb 10 '23 09:02 xxchan