valkey Making configuration item dir immutable

We've had security issues in the past with it, which is why we marked it as PROTECTED. But, modifying during runtime is also a dangerous action. For example, when child processes are running, persistent temp files and log files may have unexpected effects.

Aug 11 '24 10:08 enjoy-binbin

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 70.84%. Comparing base (6df376d) to head (d90000d). Report is 3 commits behind head on unstable.

Additional details and impacted files

@@             Coverage Diff              @@
##           unstable     #887      +/-   ##
============================================
+ Coverage     70.82%   70.84%   +0.01%     
============================================
  Files           118      118              
  Lines         63561    63561              
============================================
+ Hits          45020    45029       +9     
+ Misses        18541    18532       -9

see 13 files with indirect coverage changes

Aug 24 '24 10:08 codecov[bot]

I'd also be okay making this config immutable for Valkey 8.0. We've had security issues in the past with it, which is why we marked it as PROTECTED. It seems fine to throw these temporary errors here as well.

I actually seem to prefer making it immutable now, WDYT? @valkey-io/core-team

Aug 26 '24 15:08 enjoy-binbin

I'd also be okay making this config immutable for Valkey 8.0. We've had security issues in the past with it, which is why we marked it as PROTECTED. It seems fine to throw these temporary errors here as well.

I actually seem to prefer making it immutable now, WDYT? @valkey-io/core-team

I am ok to change this config from MODIFIABLE_CONFIG | PROTECTED_CONFIG | DENY_LOADING_CONFIG to IMMUTABLE_CONFIG | DENY_LOADING_CONFIG

Aug 28 '24 18:08 hwware

I'm okay also making it immutable.

Aug 29 '24 05:08 madolson

+1 on making dir immutable. I can't think of a reason why dir needs to be mutable. Have you guys seen otherwise?

Aug 29 '24 05:08 PingXie

Immutable is OK, as long as it works to set --dir multiple times at startup, in config file and on command line. I know the Fedora (or Debian?) maintainer guy was doing something like that.

Aug 29 '24 10:08 zuiderkwast

Immutable is OK, as long as it works to set --dir multiple times at startup, in config file and on command line. I know the Fedora (or Debian?) maintainer guy was doing something like that.

ok, i tested it manually and it's ok (behavior is the same as unstable) and i also added some tests hope we can cover it. Maybe someone else needs to test it manually to make sure i am not breaking anythings...

Aug 29 '24 10:08 enjoy-binbin

I have encountered scenarios where the dir needed to be modified, such as when the machine has multiple disks mounted in different directories, and the disk where the current directory is located becomes read-only due to a machine failure.

In such cases, users want to start a replica on another machine for migration. However, because the disk is read-only, a full synchronization cannot be performed (as RDB requires disk writes, which was not supported in earlier versions through diskless sync). In this situation, we recovered by modifying the dir to a usable disk.

This may not be a common case, but I am not sure if other users have other scenarios that require modifying the dir.

Aug 29 '24 10:08 soloestoy

Yes, this is indeed one of the uses of dir modification, i suddenly forgot about it. Speaking of read-only disk error, it will cause cluster-config-file saving to fail and cause the node to exit. Like the cluster is doing some cluster stuff, and the node update nodes.conf and fail and exit due to disk error. We once wanted to save nodes.conf somewhere else by changing dir, but in the end we gave up for some reasons (when system disk fails, it is not reliable to hang on the disk) and we are using a module to hack it.

And btw, i would like to add a configuration item to control the killing itself thing, do you think it is necessary? I remember @PingXie seemed to have rejected it (but i think it is a configuration item that can be controlled by the administrator). Its suicide is passive and difficult to control.

void clusterSaveConfigOrDie(int do_fsync) {
    if (clusterSaveConfig(do_fsync) == C_ERR) {
        serverLog(LL_WARNING, "Fatal: can't update cluster config file.");
        exit(1);
    }
}

Aug 29 '24 11:08 enjoy-binbin

In this situation, we recovered by modifying the dir to a usable disk.

Interesting, so maybe it's better to keep it mutable.

Can we prevent changing it when we have a child process running?

Aug 29 '24 11:08 zuiderkwast

I have encountered scenarios where the dir needed to be modified, such as when the machine has multiple disks mounted in different directories, and the disk where the current directory is located becomes read-only due to a machine failure.

In such cases, users want to start a replica on another machine for migration. However, because the disk is read-only, a full synchronization cannot be performed (as RDB requires disk writes, which was not supported in earlier versions through diskless sync). In this situation, we recovered by modifying the dir to a usable disk.

Interesting scenario, indeed. However, as @soloestoy mentioned, it's not a typical use case, and the concerns raised by @enjoy-binbin, such as temporary file leaks and split logs, are not a concern here IMO. I agree that keeping this config mutable makes sense but I see little value or need in hardening this code path. Instead, we should document it in the dir config documentation for admin awareness. Blocking the dir mutation due to active child processes would complicate matters in urgent situations, as child processes can run for extended periods. In such cases, admins would have to identify and terminate the right child processes, which could lead to errors when trying to unblock the disk-based full sync quickly.

Aug 29 '24 21:08 PingXie

Yes, this is indeed one of the uses of dir modification, i suddenly forgot about it. Speaking of read-only disk error, it will cause cluster-config-file saving to fail and cause the node to exit. Like the cluster is doing some cluster stuff, and the node update nodes.conf and fail and exit due to disk error. We once wanted to save nodes.conf somewhere else by changing dir, but in the end we gave up for some reasons (when system disk fails, it is not reliable to hang on the disk) and we are using a module to hack it.

Right, nodes.conf is a single point of failure (SPOF) in the current design/implementation. The reliance of an in-memory server on disks, which typically have a lower mean time to failure (MTTF) compared to RAM, is unfortunate. However, this is a separate concern from the discussion of dir's mutability, so maybe open a new issue for discussion?

Aug 29 '24 21:08 PingXie

And btw, i would like to add a configuration item to control the killing itself thing, do you think it is necessary? I remember @PingXie seemed to have rejected it (but i think it is a configuration item that can be controlled by the administrator).

Sorry @enjoy-binbin I completely lost the context on this one. Do you have a thread?

Aug 29 '24 21:08 PingXie

ok, the link is #635, i take it wrong.

Aug 30 '24 05:08 enjoy-binbin

ok, now there seem to be three options now:

Keep it mutable, add valkey.conf to explain the harm of modifying it
Keep it mutable, block it when there are child processes (the original appearance of this PR)
Keep it immutable (the current appearance of this PR)

vote?

Aug 30 '24 05:08 enjoy-binbin

In such cases, users want to start a replica on another machine for migration. However, because the disk is read-only, a full synchronization cannot be performed (as RDB requires disk writes, which was not supported in earlier versions through diskless sync). In this situation, we recovered by modifying the dir to a usable disk.

We have diskless replication now though, do we still need to worry about this case?

Sep 01 '24 02:09 madolson

We have diskless replication now though, do we still need to worry about this case?

yeah. although we can't say disk-based replication will never be used, diskless has been the default (since 7.0?). so this is even less of a concern IMO. I would vote for option 1.

Sep 01 '24 07:09 PingXie

I look carefully which scenarios this parameter will be used again, there are 3 cases:

rdb filename.
aof filename.
cluster config file.
logging

@soloestoy describe a typical use case, and maybe more cases are coming. And we have no idea how the clients behavior is, thus I think we should keep it mutable, but in valkey.conf we can highlight it is dangerous to change it during runtime.

Thus, I vote 1.

Sep 12 '24 18:09 hwware

In Zhao's example, when the disk is full, it is now possible to replicate the data to another node. If disk-based sync is enabled, they can just change to diskless and replication will work.

But if the disk is full and they just want to use another disk for AOF, RDB and nodes.conf, then mutable is more useful, maybe. Nodes.conf is not written by a fork, so this one is no problem.

If it is risky to change the dir when a fork is running, then I think we can prevent it with option 2. But CONFIG SET failing at random time is also annoying. :)

What is the worst thing that can happen if dir is modified during a fork? We will leave temp files on disk, described in the first commit message of this PR:

If dir is modified during aof rewrite, we will leave the tmp files on the disk. Because the rename operation of the main process in backgroundRewriteDoneHandler will fail.

Leaving temp files is maybe no disaster but it is a little annoying, if it is huge files.

@enjoy-binbin also mentioned logging: the child process will continue logging in the old dir. That seems like no problem to me. Will anything happen to RDB files? No crashes?

If it is only AOF files, then maybe we can do a similar trick as in #886? Allow dir to be changed, but store the dir that was used when the fork started.

Sep 12 '24 20:09 zuiderkwast

@enjoy-binbin also mentioned logging: the child process will continue logging in the old dir. That seems like no problem to me. Will anything happen to RDB files? No crashes?

If it is only AOF files, then maybe we can do a similar trick as in https://github.com/valkey-io/valkey/pull/886? Allow dir to be changed, but store the dir that was used when the fork started.

I remember that the RDB files are ok, there is no crash and the RDB rename thing is ok (it is different than the AOF one). And i do remember it is only affect AOF files, yean i take try to store the dir and see if it can work. The logging one is not a real problem too i think, just a point that may be affected.

Sep 13 '24 02:09 enjoy-binbin