ceph-nvmeof icon indicating copy to clipboard operation
ceph-nvmeof copied to clipboard

Enforce eager validation of configuration parameters

Open epuertat opened this issue 1 year ago • 0 comments

Right now config parameters (read from ceph-nvmeof.conf or from whatever indicated via the -c <config_file> with the Python configparser standard library) are validated whenever accessed. The potential validation issues are:

  1. The parameter is not defined. This is partially solved by 2 approaches:
    • Define defaults in the ceph-nvmeof.conf template. However, this complicates code maintenance, since configuration is partially detached from the code.
    • Define defaults at access time. This leads to potential inconsistencies between config file defaults.
  2. The parameter type is not correct (e.g.: rpc_socket is a number). configparser doesn't enforce any type of schema or static typing, hence all type checking is dynamic.
  3. The parameter is semantically correct (e.g.: rpc_socket is an invalid file).

This leads to a lazy validation approach, which is problematic, as once the gateway is initialized there's no way to fix configuration problems (the config file is read once at initialization phase).

This should be generally aligned with a 'fail-fast' approach refactoring. That is, fail ASAP if:

  • config parameters are invalid,
  • Ceph auth permissions are incorrect (right now there are situations where the gateway starts without proper permissions and only fails when it tries to write something to the RADOS state object).

Suggestions to fix this:

  • Enforce eager validation:
    • This has already been tried with home-brew approach: https://github.com/ceph/ceph-nvmeof/pull/126.
    • Using existing libraries, like configargparse, which leverages argparse syntax to enforce preemptive validation.
  • Define a schema for the config file:
    • Given configparse INI files can be easily converted into dicts, and dicts into JSON, we could easily use JSON Schema to perform a validation. However that only covers validation issues #1 and #2, and JSON Schema is not the friendliest thing to work with (and doesn't enforce configuration-as-code pattern either).

Related

  • https://github.com/ceph/ceph-nvmeof/issues/49
  • https://github.com/ceph/ceph-nvmeof/issues/96
  • https://github.com/ceph/ceph-nvmeof/pull/126

epuertat avatar Aug 22 '23 09:08 epuertat