cms icon indicating copy to clipboard operation
cms copied to clipboard

Change configuration format from JSON to TOML

Open magula opened this issue 2 years ago • 6 comments

In light of #119, this would change the format of the cms.conf and cms.ranking.conf configuration files from JSON to TOML. Apart from the obvious, the changes include:

  • Transforms the _section headers of the previous JSON format into TOML sections. The TOML sections of cms.conf correspond to predefined dataclasses in Python, i.e. there is an object for each TOML section that contains its attributes. E.g., keep_sandbox can be accessed as config.worker.keep_sandbox, while previously it would have just been config.keep_sandbox. Some variable names also change: config.admin_cookie_duration becomes config.aws.cookie_duration, and config.cookie_duration becomes config.cws.cookie_duration. This should structure the attributes better and provide more consistent key names.

  • The structure of cms.ranking.conf is left untouched, the file is just translated to TOML.

  • Includes a check if the config file that would be used is in JSON format. If so, the user is warned about the change and advised to tranform the config into the new format. He is also presented with an attempt at translating the file. This works by injecting values taken from the JSON config into a Jinja template of the bare TOML config. If an unexpected key is present in the JSON config, the key-value pair is put in a section called stray. When loading a TOML config, everything in stray is stored as a field of Config, so it can still be accessed as config.[keyname]. The automatic translation works only heuristically. The attributes are rendered in the Jinja template using tojson, so there is no guarantee the outcome will adhere to the TOML specification. One could use a TOML writer instead, but that would add another package dependency.

Note that these changes have not been tested thoroughly.

Any feedback is very welcome!

magula avatar Feb 01 '23 21:02 magula

Codecov Report

Merging #1224 (a460387) into master (f541efc) will decrease coverage by 0.20%. The diff coverage is 67.46%.

@@            Coverage Diff             @@
##           master    #1224      +/-   ##
==========================================
- Coverage   63.59%   63.40%   -0.20%     
==========================================
  Files         233      233              
  Lines       17151    17189      +38     
==========================================
- Hits        10907    10898       -9     
- Misses       6244     6291      +47     
Flag Coverage Δ
functionaltests 47.29% <67.46%> (-0.07%) :arrow_down:
unittests 43.96% <61.11%> (+0.02%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
cms/db/util.py 55.55% <ø> (+2.22%) :arrow_up:
cms/io/service.py 72.78% <ø> (-0.64%) :arrow_down:
cms/server/contest/server.py 86.36% <ø> (ø)
cms/conf.py 73.52% <66.66%> (-12.08%) :arrow_down:
cms/io/web_service.py 91.48% <66.66%> (+4.25%) :arrow_up:
cmsranking/Config.py 72.61% <80.00%> (-1.20%) :arrow_down:
cms/server/contest/authentication.py 100.00% <100.00%> (ø)
cms/db/base.py 85.14% <0.00%> (-3.97%) :arrow_down:
cms/service/workerpool.py 63.88% <0.00%> (-2.78%) :arrow_down:
cms/server/admin/handlers/dataset.py 29.76% <0.00%> (-2.61%) :arrow_down:
... and 16 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

codecov[bot] avatar Feb 01 '23 21:02 codecov[bot]

Yes, not grouping the keys into sections, thus changing way less code is definitely safer, and preserving backwards-compatibility is also a big plus. I am indecisive on this, but I think I will change it back then. :)

Would you mind if we still renamed

  • cookie_duration -> contest_cookie_duration and
  • num_proxies_used -> contest_num_proxies_used,

but to preserve compatibility used the old names as aliases when loading the config? These changes should be small enough to reasonably test manually. Also, I would like to rename the "section comment" ScoringService to ProxyService, which seems more fitting. ScoringService only uses one of these attributes to find out whether it is worth to connect to ProxyService because the ranking is enabled.

Should we keep

  • the attributes' type annotations, and
  • the check in the config loader for each attribute key if that config attribute exists (up until now, a typo in a key would have gone unnoticed and the default value been used inadvertently)?

Naming the config file cms.toml also seems like a good idea, I will do that. I had only refrained from it to not change even more of the code. :smile:

PS: I am not sure I would be happy about moving much of the configuration to the database, actually. We tend to use the same basic configuration for new setups, and then only adapt where it is necessary. This is easily done with a configuration file I can just change and copy to new servers (in fact I have just started keeping these in a git repo). Also, even with our regular setup for our selection process, we usually initialize a new database once a year. If these things became a part of the database and the way to change them would be via the admin interface, we would have to write down what goes in there and then manually add it, or write a script to load the configuration into the database, which seems overly complicated. So my concerns are similar, I think, to what @lw said in #119.

magula avatar Feb 03 '23 15:02 magula

Would you mind if we still renamed

  • cookie_duration -> contest_cookie_duration and
  • num_proxies_used -> contest_num_proxies_used,

but to preserve compatibility used the old names as aliases when loading the config? These changes should be small enough to reasonably test manually.

Yeah that sounds reasonable (I assume these aliases would work both in the legacy JSON file and the new TOML one?)

Should we keep

  • the attributes' type annotations, and
  • the check in the config loader for each attribute key if that config attribute exists (up until now, a typo in a key would have gone unnoticed and the default value been used inadvertently)?

The type annotations are a big plus in my opinion 😄

As for the second point, one concern I have is that (if I remember correctly) sometimes for a contest we would remove keys from the configuration file since anyway they were set to the default value, thus creating a "minimal" configuration file that only ovverrides what is strictly necessary. Would this change break such "minimal" configuration files?

PS: I am not sure I would be happy about moving much of the configuration to the database, actually. We tend to use the same basic configuration for new setups, and then only adapt where it is necessary.

That's a great input, I will keep this use-case in mind. I think for now it's fine to continue using a configuration file, and if we move to DB-configuration then we should look into having a file representation anyway to support this use-case (e.g. exporting the settings to a toml file, and being able to restore them)

wil93 avatar Feb 04 '23 19:02 wil93

Yeah that sounds reasonable (I assume these aliases would work both in the legacy JSON file and the new TOML one?)

Yes, they will work in both.

As for the second point, one concern I have is that (if I remember correctly) sometimes for a contest we would remove keys from the configuration file since anyway they were set to the default value, thus creating a "minimal" configuration file that only ovverrides what is strictly necessary. Would this change break such "minimal" configuration files?

It would not enforce keys to be present in the configuration file, so such a minimal configuration file will still work just the same. It just warns about config keys that it does not know about, instead of loading them silently. (A warning on missing keys is also included right now, but I will remove that again when I revert the changes of the config structure.)

That's a great input, I will keep this use-case in mind. I think for now it's fine to continue using a configuration file, and if we move to DB-configuration then we should look into having a file representation anyway to support this use-case (e.g. exporting the settings to a toml file, and being able to restore them)

Sounds good, thanks!

magula avatar Feb 04 '23 20:02 magula

I removed the section headers (or table headers, as TOML calls it) in the last commit, so the structure is as before.

However, there seems to be no (legible) way to structure core_services and other_services if not as tables, because inline tables can not contain newlines. This is a problem because the top-level table ends with the first table, so nothing after that could be top-level again. Thus I moved these to the end of the TOML file in the last commit. They are now the only parts contained in a table, and they have to be at the end of the file, which seems unelegant. Do you see a better way to do this?

magula avatar Feb 09 '23 19:02 magula

Now that we have pytest as test runner, adding tests should be easier so it would be nice to add a few tests, for example:

  • a test to make sure that the values from the TOML / JSON file indeed are reflected in the config object
  • a test to verify the logic of choosing TOML over JSON when available, and falling back to JSON when not available

Let me know if you would be down to add these, if not I can try to spend some time on it when I get free.

wil93 avatar Jan 18 '24 23:01 wil93