confuse icon indicating copy to clipboard operation
confuse copied to clipboard

Dumping changes to a config.yaml file places changed lines at the beginning

Open bjtho08 opened this issue 4 years ago • 7 comments

As the title says, when I make a change to some of the config variables and dump it to the config.yaml file, the edited variables are now inserted at the very top of the file rather than the position they used to be in. This is especially frustrating when the config file is "supposed" to start with a few lines of comments indicating authorship and explanations for new users.

OS and versions

OS: Ubuntu 18.04 Python: 3.8.2 64-bit confuse: 1.1.0

Steps to reproduce

  1. Create a config.yaml with some leading comments and some parameters on the following lines
  2. load a config.yaml file
  3. read a parameter from the file
  4. change the parameter
  5. dump the edited configuration to config.yaml
  6. open config.yaml in a text editor

Example config.yaml

############################################
#          AAA_image_segmentation          #
#               Config file                #
#          Author: Bjarne Thorsted         #
#      e-mail: bt*******@*************     #
############################################

appName: AAAml
# Usage instructions:
...
active_labels: [2, 3, 4, 5]

Example python script

import os
import confuse

os.environ['AAAMLDIR'] = os.curdir

config = confuse.Configuration("AAAml")
config.sources[0].default = True

active_labels = config["active_labels"].get()
config["active_labels"].set([2, 3])

with open("config copy.yaml", mode="w") as cfg_file:
    cfg_file.write(config.dump())

with open("config copy.yaml", mode="r") as cfg_file:
    print(cfg_file.read())

console output:

active_labels: [2, 3]
############################################
#          AAA_image_segmentation          #
#               Config file                #
#          Author: Bjarne Thorsted         #
#      e-mail: bt*******@*************     #
############################################

appName: AAAml
# Usage instructions:
...

bjtho08 avatar Jun 26 '20 07:06 bjtho08

Interesting use case! Indeed, this will not work yet. To clarify, our current round-tripping of comments is quite hacky, but it could plausibly be greatly improved by switching to ruamel.yaml (cf #52). Regardless, however, the scheme works by trying to preserve the formatting of a given source, whereas this use case is about merging multiple sources while preserving the formatting of one of them. It certainly sounds possible, but it will take some creativity to figure out exactly how to do that in a predictable way!

sampsyo avatar Jun 26 '20 11:06 sampsyo

Thank you for the reply. I think you may have misunderstood my use case (or I have not used confuse the way I think I have). I am trying to load a config.yaml, update one of the values in said file and then write the changes back to the same file. In the code above, I use "config copy.yaml" to make sure I don't ruin the original config file. I suppose I could just have used the copy as backup and revert if something got screwed up, but this way seemed like an easier approach.

bjtho08 avatar Jun 28 '20 12:06 bjtho08

Indeed—that's the tricky thing here. When you do config["active_labels"].set([2, 3]), you are actually not modifying any configuration data structure in place! Confuse works by "layering" together multiple configurations and then providing you a "merged" view on them. So .set() is non-destructive, in the sense that it is not modifying the original configuration—it is just creating a new configuration that has those new values in it, as an overlay on top of the old configuration.

That's why this is harder than it looks for Confuse to do: it needs to merge configuration sources together, even when they do not align perfectly, when you do dump.

sampsyo avatar Jun 28 '20 14:06 sampsyo

Well, now I am confused (pardon the pun). What is the purpose of preserving comment lines during dump(), if I can only write an unmodified version of the default config file? And am I correct in gathering from your response that as of version 1.3.0 I cannot modify a config file and dump() while preserving comments?

bjtho08 avatar Jun 28 '20 16:06 bjtho08

Well, here's the thing: you can actually dump a "changed" version of the config file, but the way that "change" works is pretty subtle. You're not actually modifying one data structure in place, as d[k] = v typically does in Python. Instead, Confuse works by maintaining multiple layers and merging them together. So when you dump a "changed" config, you do see the updated data—but because Confuse is preserving the illusion that data was actually modified, whereas it has actually kept the original data and the updates separate.

I think we may be getting a little side-tracked though. The main thing is that preserving the order of things when dumping, even after these "changes," would be great but it is more complicated than it sounds. I'd be interested to see if we can do it anyway but it will take some effort.

sampsyo avatar Jun 28 '20 19:06 sampsyo

As you mentioned earlier, ruamel.yaml has some features that should make it easier. Since, I haven't yet had a look at any of this, maybe you can tell me how far away you are from implementing something similar?

Also, I've read about bespON, which is somewhat similar to YAML and ships with complete round-tripping. Maybe there are some insights to be gleaned from the source code?

bjtho08 avatar Jun 29 '20 09:06 bjtho08

I don't currently have plans to implement it, so any help would be greatly appreciated!

However, part of the point of the discussion above was to emphasize plain round-tripping from the serialization library, while helpful, is not enough. There are also Confuse-specific issues related to the "layering" concept that would need to be resolved.

sampsyo avatar Jun 29 '20 12:06 sampsyo