augeas icon indicating copy to clipboard operation
augeas copied to clipboard

augeas modify too much lines

Open jreidinger opened this issue 7 years ago • 18 comments

Hi, in short problem is when I remove one comment from tree and save file, it modify content over whole file. I am mainly asking for help where to start with debugging as it is a bit usage stopper for augeas for us.

How I can reliable reproduce it:

jreidinger@linux-vvcf:~/prace/yast/packager/test/data/zypp> augtool -t 'Puppet incl /home/jreidinger/prace/yast/packager/test/data/zypp/zypp.conf'
augtool> defvar zypp /files/home/jreidinger/prace/yast/packager/test/data/zypp/zypp.conf
augtool> rm $zypp/main/#comment[47]
rm : $zypp/main/#comment[47] 1
augtool> save
Saved 1 file(s)
augtool> quit

and resulting diff is big, but in general lot of unnecessary spaces like

 ##
 ## Valid values: A directory
 ## Default value: /etc/zypp
-##
-# configdir = /etc/zypp
+#configdir = /etc/zypp
 
-##
+# #
 ## Path where the known repositories .repo files are kept
 ##

and I do not see any pattern in it. sometimes ## is changed to # # but sometimes not.

I welcome any hint how I can debug what causing it.

jreidinger avatar Mar 07 '17 09:03 jreidinger

Could you gist or upload the original file somewhere, so we can reproduce it? Please also include the version of Augeas you're using.

domcleal avatar Mar 07 '17 09:03 domcleal

yes, original file is at https://gist.github.com/jreidinger/59d5d2a471ff83a94cab409db2cbe184

version of augeas I am using is

jreidinger@linux-vvcf:~/prace/yast/packager/test/data/zypp> rpm -qa augeas augeas-1.2.0-8.2.x86_64

it is opensuse 42.2 version. I can check if there is any patch on top of it.

jreidinger avatar Mar 07 '17 10:03 jreidinger

I check opensuse patches and it is in non-relevant lenses and only one that modify augeas source code is security fix for escaping values which should not relevant to comment removal https://build.opensuse.org/package/view_file/openSUSE:Leap:42.2/augeas/bnc925225-aug_escape.patch?expand=1

jreidinger avatar Mar 07 '17 10:03 jreidinger

one idea, can it be caused by fact, that it modify #comment[] and maybe it mark all elements of this collection as dirty and it is modified? still it do not explain why some lines are modified and some not.

jreidinger avatar Mar 07 '17 10:03 jreidinger

another observation I have from my testing.

If I remove comment from different section that this do not happen. And only affected comments are after removed comment, so if I remove comment last comment, it works as expected. If I remove comment close to end, only last few comments are affected.

jreidinger avatar Mar 08 '17 08:03 jreidinger

Thanks for the sample file, I can reproduce the issue on master too. I would guess this is a bug in the Augeas library rather than the IniFile lens.

Slightly simplified test using IniFile: https://github.com/domcleal/augeas/commit/99ad03e0e44d7bacf50be1ee3fd66ddd2c3072a4

domcleal avatar Mar 08 '17 09:03 domcleal

I think what is happening here is that because a comment 'in the middle' is removed, all other comments slide up one in their formatting, i.e., when #comment[3] is printed, it uses the spacing/formatting that #comment[2] used to have etc.

That's the sort of surprise that PR #244 tries to address.

lutter avatar Mar 15 '17 01:03 lutter

@lutter interesting pull request, can I somehow help with it? like testing?

jreidinger avatar Mar 15 '17 08:03 jreidinger

@lutter it seems so. Analogously, when I add a comment in the middle, it uses the formatting of the comment that got pushed away.

mvidner avatar Mar 15 '17 12:03 mvidner

@lutter @mvidner this is definitely that, and the effects are expected this way. Using seq entries is one way to fix this (but comes with other inconveniences).

raphink avatar Mar 15 '17 13:03 raphink

@jreidinger if you want to play with that PR a bit that would be most appreciated. The big question in my mind is whether we should just change the behavior of [ .. ] to what < .. > in that PR does, or whether we should have both and change lenses over time as it makes sense. Basically: what is the impact of changing [ .. ] and how much does it break existing expectations ?

lutter avatar Mar 16 '17 23:03 lutter

@lutter I looked into the code and I cannot understand one thing, can you explain it to me?

If I got it right, on loading lns_get() is called which parses configs and builds a tree. After that all modifications are applied to this tree. When saving, lns_parse() is called, which parses configs again and builds a skel which is inconsistent with the modified tree. Then data from skel is used for writing modified configs.

What is the purpose of such double parsing which leads to inconsistency and, as a consequence, to this bug?

mikhirev avatar Feb 22 '18 11:02 mikhirev

@mikhirev the surprise doesn't come from the fact that the file is parsed twice, but from how entries in the tree are associated with the stuff that gets ignored during lns_get. The skeletons that lns_parse produces are basically templates into which stuff from the tree is mapped. If a lens produces many tree entries with the same label, those templates are used one after the other, in the same order in which they were found in the original file. That behavior is at the heart of what you are seeing here.

The advantage of this behavior is that you can delete a whole subtree, and if you recreate it exactly the same way it was before, you will also get the same output; in that sense, Augeas is idempotent, which is incredibly useful. The downside is that you have this 'shifting up' behavior that is discussed in this bug.

Therefore, this all is a tradeoff between two solutions, neither of which is ideal: either keep Augeas' idempotency or avoid shifting up. I'd love to come up with something that addresses both, but that's turned out to be pretty tricky.

lutter avatar Mar 05 '18 22:03 lutter

@lutter for me possible solution can be similar to what I did in my library on top of augeas. When something gets deleted I keep stub of such node and when something new appear, it replace that stub. But if nothing appear then in the end stub is really deleted, so in this case can remove line with it. This way it will be idempotent and also without shifting.

JFYI this is how I do writting of modified abstracted tree using augeas https://github.com/config-files-api/config_files_api/blob/master/lib/cfa/augeas_parser/writer.rb#L181

jreidinger avatar Mar 06 '18 07:03 jreidinger

@lutter this sounds reasonable. Unfortunately in my particular case minimal changes to files are more important than idempotency (I'm not going to delete entries and recreate them). The solution suggested by @jreidinger seems good: mark tree nodes as deleted instead deleting them immediately.

mikhirev avatar Mar 06 '18 10:03 mikhirev

@jreidinger Generalizing your approach to work with the full Augeas API would be really hard: in general, users can mix querying the tree and modifying it freely, so that if we keep deleted entries around until 'save', we'd need a lot of hairy bookkeeping to suppress those softdeleted entries when we return results of aug_match, make sure they don't get counted in constructs like foo[3], make sure they don't cause errors if the softdeleted tree is malformed, etc.

I am also not sure how this would help in the example that @domcleal posted above.

I am wondering if another approach would work: use the behavior from PR #244 for [ .. ] and let users switch between the current behavior and the new behavior with a flag - either another flag for aug_init or by setting a special node under /augeas.

lutter avatar Mar 08 '18 18:03 lutter

I just updated the PR #244 so that if you set the environment variable AUGEAS_NO_SHIFT to anything, you get behavior that should be much less prone to 'shifting up' (at the cost of idempotency)

This is good enough to make the example that @domcleal added to this issue pass (also now part of the PR) I would love to get some more feedback on whether this addresses your issues, @jreidinger and @mikhirev. If things work out, we can add a better way to enable this behavior than through an environment variable.

I am not entirely sold that this is the right way to go about it, as it has repercussions for tests etc. and before this could be merged we'd definitely need more work on integrating this better with the rest of Augeas. But all these things are only a concern if the new behavior actually addresses the problem here and doesn't cause new problems.

lutter avatar Mar 08 '18 19:03 lutter

Sorry for the long delay.

Code from PR #244 performs as desired when AUGEAS_NO_SHIFT is defined. But it would be more convenient to enable this feature via aug_flags, not (only) environment variable.

mikhirev avatar May 23 '18 15:05 mikhirev