scons icon indicating copy to clipboard operation
scons copied to clipboard

Don't let deletions leak from OverrideEnvironment

Open mwichmann opened this issue 1 year ago • 8 comments

EDITED

This PR changes two things about OverrideEnvirionment:

  1. The existing __del__ method removed its argument from the override dict, and from the construction variable dict in the underlying environment ("subject"). There's "method to the madness": if the key is subsequently looked up, it would otherwise be supplied from the subject, leading to the weird outcome that you've deleted a variable, yet that variable is still in the environment you're looking at (which you don't know is a "special" OverrideEnvironment). Deleting from the subject makes that not happen, but has the side effect the variable is really deleted from the subject, which could lead to surprise elsewhere. The change is to track deletes from an OE, and use that information to make sure you don't supply a value from the subject in the OE context, but without actually modifying the subject.

  2. Many environment methods directly access/modify self._dict, the dictionary of construction variables. Environments have special methods that make them "work like a dict", by accessing _dict, but it's slightly faster to access __dict directly. In the existing code, if using an OverrideEnvironment instance, such a method will be supplied from the subject environment via proxying (__getattr) - as intended. But now self is an OE, which doesn't have a _dict attribute, so that's revectored through __getattr__ again, which supplies the attribute from the subject environment. Thus, these methods will end up taking the value from, or modifying the value in, the subject environment's _dict, even if there was already an entry in the override dict for it which should be used. The change makes OverrideEnvironment pretend it has a _dict, such that accesses go through OE's existing __getitem__ and __setitem__ methods and thus consider the override dict first.

Also in the rpm packaging tool: call Override factory method rather than directly instantiating OverrideEnvironment ("use best practices")

Fixes #4563

Contributor Checklist:

  • [X] I have created a new test or updated the unit tests to cover the new/changed functionality.
  • [X] I have updated CHANGES.txt (and read the README.rst)
  • [X] I have updated the appropriate documentation

mwichmann avatar Jul 15 '24 19:07 mwichmann

Would it be possible to prevent this by only making changes in OverrideEnvironment()? That way there'd be zero performance hit for non OverrideEnvironment() usage, and potentially none for it's parent Environment Objects envvar access?

bdbaddog avatar Jul 17 '24 22:07 bdbaddog

Not sure I'm seeing an issue? The track-deletes behavior is only in OverrideEnvironment.

mwichmann avatar Jul 18 '24 12:07 mwichmann

Not sure I'm seeing an issue? The track-deletes behavior is only in OverrideEnvironment.

Anywhere you've replaced self._dict with self['xyz]. Is a change to regular environment and potentially a performance impact for non OverrideEnvironment()

bdbaddog avatar Jul 18 '24 18:07 bdbaddog

I did most of those as a separate commit so it's easy to roll back. But... for self['xyz'], if it's a regular env, then it uses SubstitutionEnvironment's __getitem__ and __setitem__, which are simple, like:

    def __getitem__(self, key): 
        return self._dict[key]

and if it's an OverrideEnvironment it uses the one defined there that handles the multiple sources (and now deletions). So it doesn't seem to be adding anything beyond a very simple function call.

mwichmann avatar Jul 18 '24 18:07 mwichmann

I did most of those as a separate commit so it's easy to roll back. But... for self['xyz'], if it's a regular env, then it uses SubstitutionEnvironment's __getitem__ and __setitem__, which are simple, like:

    def __getitem__(self, key): 
        return self._dict[key]

and if it's an OverrideEnvironment it uses the one defined there that handles the multiple sources (and now deletions). So it doesn't seem to be adding anything beyond a very simple function call.

I'm thinking that any functions which were falling through to the Environment() class which should be checking the overrides should be implemented in Override() and not use the parent's version so it doesn't impact that class for the benefit of it's child class?

bdbaddog avatar Jul 18 '24 18:07 bdbaddog

there's potentially a lot (consider all the appends and path fiddlers and so on), it pretty much defeats the proxy idea. But... once it turned out I had to "fake" OverrideEnvironment having a _dict anyway, the old way will work, though it violates the principle that a method should work whether it runs in the context of the class it's defined in, or in the context of some derived class (or proxy) - and if it doesn't, the derived/proxy needs to implement its own.

mwichmann avatar Jul 18 '24 18:07 mwichmann

Updated - please glance at the edited summary at the top, maybe more clear now. Had to force-push due to branch merging done since the original push getting things out of skew.

mwichmann avatar Jul 24 '24 17:07 mwichmann

Updated - please glance at the edited summary at the top, maybe more clear now. Had to force-push due to branch merging done since the original push getting things out of skew.

Reviewed. This approach sounds great. I'll take a pass through the code.

I'm try to be extra cautious with this area of code as changes can have significant and (sometimes) unexpected impact.

bdbaddog avatar Jul 25 '24 03:07 bdbaddog

Squahsed and rebased on 4.8.1 as a base.

mwichmann avatar Sep 04 '24 22:09 mwichmann