operator It's easy to access stale relation data in the testing harness

Let's say that a charm author does the following:

Calls rel = harness.model.get_relation(...)
Calls harness.add_relation_unit, using rel.id as a parameter.
Inspects the contents of rel.units

This is a reasonable flow. The charm author wants to add a remote unit to the test, which requires knowing the relation id. To get the relation id, they do the Right Thing and call model.get_relation. This gives them a relation object which they expect to be able to re-use.

But this doesn't work. The Relation object that get_relation returns is not updated with the new unit data. The charm author must call rel = harness.model.get_relation(...) a second time, fetching a new Relation object, which now contains the expected units.

To reproduce, add the following code to operator/test/test_testing.py::TestHarness, and run tox -e unit test/test_testing.py::TestHarness::test_stale_relation_data:

    def test_stale_relation_data(self):
        harness = Harness(CharmBase, meta='''
            name: test-app
            requires:
                db:
                    interface: pgsql
            ''')
        self.addCleanup(harness.cleanup)
        harness.add_relation('db', 'postgresql')        
        harness.begin()        

        rel = harness.model.get_relation('db')
        harness.add_relation_unit(rel.id, 'postgresql/0')
        self.assertTrue(len(rel.units) > 0)

This code will fail. However, if you insert a second call to get_relation between the last two lines of code, the test will pass:

    def test_stale_relation_data(self):
        ...
        rel = harness.model.get_relation('db')
        harness.add_relation_unit(rel.id, 'postgresql/0')
        rel = harness.model.get_relation('db')        
        self.assertTrue(len(rel.units) > 0)

Jul 15 '22 15:07 pengale

The root cause is probably that the indirection in the Relation and RelationMapping classes means that we're not actually updating the original Relation object when we call add_relation_unit. In production, this behavior doesn't create problems. All objects are instantiated from data loaded into the framework at hook start. It should not be possible to modify this data. New units can only appear in a relation object if they were there at hook invocation.

For testing purposes, this is a headache, and I'm not sure how to fix it. Passing in the relation object feels messy, and refactoring the production code to be more careful about referring to original objects feels fraught and unnecessary.

Jul 15 '22 15:07 pengale

The solution - should we choose to address this - probably looks like converting the "units" attribute of the Relation class to be a property. And then we have the units method re-consult the backend to make sure it isn't stale before returning the current units. Obviously this could be an improvement to testing. But would it be okay to do this in production too - to reconsult relation-list every time someone looks at relation units? or would we want to keep a more "stable" view of that?

Jul 15 '22 15:07 rwcarlsen

@pengale - it looks like there is only one actual relation object, but its units list is only updated during construction. So any changes that affect the backend don't show.

Jul 15 '22 15:07 rwcarlsen

@rwcarlsen thanks for the clarification. I I like the idea of solving this as a property. We could even make it a property only on the testing model backend (I think), so that we didn't interfere with production at all.

I agree that a quick check of the cached data would do no harm, though Probably. :-)

Jul 15 '22 15:07 pengale

I can see where this causes confusion, however it's sort of expected: when you call harness.add_relation_unit it should add more units, and also triggers relation_joined, so you probably shouldn't use get_relation like that and expect it to update. As mentioned, in production you'll always get a new events and new view of the world.

In this case the proper way to fix is to use the relation ID returned by harness.add_relation instead of using model.get_relation at all. For example, tweaking the example above:

        rel_id = harness.add_relation('db', 'postgresql')        
        harness.begin()        

        harness.add_relation_unit(rel_id, 'postgresql/0')
        self.assertTrue(len(rel.units) > 0)

Oct 04 '23 03:10 benhoyt