operator icon indicating copy to clipboard operation
operator copied to clipboard

Scenario test fails when charm library object is not assigned to the charm

Open lucabello opened this issue 9 months ago • 1 comments

This scenario test is failing: https://github.com/canonical/opentelemetry-collector-k8s-operator/blob/c48e5314504a635e4b2c80621d80550257219868/tests/unit/test_tls_certificates.py#L110-L133

The charm, as you can see here, instantiates LokiPushApiProvider, but doesn't assign it to the charm.

This makes the test fail, and changing it to charm.provider = LokiPushApiProvider makes the test pass.

Any idea why this happens?

lucabello avatar Jun 27 '25 13:06 lucabello

Is it only the test that fails? I would have expected this to fail with a real Juju too. Handlers need to still exist when the emit happens, and in your case the object will have been deleted at the end of that method. You should get this warning but in the test and with a real Juju.

tonyandrewmeyer avatar Jun 27 '25 23:06 tonyandrewmeyer

but in the test and with a real Juju.

Both in the test and when deployed... I imagine.

dimaqq avatar Jun 30 '25 05:06 dimaqq

To clarify what Tony is saying, when an Object or CharmBase observes an event, the references to the observing method's instance is kept in a WeakValueDict, thus, after charm's init has completed, the Python GC is supposed to drop the reference to the observer.

The observers on dangling objects should be called at run-time (as in when deployed with Juju).

It could be that they are skipped with a warning and your charm's integration tests don't notice.

Or that we've got bug and dangling objects are not dropped.

Or it could be a timing issue... after all, this mechanism is there to prevent GC cycles (I think), and GC cycles are collected much slower than other unreferenced objects.

dimaqq avatar Jun 30 '25 05:06 dimaqq

It doesn't fail with a real Juju, the charm itself works just fine when deployed.

lucabello avatar Jun 30 '25 14:06 lucabello

It seems like the tests wasn't failing on integration simply because the object wasn't garbage-collected in time :( I'm starting to see some failures on a live deployment as well now

lucabello avatar Jul 02 '25 09:07 lucabello

yup definitely GC issue. Objects are meant to be hard-referenced from the base charm type. Skip that, and you're asking for trouble.

I told you so

PietroPasotti avatar Jul 03 '25 06:07 PietroPasotti

Closing since this isn't an issue with Scenario, but a limitation of Ops.

tonyandrewmeyer avatar Jul 03 '25 06:07 tonyandrewmeyer