jackrabbit-oak icon indicating copy to clipboard operation
jackrabbit-oak copied to clipboard

OAK-10544: oak-jcr: remapping a namespace prefix leaves namespace resolver in broken state

Open reschke opened this issue 2 years ago • 14 comments
trafficstars

This is a minimal fix that actually works in that it fixes the bug, and introduces no regressions in existing tests. Whether it's the right place I'm not sure yet.

reschke avatar Nov 08 '23 18:11 reschke

SonarCloud Quality Gate failed.    Quality Gate failed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell D 1 Code Smell

0.0% 0.0% Coverage
0.0% 0.0% Duplication

warning The version of Java (11.0.21) you have used to run this analysis is deprecated and we will stop accepting it soon. Please update to at least Java 17. Read more here

idea Catch issues before they fail your Quality Gate with our IDE extension sonarlint SonarLint

sonarqubecloud[bot] avatar Nov 10 '23 08:11 sonarqubecloud[bot]

@mreutegg , @anchela - this needs reviews.

reschke avatar Jan 16 '24 18:01 reschke

Reviewers: please see https://issues.apache.org/jira/browse/OAK-10544?focusedCommentId=17807761&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17807761

reschke avatar Jan 17 '24 15:01 reschke

It might not introduce regressions in existing test, but I think the change is very problematic.

I see that the JCR spec says this should be the "correct" behavior, but I think we shouldn't follow the spec here in Oak. Instead, in my view the caller should not use this weird edge case. Instead, he should explicitly set the local mapping.

The reason is, you have changed "getJcrName" (a getter method that looks completely innocent) to modify state of the session. The "getJcrName" is used all over the place. This can lead to concurrency problems if the map is not a concurrent map, and other problems. For example, QueryEngine uses NO_BINDINGS = emptyMap() -- which is immutable. So if you execute a query with a non-mapped URI, then the query would fail with a strange exception (right now it doesn't).

thomasmueller avatar Jan 18 '24 08:01 thomasmueller

I understand the concern (and yes, we need tests in oak-core as well).

That said, I don't think that not fixing this is a real option. This works in Jackrabbit classic and causes real problems in practice. The "caller" is just trying to import a content packageand has zero information about what causes the NS prefix clash.

reschke avatar Jan 18 '24 08:01 reschke

FWIW, "getJcrName()" can not possibly work in this case unless it can add a mapping. This needs to be either permantent (repository registry) or temporarily (session mappings). Of these choices, the second looks less intrusive to me.

reschke avatar Jan 18 '24 09:01 reschke

Note that this restores a behavior we had in Oak as well; see https://github.com/apache/jackrabbit-oak/commit/250bcb5ba4e0e8841d7c4d0378ba7ff85f10ded3 (10 years ago).

reschke avatar Jan 18 '24 09:01 reschke

So if you execute a query with a non-mapped URI, then the query would fail with a strange exception (right now it doesn't).

Well. The choices are:

  1. Produce a broken name that will not work (current behavior).
  2. Produce a broken name that will not work and potentially log that a s problem (WARN?)
  3. Produce a name that does work by registering the prefix in the session.

My preference is a proper fix; I understand your concerns and will have a closer look, also adding tests closer to the code that we want do modify.

reschke avatar Jan 18 '24 10:01 reschke

https://github.com/apache/jackrabbit-oak/pull/1199/commits/bf485dfdf224554b51d8cfe50e067238b9e70fde addresses the "immutable" use case (doc, catch, test coverage).

Regarding the concurrency concern: sessions are not thread-safe, so this should not be critical.

The current PR addresses the user-visible problem with minimal surgery. We may want to discuss making this more pretty (for instance, by providing the prefix setter call back as additional argument), but my preference would be to do that in a separate issue.

reschke avatar Jan 18 '24 13:01 reschke

Basically, the risk is that if we change this area, then existing applications (that rely on the current behavior for some reason) might be broken.

thomasmueller avatar Jan 18 '24 13:01 thomasmueller

Currently we loop until we find a "free" prefix (ax2, ax3, ax4,...).

An alternative solution might be: instead of looping, we could calculate the hash code of the URI (e.g. ax_489acc8493f073ab). That would make prefixes very large, but we would always get the same prefix.

thomasmueller avatar Jan 18 '24 13:01 thomasmueller

Basically, the risk is that if we change this area, then existing applications (that rely on the current behavior for some reason) might be broken.

They would rely on the result of "getJcrName()" to be broken :-). But yes, I see your point. The current PR is minimal in that the API signature is unchanged, but modified in the sense of "if you supply a modifiable map, we may actually modify it - your choice". But of course we can instead extend the API to ensure that only the one place we want to affect right now will be affected.

An alternative solution might be: instead of looping, we could calculate the hash code of the URI (e.g. ax_489acc8493f073ab). That would make prefixes very large, but we would always get the same prefix.

More predictable prefixes would be good. Consistency could also be achieved by exposing the prefix minting code in oak-core, so that oak-jcr gets the same prefix. At the end of the day, we always will have coupling; the difference is only whether it's explicit or implicit.

FWIW: minting prefixes could also be made smarter by special-casing widely-used namespace names, such as in use in XMP/EXIF etc.

reschke avatar Jan 18 '24 13:01 reschke

@reschke May I also suggest a feature flag to enable the behavior? Maybe it only ever gets enabled in one environment, but that may be enough to reduce the pressure to broaden the scope of the change.

adamcin avatar Jan 18 '24 15:01 adamcin

@mreutegg - that's an interesting suggestion; I'll try this with the test in oak-jcr and the proposed tests in vlt

UPDATE: indeed it does; and this fix doesn't cross core/jcr boundaries; will update the PR.

reschke avatar Jan 23 '24 15:01 reschke