jackrabbit-oak

OAK-10952: improve autogenerated namespace prefixes - test with prototype

Open reschke opened this issue 8 months ago • 9 comments

PoC, hopefully illustrating the idea.

(needs more documentation and support for edge cases, final location in project to be decided)

reschke avatar Apr 17 '25 12:04 reschke

Commit-Check ✔️

github-actions[bot] avatar Apr 17 '25 12:04 github-actions[bot]

  • could the makeUpPrefixTryEvenHarder make a somewhat shorter (perhaps sha-hash) suggestion?
  • generally I think this is useful indeed, to avoid later namespace remorse. How hard would it be to look into support for changing namespaces? (I know it's a challenge.)

stefan-egli avatar Apr 17 '25 13:04 stefan-egli

could the makeUpPrefixTryEvenHarder make a somewhat shorter (perhaps sha-hash) suggestion?

SHA-1 would be 160 bits, thus 20 bytes, right? That's not always shorter than the actual namespace name. I could, however, special-case that.

generally I think this is useful indeed, to avoid later namespace remorse. How hard would it be to look into support for changing namespaces? (I know it's a challenge.)

That's a separate issue. I believe we already found out that it can be done using the namespace registry, although risky. The point of this is to reduce the number of cases where it actually comes up.

reschke avatar Apr 17 '25 13:04 reschke

I was thinking of something like the cca2a4f part of a cca2a4f1dbde1e0f7e337615763ac20a64e39160 git commit

stefan-egli avatar Apr 17 '25 13:04 stefan-egli

I was thinking of something like the cca2a4f part of a cca2a4f1dbde1e0f7e337615763ac20a64e39160 git commit

-> https://github.com/apache/jackrabbit-oak/pull/2237/commits/7024038f2e79c3069164bf864f2d7652e8c860a1
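
The git-style abbreviation discussed here can be sketched roughly as follows. This is a minimal illustration, not the code from the linked commit: the class and method names are hypothetical, and the choice of SHA-1 over the UTF-8 bytes of the namespace name, the "s-" lead-in, and the 7-character cut are assumptions based on this thread.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of Oak.
final class ShortHashPrefix {

    // Returns "s-" followed by the first hexChars hex digits (at most 40)
    // of the SHA-1 digest of the namespace name.
    static String shortHashPrefix(String namespaceName, int hexChars) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            byte[] digest = sha1.digest(namespaceName.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return "s-" + hex.substring(0, hexChars);
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is one of the digests every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Example namespace name chosen for illustration only.
        System.out.println(shortHashPrefix("http://example.com/ns/1.0", 7));
    }
}
```

With 7 hex digits the result is always 9 characters long, but a truncated hash can collide, so a caller would still need to check the candidate against the prefixes already registered.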

reschke avatar Apr 17 '25 15:04 reschke

Re valid prefixes: the JCR spec allows many characters here, but checking exactly might be hard and overkill.

As valid URIs, strictly speaking, are only ASCII, we should filter out all non-ASCII. That would leave us with a short "allow" list of characters.
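
A filter along these lines might look like the sketch below. The class and method names are hypothetical, and the allow list itself (ASCII letters, digits, '-', '_' and '.') is an assumption made for illustration; the list actually used in the PR may differ.

```java
// Hypothetical helper, not part of Oak.
final class PrefixCharFilter {

    // Assumed allow list: ASCII letters, ASCII digits, '-', '_' and '.'.
    static boolean isAllowed(char c) {
        return (c >= 'a' && c <= 'z')
                || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9')
                || c == '-' || c == '_' || c == '.';
    }

    // Drops every character that is not on the allow list,
    // which also removes all non-ASCII characters.
    static String keepAllowed(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (isAllowed(c)) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Example input chosen for illustration only.
        System.out.println(keepAllowed("http://example.com/ns/1.0"));
    }
}
```

An explicit allow list keeps the check simple: anything not named is dropped, so there is no need to enumerate what the JCR spec forbids.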

reschke avatar Apr 17 '25 15:04 reschke

+1, "s-1578eb4" looks nice!

stefan-egli avatar Apr 17 '25 15:04 stefan-egli

I would place this in org.apache.jackrabbit.oak.namepath.impl.GlobalNameMapper and then call it from the corresponding getOrCreateOakName(orNull), which should be used by all JCR methods that potentially create items (with expanded names). This should be imho the only place in Oak that deals with name resolution.

kwin avatar May 07 '25 18:05 kwin