eclipse.platform.ui
eclipse.platform.ui copied to clipboard
[WIP] [E4 Xpath] Replace apache.commons.jxpath by javax.xml.xpath
As said in https://github.com/eclipse-platform/eclipse.platform.ui/issues/423#issuecomment-2355147118 this is the current state of my stalled work to migrate E4 Xpath off the old and unmaintained apache.commons.jxpath library.
The basic idea is to provide a org.w3c.dom.Element view/wrapper for an EObject so that an javax.xml.xpath.Xpath can operate on it.
As mentioned this is heavily work in progres, not yet functional and a lot has to be cleaned up before this can be used (it contains a lot of try out code).
@ptziegler if you or anybody else like to take this over and complete it please feel free. I also would find this an interesting topic, but I have currently no time to work on this. But if you don't have time either, I might continue this by myself in the future.
I have also extracted some minor improvements that can be applied now already in https://github.com/eclipse-platform/eclipse.platform.ui/pull/2289.
Test Results
1 818 files - 3 1 818 suites - 3 1h 35m 36s ⏱️ + 7m 43s 7 719 tests - 4 7 491 ✅ - 4 228 💤 ±0 0 ❌ ±0 24 318 runs - 12 23 569 ✅ - 12 749 💤 ±0 0 ❌ ±0
Results for commit cc7ae8a6. ± Comparison against base commit c9f129d3.
This pull request removes 4 tests.
org.eclipse.e4.emf.xpath.test.EDynamicPropertyHandlerTest ‑ test_getProperty
org.eclipse.e4.emf.xpath.test.EDynamicPropertyHandlerTest ‑ test_getPropertyNames
org.eclipse.e4.emf.xpath.test.EDynamicPropertyHandlerTest ‑ test_getProperty_When_UnknownProperty_Expect_NullResult
org.eclipse.e4.emf.xpath.test.EDynamicPropertyHandlerTest ‑ test_setProperty
:recycle: This comment has been updated with latest results.
I played around with this draft and I think the initial approach can work, with the main obstacles being:
- jxpath has the application as root, while xpath has the document. This leads to weird situation where the XPath "/" has to be translated to "/application/".
- The handling of the parent context is something that needs to be extensively tested. Also whether e.g. "/" is now the root of the parent context. I'm also not sure how this would be encoded in XML. I.e. whether both documents need to be merged or whether the "child" application needs to be appended to the "parent" application or if it's something completely different.
How should we proceed here? I don't think I can push directly to your branch. So should I create a separate branch where I do my own development on?
I played around with this draft and I think the initial approach can work,
Awesome!
with the main obstacles being:
- jxpath has the application as root, while xpath has the document. This leads to weird situation where the XPath "/" has to be translated to "/application/".
Couldn't this be fixed by adding a placeholder/virtual/dummy document?
- The handling of the parent context is something that needs to be extensively tested. Also whether e.g. "/" is now the root of the parent context. I'm also not sure how this would be encoded in XML. I.e. whether both documents need to be merged or whether the "child" application needs to be appended to the "parent" application or if it's something completely different.
I cannot say much about this a.t.m.
How should we proceed here? I don't think I can push directly to your branch. So should I create a separate branch where I do my own development on?
Yes you have to create your own branch and PR, but you could add a link to this. If you have created it, this can be closed.
I'm also not sure how this would be encoded in XML. I.e. whether both documents need to be merged or whether the "child" application needs to be appended to the "parent" application or if it's something completely different.
Can you explain what exactly is the problem/question?
jxpath has the application as root, while xpath has the document. This leads to weird situation where the XPath "/" has to be translated to "/application/"
I would expect that application is the document element or do I understand the problem wrong?
Can you explain what exactly is the problem/question?
I'm simply not sure how the parent context is handled in jxpath. But until we can properly ready the current context, this doesn't have a very high priority on my side.
I would expect that application is the document element or do I understand the problem wrong?
Given the following XML document:
<foo>
<bar/>
<bar/>
<bar/>
</foo>
When converted to a Java document, you get the following object structure:
- Document
- Element (foo)
- Element (bar)
- Element (bar)
- Element (bar)
Evaluating the XPath "/" on any node returns Document and not Element(foo).
If I understand right we already implement the DOM API here (maybe something better placed at EMF directly? @merks ?) so can't Application implement Document + Element here and simplify return it as document and the root element (might be a bit counterintuitive but probably works).
Stop to ask, why are there so many alternatives to DOM? (Because it's horrible?!)
Goodness knows why folks could not have just use EMF's support for paths?
- org.eclipse.emf.ecore.resource.impl.ResourceImpl.getEObject(List<String>)
- org.eclipse.emf.ecore.InternalEObject.eObjectForURIFragmentSegment(String)
Probably wasn't pretty enough? Not standard enough? Note powerful enough? Best to hide EMFness?
In any case, no one ever asked me for advice or suggestions, so I have no clue how it was necessary to have the full power of XPath available to reference an object when there are far simpler mechanisms available for doing just that.
I definitely don't want to push this problem down into EMF. People have asked for many things, but never this thing.
Stop to ask, why are there so many alternatives to DOM? (Because it's horrible?!)
EMF is a DOM as well, it just don't implement the (XML) DOM API ;-)
Probably wasn't pretty enough? Not standard enough? Note powerful enough? Best to hide EMFness?
I have no clue but can only assume because the e4 xmi is actually an XML document and XPath is the standard for XML .. anyways Xpath itself do not mandates to use DOM, it supports other (xml) representations as well, thats why I previously mentioned that we probably just need to copy the parser part, because in the end we only need to parse an Xpath Expression and map it to the (EMF) DOM thats what actually is done as of today.
Sadly I have found little to no documentation on this feature so its quite hard to guess what must be supported and how exactly it is mapped or what where the reasons for a design decision. Also the UI for this is really bare....
Probably wasn't pretty enough? Not standard enough? Note powerful enough? Best to hide EMFness?
I'm pretty sure it's just because XPath is standard and popular enough to assume most developers will feel comfortable enough with it for this case.
I'm wondering whether the full power of XPath is required here. Looking at all found instances on GitHub https://github.com/search?q=path%253A*.e4xmi+xpath&type=code&ref=advsearch , we can see only a few basic patterns: xpath:/ (root, 95% of occurrences), xpath://mainMenu (select all), xpath:/mainMenu/child[1], xpath://*[@elementId='fragment.contributedMenu1' or @elementId='fragment.contributedMenu2'] (select by attribute). We could consider just sticking to supporting that list.
If EMF already support well an XPath-like syntax to select node and this syntax is xpath enough to expect most users wouldn't need to change their extensions to get the same node selected, we could consider just dropping XPath and adopting the EMF way. If the is another close but not directly complatible syntax supported natively in EMF, we could consider converting XPath to this syntax. @merks What do you think is possible/best here to rely on more native EMF features?
The XPath library being used has the benefit that it operates on any DOM-like structure. The built-in XPath support works only on org.w3c.dom. That's simply nasty such that one must try to serialize the model to a DOM and keep a mapping to work your way back. I haven't looked at the details of prototype. It's not clear to me that cloning jxpath and deleting the unused content would not be the easier approach. Either way, there is a whole whack of complex crap that needs to be maintained...
I think at this point, we are stuck needing to support XPath expressions exactly as they are current used, so we must parse them and evaluate them somehow. Alternative approaches are water under the bridge that can't be pushed back upstream. (I bring it up merely because I do not want EMF, i.e., me personally, to burdened with this, but I'm happy to help the Platform wherever I can.)
It's not clear to me that cloning jxpath and deleting the unused content would not be the easier approach. Either way, there is a whole whack of complex crap that needs to be maintained...
I believe both approaches are feasible but at least in the long term, we should try to remove the reference JXPath. But given that this will take quite a lot of effort, it also makes sense to simply fork the JXPath project until then. Perhaps we also come to the conclusion that the XPath approach doesn't work out as well as we had hoped, in which case we still have the alternative to fall back on. I'll try to draft a PR for the fork, separately from the PR for using XPath.
FYI, in Orbit I build axis1 (horrible but BIRT uses it) from source and publish it to repo.eclipse.org so that we can use BND to create an OSGi build from it as if it were published to Maven central:
https://github.com/eclipse-orbit/orbit-simrel/blob/main/maven-deploy/MavenAxis.jenkinsfile
We could do that with jxpath, or a fork of jxpath, perhaps a fork where only the "CVE" functionality is disabled so that there really isn't much to maintain at all, and it could be rebased on newer versions of jxpath in the future.
Just a thought...
We could do that with jxpath, or a fork of jxpath, perhaps a fork where only the "CVE" functionality is disabled so that there really isn't much to maintain at all, and it could be rebased on newer versions of jxpath in the future.
That's effectively the case with org.apache.commons.jxpath v1.3.0.v200911051830 that was used previously. Because this plugin doesn't import e.g. the javax.servlet packages, all of the "remote execution" CVEs are effectively irrelevant, as the application would already fail with an exception, when trying to initialize the servlets.
Because this plugin doesn't import e.g. the javax.servlet packages, all of the "remote execution" CVEs are effectively irrelevant, as the application would already fail with an exception, when trying to initialize the servlets.
As jxpath does not run inside a servelt / EE Container, they are effectively irrelevant for where we use that as well ... ;-)
In any case, embedding the code seem more suitable than building something that is similar but named the same as an official artifact.
Today I stumbled upon the jaxen library, which says:
The Jaxen XPath Engine for Java
[...]
It is also possible to write adapters that treat non-XML trees such as compiled Java byte code
or Java beans as XML, thus enabling you to query these trees with XPath too.
It sounds like this maybe could be an alternative for jXpath. It isn't very active either but it's latest release is only two years old. And the good thing is, it's already in simrel-Orbit and seems to have no extra dependencies https://github.com/eclipse-orbit/orbit-simrel/blob/75cc23701b2417f530efd3ce51763aef09c5a206/maven-osgi/tp/Maven.target#L678-L680
Today I stumbled upon the jaxen library, which says:
The Jaxen XPath Engine for Java [...] It is also possible to write adapters that treat non-XML trees such as compiled Java byte code or Java beans as XML, thus enabling you to query these trees with XPath too.It sounds like this maybe could be an alternative for jXpath. It isn't very active either but it's latest release is only two years old. And the good thing is, it's already in simrel-Orbit and seems to have no extra dependencies https://github.com/eclipse-orbit/orbit-simrel/blob/75cc23701b2417f530efd3ce51763aef09c5a206/maven-osgi/tp/Maven.target#L678-L680
I gave it a quick try, but I don't believe it works as well as it should... For example, you can't "skip" nodes, so expressions like "children/mainMenu" work, but "//mainMenu" doesn't. Getting the current object via "/" also doesn't work...
Is there progress here? It would be nice to get rid of jxpath and it's CVEs this cycle.
This pull request changes some projects for the first time in this development cycle. Therefore the following files need a version increment:
bundles/org.eclipse.e4.emf.xpath/META-INF/MANIFEST.MF
An additional commit containing all the necessary changes was pushed to the top of this PR's branch. To obtain these changes (for example if you want to push more changes) either fetch from your fork or apply the git patch.
Git patch
From b6e9b024238621c74b3996faaac3f3e1d438e66d Mon Sep 17 00:00:00 2001
From: Eclipse Platform Bot <[email protected]>
Date: Tue, 14 Jan 2025 05:51:29 +0000
Subject: [PATCH] Version bump(s) for 4.35 stream
diff --git a/bundles/org.eclipse.e4.emf.xpath/META-INF/MANIFEST.MF b/bundles/org.eclipse.e4.emf.xpath/META-INF/MANIFEST.MF
index 74cc46b704..0fd771831e 100644
--- a/bundles/org.eclipse.e4.emf.xpath/META-INF/MANIFEST.MF
+++ b/bundles/org.eclipse.e4.emf.xpath/META-INF/MANIFEST.MF
@@ -2,7 +2,7 @@ Manifest-Version: 1.0
Bundle-ManifestVersion: 2
Bundle-Name: %Bundle-Name
Bundle-SymbolicName: org.eclipse.e4.emf.xpath
-Bundle-Version: 0.5.0.qualifier
+Bundle-Version: 0.5.100.qualifier
Bundle-RequiredExecutionEnvironment: JavaSE-17
Require-Bundle: org.eclipse.emf.ecore;bundle-version="2.35.0",
org.eclipse.core.runtime;bundle-version="3.29.0"
--
2.47.1
Further information are available in Common Build Issues - Missing version increments.
Is there progress here? It would be nice to get rid of jxpath and it's CVEs this cycle.
Sadly not, at least from my side. A lot of stuff has been piling up over the past months and I fear I won't be able to work on this anytime soon. 😦
@akurtakov @HannesWell @ptziegler I have now just taken the opportunity to try the path to embed jxpath here:
- https://github.com/eclipse-platform/eclipse.platform.ui/pull/2691
I have now added another workaround to reconstruct and return a list-value in XPathContext.getValue() if the specified xpath has a many-reference as terminal so that the last test-case also passes without changes:
https://github.com/eclipse-platform/eclipse.platform.ui/blob/c9f129d39bcd6e77a70d80421a18de7c7c768bf2/tests/org.eclipse.e4.emf.xpath.test/src/org/eclipse/e4/emf/xpath/test/ExampleQueriesApplicationTest.java#L105-L113
With this all tests pass and I think this is ready. Although I'm still not sure if the parent-context handling has any real use-case as explained in: https://github.com/eclipse-platform/eclipse.platform.ui/pull/2290#discussion_r1948233504 I'll create another PR suggesting to deprecate that and other parts of this API for removal, where we could further discuss .
The build is completely green now and all tests pass.
If no objections are raised, I'll submit this PR this (European) evening.
No objections. Why not push the big green button now?
Sooner is better. I would even trigger a new build immediately.
https://ci.eclipse.org/releng/job/Builds/job/I-build-4.35/123/ triggered
Thanks 🙏
Apologies for the late response. I really like just how much cleaner this looks now, so no objections from my side, either.
No objections. Why not push the big green button now?
I just wanted to give other the opportunity for a few more hours to express their opinion. But I'm also fine to have it sooner. :)
Apologies for the late response. I really like just how much cleaner this looks now, so no objections from my side, either.
Great. :)