Use `lxml` instead of standard library `xml` to improve namespace identification
Changes in this PR:
- Instead of hardcoding a
XtcePacketDefinition._default_namespace, read the namespace from the XTCE document.- This requires the
lxmlmodule instead of the built-inxml, since the latter'sElementdoes not have annsmapattribute.
- This requires the
- Adds new
lxmldependency to the list of requirements.- Since
XtcePacketDefinition.__init__'sxtce_documentargument might be aPath, we need a version ≥4.8, since that's when support was added forPathobjects.
- Since
- Switches to
lxmlin tests as well, for consistency.
Note that I did not yet change the version number as part of this PR. I wasn't sure about the policy for that.
Hi @cgobat, thanks for submitting some of the first PRs for this project!
Can you briefly explain why the namespace reading benefit is worth adding a new dependency to the project? Is there a reason you would ever expect the namespace to be anything other than xtce? I agree with the inclination to automate things rather than hard code them but I also want to keep the dependencies absolutely minimal.
Hi! I totally get the desire to keep external dependencies to a minimum. However, lxml is necessary/justified in this case, IMO.
The namespace key will (probably) always be xtce:, yes—but the issue is what IRI/URI that actually maps to. In XTCE 1.1, for example, the namespace is most commonly http://www.omg.org/space/xtce, while in 1.2 it is http://www.omg.org/spec/XTCE/20180204. That URI is what actually matters, not the fact that we use xtce: in the XML to refer to both, which is more or less just a convenience. The lxml module is the cleanest way I know of to do this identification to get the mapping right.
I will also add that in my experience, lxml has fairly wide adoption (due to the numerous enhancements it provides over Python's built-in xml module), and it's far from a niche or uncommon dependency.
That's my 2¢, but if you are truly averse to adding a dependency (and I do understand the hesitation, generally speaking), maybe we can try to find another way to identify the document namespace.
Nice. Thank you for writing that up! It sounds reasonable to me and your point about the mapping for the namespace is a good one.
On your question about bumping the version, don't worry about it. I actually need to write up documentation for that. Until we actually tag a commit with a release version and build it for PyPI, it doesn't really matter.
There are some tests that need to be updated to use the new xml library though. test_xtcedef.py uses xml.etree.ElementTree and that should be a quick fix I think.
I also would like to see a new test (there should have been one already but it looks like I missed it) that verifies that the namespace parsed by lxml is correct for a few different versions of XTCE.