space_packet_parser icon indicating copy to clipboard operation
space_packet_parser copied to clipboard

Use `lxml` instead of standard library `xml` to improve namespace identification

Open cgobat opened this issue 1 year ago • 5 comments

Changes in this PR:

  • Instead of hardcoding a XtcePacketDefinition._default_namespace, read the namespace from the XTCE document.
    • This requires the lxml module instead of the built-in xml, since the latter's Element does not have an nsmap attribute.
  • Adds new lxml dependency to the list of requirements.
  • Switches to lxml in tests as well, for consistency.

cgobat avatar Jan 24 '24 19:01 cgobat

Note that I did not yet change the version number as part of this PR. I wasn't sure about the policy for that.

cgobat avatar Jan 24 '24 19:01 cgobat

Hi @cgobat, thanks for submitting some of the first PRs for this project!

Can you briefly explain why the namespace reading benefit is worth adding a new dependency to the project? Is there a reason you would ever expect the namespace to be anything other than xtce? I agree with the inclination to automate things rather than hard code them but I also want to keep the dependencies absolutely minimal.

medley56 avatar Feb 07 '24 17:02 medley56

Hi! I totally get the desire to keep external dependencies to a minimum. However, lxml is necessary/justified in this case, IMO.

The namespace key will (probably) always be xtce:, yes—but the issue is what IRI/URI that actually maps to. In XTCE 1.1, for example, the namespace is most commonly http://www.omg.org/space/xtce, while in 1.2 it is http://www.omg.org/spec/XTCE/20180204. That URI is what actually matters, not the fact that we use xtce: in the XML to refer to both, which is more or less just a convenience. The lxml module is the cleanest way I know of to do this identification to get the mapping right.

I will also add that in my experience, lxml has fairly wide adoption (due to the numerous enhancements it provides over Python's built-in xml module), and it's far from a niche or uncommon dependency.

That's my 2¢, but if you are truly averse to adding a dependency (and I do understand the hesitation, generally speaking), maybe we can try to find another way to identify the document namespace.

cgobat avatar Feb 07 '24 18:02 cgobat

Nice. Thank you for writing that up! It sounds reasonable to me and your point about the mapping for the namespace is a good one.

On your question about bumping the version, don't worry about it. I actually need to write up documentation for that. Until we actually tag a commit with a release version and build it for PyPI, it doesn't really matter.

medley56 avatar Feb 08 '24 18:02 medley56

There are some tests that need to be updated to use the new xml library though. test_xtcedef.py uses xml.etree.ElementTree and that should be a quick fix I think.

I also would like to see a new test (there should have been one already but it looks like I missed it) that verifies that the namespace parsed by lxml is correct for a few different versions of XTCE.

medley56 avatar Feb 08 '24 18:02 medley56