Add c14n for node and document
There are not too many choices when it comes to "Canonical XML" implementation for rust. This PR is a compilation of ideas, approaches and test files from the following projects:
- https://gitlab.com/reinis-mazeiks/xml_c14n
- https://github.com/sparklemotion/nokogiri
Work items
- [x] add
canonicalizemethod forNodeandDocument - [x] add
Node::ancestorsmethod (I personally find it very useful when using "nokogiri", plus it help in tests a lot). Although I don't feel too strongly about, can remove if it doesn't fit. - [x] add
Node::at_xpathmethod. Yes, it is similar tofindnodes(and can ultimately be implemented usingContext), but with design from "nokogiri", I find it a lot more flexible when navigating a tree. Same as for the previous item, can update PR to remove it if necessary.
Any feedback is welcome! I don't feel particularly attached to any approach here. Please let me know how to make this PR better.
@saks interesting, there is a very high-level question I have first.
Why do you think this wrapper crate is the right place to introduce this method?
rust-libxml mostly tries to follow the libxml2 coverage, and probably should not grow much further, as a general-purpose XML utility library.
As you mention xml_c14n exists as a separate Rust crate. Maybe your work could also be packaged as a separate crate?
c14n is a part of the standard libxml2 featureset, so it would be nice to have this functionality. This MR does a bit more, but from my reading of the readme it should be covered:
Only covers a subset of libxml2 at the moment, contributions are welcome. We try to increase support with each release.
I'd like to have this for some validation in tests where I compare the output of some rust code against a canonicalized result of an xpath query. It's doable with this crate and xml_c14n, but having it as part of rust-libxml would remove the need to serialize the libxml document and then parse it as a libxml document directly afterwards.
Having independent intrerest is certainly helpful, as is pointing out that c14n can be fully thought of as exposing libxml2 functionality. So I am feeling a lot more inclined to merge the PR - I will make some time to review it in more depth soon.
Would you like to also add a review for this PR @tstenner ? Given you are an active user of the feature, you may have good visibility of any edges left to iron out.
@dginev Thanks for reconsidering it. Overall, I only had one obvious improvement to suggest and an opinion regarding the callback parameter (but then again, I don't use it, so it doesn't matter to me that much). The output buffer handling could be shorter, but that is something you know more about than me.
I've been using this branch on an off for the last half year and haven't encountered any problems with it.
I know it's a bit bigger MR and not the feature you're maintaining the wrappers for, but if you could find the time to review and merge it, I could go back to the upstream releases and make it easier for my project collaborators.
@tstenner The time is approaching for another release, so I would like to integrate the new features here.
As you have reviewed this in detail, can I summarize that you would be comfortable seeing this merged as-is? I will do a cursory review, but I wouldn't have time to thoroughly reconsider anything. If @saks has time to address the feedback so far, that would be great, otherwise we should probably refine in a follow-up PR?
I have merged here to signal strongly that we should include the feature. Further feedback is most welcome, especially in the form of follow-up PRs. I will probably submit the release on Sunday - and switch away from the xmlChar return type.