anti-xml
anti-xml copied to clipboard
XPath axes for Zipper
This batch of commits adds partial XPath like axes support (via pimping) to Zipper.
This is done using the ZipperAxes class and the shiftHoles method on ZIpper, see the corresponding Specs for examples.
There is no support for the child axes (although it is possible) as this can be achieved via simple selection.
Though some inconsistency arises when mixing axes manipulation and selection: specifying an axes doesn't force another unselect
call to produce the original group, while selection does, e.g.:
val group = ...
val zipper = group \\ ...
zipper.ancestor.unselect == group
zipper.select(...).unselect.unselect == group
Note: practically no considerations of efficiency were made, and there are probably multiple redundant traversals of the XML tree while calculating and fetching the different axes.
Neat! Is there any reason you're pimping the methods onto Zipper
, rather than just adding them directly to the class?
I didn't want to add more code to Zipper
as it is already quite bloated.
This way, more axes can be added in the future without touching the Zipper
implementation.
Zipper
is certainly a little bloated, but adding members via implicit conversion doesn't fix that, it only hides the bloat in another file. In other words, it's only fixing the problem of line-count bloat, not code bloat! The disadvantages of pimping are pretty severe (resolution conflicts, scope pollution, type inference failures, etc). So, given that pimping only improves the source organization and introduces a whole raft of problems, I would much prefer if we actually put these methods onto Zipper
directly.
A sneakier option might be to move them into a trait that can be stacked onto Zipper
, then provide a CanBuildFromWithZipper
that produces the augmented Zipper
when explicitly imported. The downside here is that the XPath axes would not be available on Zipper
by default, nor would they be available on arbitrary zippers passed into code. They would only be available on zippers that are constructed where the non-default CanBuildFromWithZipper
is in scope. This seems like an unnatural restriction from a user's perspective.
So, all in all, I'd really prefer it if we put the axes stuff into Zipper
. One thing you could do if you really want to hide things in a separate compilation unit is break things up into traits that are inter-dependent via self types. I'm not sure this is practicable in this particular case, but it's at least a thought. Since you already implemented the axes as pimps on Zipper
, it shouldn't be too hard to move them into a trait which is then inherited by Zipper
(this trait could get at the Zipper
instance by having an abstract toZipper
method, which of course Zipper
already implements).
You're probably right about all the problems related to pimping, but it just feels nice to compose things that way and seems all so harmless...
I think it's a bit strange to place the axes as a super trait of Zipper
as in my perception they should be composed on top of Zipper
, not the other way around.
Between the two solutions, the more natural one would be to add the axes directly inside Zipper
. But I don't like the perspective of having N
axes methods on Zipper
where N >> 1
.
Anyways, I'll leave it up to you to decide what's best.
Meanwhile, I''ll try to look into the idea breaking the Zipper
into interdependent traits.
If we can break up Zipper
into inter-dependent traits, I think that would be ideal. Otherwise, let's plan to put the axes onto Zipper
and try to factor things out down the road. You're quite right that putting the methods onto a super-trait of Zipper
is…weird. They belong in a sub-trait, but that then requires all users to write Zipper with Axes
all over the place, which is annoying.
Hopefully the inter-dependent traits thing works. I'll give this some thought. Maybe I can come up with a way to factor things nicely.
There is no support for the child axes (although it is possible) as this can be achieved via simple selection.
I would vote for some syntactic sugar which wraps this. It seems like a reasonable thing to expect, despite its redundancy.
Though some inconsistency arises when mixing axes manipulation and selection: specifying an axes doesn't force another
unselect
call to produce the original group, while selection does,
I go back and forth on this one. I can see valid arguments in both directions. On the one hand, you can think of axis manipulations as being a form of selection, in which case it really should force another unselect
. Of course, the problem with this is you can use axis manipulations to go up, which is where the selection metaphor breaks down. Additionally, it might seem a bit odd to "unselect" something that wasn't "selected", particularly when you're "unselecting" an upward traversal (the ancestor axis). It almost seems like axis manipulation is a potentially-orthogonal operation, allowing you to move the zipper around without pushing the unselection stack. So long as you can move around, updating things at different foci, there is no loss of expressiveness.
I will try to add a children axes to the ones currently implemented, shouldn't be too difficult.
I couldn't figure out a way to make axes equivalent to selection and I just gave up.
What can be done to avoid some confusion, is to add an unselectAll
method on Zipper
that recursively unselects up to the root. This way users won't have to keep track of the different operations.
Of course, you'll have to revert to the original unselect
method in case you want more granular control over the merging aspects of unselection.
What can be done to avoid some confusion, is to add an
unselectAll
method onZipper
that recursively unselects up to the root. This way users won't have to keep track of the different operations. Of course, you'll have to revert to the originalunselect
method in case you want more granular control over the merging aspects of unselection.
In light of the fact that the ancestor axis would now be available, I think this is probably a good idea. I envisioned the single-step unselect
as primarily being a way of moving back up the tree. However, ancestor
makes this redundant. Also, the behavior of unselect
in the presence of a deep selection really flies in the face of this original concept.
All in all, I think unselectAll
makes a lot of sense. We can preserve unselect
as undoing a selection, ignoring any axis-based manipulations which may have occurred in the interim.
In the presence of tree modifications, ancestor
does not replace unselect
.
In general, shifting the zipper only shows direct modifications of nodes, it doesn't do the merging step (this would lead to some ill defined behavior).
So I agree with your previous comment, axes are somewhat orthogonal to selection, and better suited to traversal without modification. Of course, the two concepts cannot be separated as you have to perform at least one selection to enable axes.
Of course, the two concepts cannot be separated as you have to perform at least one selection to enable axes.
Maybe we should have a no-op selector on Group
that allows you to get at axis functionality without selecting? I'm not sure if this makes sense, given the fact that the reconstruction has to be done via unselect
.
As far as I can see there, in case of tree modifications, there is no way to avoid unselect
.
Ignoring tree modifications, it may be possible to add axes directly on Group
, but I think that it'll only make things more complicated.
There's also an alternative direction that shouldn't yield these sort of problems, which is a path combinator API. There was a nice illustration of it by @josharnold52 in the comments somewhere, but I was unable to locate it now.
We can opt out for this direction, but I think it'll be less natural to use than the axes methods directly on Zipper
.
If we had examples of real use cases, that would've been nice...
Yeah, I think we should just stick with axes on Zipper
, as you suggest.
My bad, @josharnold52's code I was looking for, was in my inbox. Here's the relevant code:
trait PathFunction {
def apply(g: Group): Zipper
def directChildren: PathFunction
def allChildren: PathFunction
def parents: PathFunction
def siblings: PathFunction
def matching(sel: Selector): PathFunction
}
object PathFunction {
def allNodes: PathFunction
def topNodes: PathFunction
}
//finds all siblings of elements named "author"
val search1 = PathFunction.allNodes.matching( 'author ).siblings
//finds all books with Isaac Asimov as an author
val search2 = PathFunction
.allNodes.matching( 'author )
.directChildren.matching( textNode("Isaac Asimov") )
.parents.parents
//Apply the search to a group, and produce a zipper
val myResults = search2(group)
Hmm, thinking about it, I really have no objection to a PathFunction
style interface, so long as the direct \
selection is still supported. There are absolutely cases where composing a PathFunction
separately makes a great deal of sense. I think the bit that convinces me is the fact that we apply the function to the Group
, rather than the other way around.
Do you suggest using PathFunction
instead of axes or in addition to it?
Do you suggest using
PathFunction
instead of axes or in addition to it?
I think in addition, as long as it doesn't create code duplication. I think they're useful in very separate circumstances.
What's nice about path functions is that they shouldn't require any changes to the zipper. The only complication I can think of, is that there is a class invariant regarding duplicates inside the zipper, that might be tricky to enforce externally.
Anyways, I might tackle this once we're done with axes (hopefully, during the weekend I'll have time to add the methods we discussed).
Anyways, I might tackle this once we're done with axes (hopefully, during the weekend I'll have time to add the methods we discussed).
Awesome!
The only complication I can think of, is that there is a class invariant regarding duplicates inside the zipper, that might be tricky to enforce externally.
At the time, I was thinking of a PathFunction
as something that selects a subset of (deep) nodes from a Group. If you take that as the definition, I don't think there's any problem with duplicates. pf.parents
would select the union of the parents selected by pf, etc. I think XPath behaves similarly.
I do wonder if there's a better name than PathFunction
, though. That made sense to me when it returned a list of paths, but seems awkward if it takes groups to zippers. I'd be tempted to call it Selector
if that weren't already taken.
This is all cool stuff, in any case :)
I do wonder if there's a better name than
PathFunction
, though. That made sense to me when it returned a list of paths, but seems awkward if it takes groups to zippers. I'd be tempted to call itSelector
if that weren't already taken.
"Selection" perhaps?
"Selection" perhaps?
Works for me.