anti-xml icon indicating copy to clipboard operation
anti-xml copied to clipboard

XPath axes for Zipper

Open ncreep opened this issue 13 years ago • 21 comments

This batch of commits adds partial XPath like axes support (via pimping) to Zipper.

This is done using the ZipperAxes class and the shiftHoles method on ZIpper, see the corresponding Specs for examples.

There is no support for the child axes (although it is possible) as this can be achieved via simple selection. Though some inconsistency arises when mixing axes manipulation and selection: specifying an axes doesn't force another unselect call to produce the original group, while selection does, e.g.:

val group = ...
val zipper = group \\ ...

zipper.ancestor.unselect == group
zipper.select(...).unselect.unselect == group

Note: practically no considerations of efficiency were made, and there are probably multiple redundant traversals of the XML tree while calculating and fetching the different axes.

ncreep avatar Dec 11 '11 04:12 ncreep

Neat! Is there any reason you're pimping the methods onto Zipper, rather than just adding them directly to the class?

djspiewak avatar Dec 11 '11 17:12 djspiewak

I didn't want to add more code to Zipper as it is already quite bloated. This way, more axes can be added in the future without touching the Zipper implementation.

ncreep avatar Dec 11 '11 17:12 ncreep

Zipper is certainly a little bloated, but adding members via implicit conversion doesn't fix that, it only hides the bloat in another file. In other words, it's only fixing the problem of line-count bloat, not code bloat! The disadvantages of pimping are pretty severe (resolution conflicts, scope pollution, type inference failures, etc). So, given that pimping only improves the source organization and introduces a whole raft of problems, I would much prefer if we actually put these methods onto Zipper directly.

A sneakier option might be to move them into a trait that can be stacked onto Zipper, then provide a CanBuildFromWithZipper that produces the augmented Zipper when explicitly imported. The downside here is that the XPath axes would not be available on Zipper by default, nor would they be available on arbitrary zippers passed into code. They would only be available on zippers that are constructed where the non-default CanBuildFromWithZipper is in scope. This seems like an unnatural restriction from a user's perspective.

So, all in all, I'd really prefer it if we put the axes stuff into Zipper. One thing you could do if you really want to hide things in a separate compilation unit is break things up into traits that are inter-dependent via self types. I'm not sure this is practicable in this particular case, but it's at least a thought. Since you already implemented the axes as pimps on Zipper, it shouldn't be too hard to move them into a trait which is then inherited by Zipper (this trait could get at the Zipper instance by having an abstract toZipper method, which of course Zipper already implements).

djspiewak avatar Dec 11 '11 19:12 djspiewak

You're probably right about all the problems related to pimping, but it just feels nice to compose things that way and seems all so harmless...

I think it's a bit strange to place the axes as a super trait of Zipper as in my perception they should be composed on top of Zipper, not the other way around. Between the two solutions, the more natural one would be to add the axes directly inside Zipper. But I don't like the perspective of having N axes methods on Zipper where N >> 1.

Anyways, I'll leave it up to you to decide what's best.

Meanwhile, I''ll try to look into the idea breaking the Zipper into interdependent traits.

ncreep avatar Dec 11 '11 19:12 ncreep

If we can break up Zipper into inter-dependent traits, I think that would be ideal. Otherwise, let's plan to put the axes onto Zipper and try to factor things out down the road. You're quite right that putting the methods onto a super-trait of Zipper is…weird. They belong in a sub-trait, but that then requires all users to write Zipper with Axes all over the place, which is annoying.

Hopefully the inter-dependent traits thing works. I'll give this some thought. Maybe I can come up with a way to factor things nicely.

djspiewak avatar Dec 11 '11 19:12 djspiewak

There is no support for the child axes (although it is possible) as this can be achieved via simple selection.

I would vote for some syntactic sugar which wraps this. It seems like a reasonable thing to expect, despite its redundancy.

Though some inconsistency arises when mixing axes manipulation and selection: specifying an axes doesn't force another unselect call to produce the original group, while selection does,

I go back and forth on this one. I can see valid arguments in both directions. On the one hand, you can think of axis manipulations as being a form of selection, in which case it really should force another unselect. Of course, the problem with this is you can use axis manipulations to go up, which is where the selection metaphor breaks down. Additionally, it might seem a bit odd to "unselect" something that wasn't "selected", particularly when you're "unselecting" an upward traversal (the ancestor axis). It almost seems like axis manipulation is a potentially-orthogonal operation, allowing you to move the zipper around without pushing the unselection stack. So long as you can move around, updating things at different foci, there is no loss of expressiveness.

djspiewak avatar Jan 09 '12 02:01 djspiewak

I will try to add a children axes to the ones currently implemented, shouldn't be too difficult.

I couldn't figure out a way to make axes equivalent to selection and I just gave up. What can be done to avoid some confusion, is to add an unselectAll method on Zipper that recursively unselects up to the root. This way users won't have to keep track of the different operations. Of course, you'll have to revert to the original unselect method in case you want more granular control over the merging aspects of unselection.

ncreep avatar Jan 09 '12 18:01 ncreep

What can be done to avoid some confusion, is to add an unselectAll method on Zipper that recursively unselects up to the root. This way users won't have to keep track of the different operations. Of course, you'll have to revert to the original unselect method in case you want more granular control over the merging aspects of unselection.

In light of the fact that the ancestor axis would now be available, I think this is probably a good idea. I envisioned the single-step unselect as primarily being a way of moving back up the tree. However, ancestor makes this redundant. Also, the behavior of unselect in the presence of a deep selection really flies in the face of this original concept.

All in all, I think unselectAll makes a lot of sense. We can preserve unselect as undoing a selection, ignoring any axis-based manipulations which may have occurred in the interim.

djspiewak avatar Jan 10 '12 02:01 djspiewak

In the presence of tree modifications, ancestor does not replace unselect. In general, shifting the zipper only shows direct modifications of nodes, it doesn't do the merging step (this would lead to some ill defined behavior).

So I agree with your previous comment, axes are somewhat orthogonal to selection, and better suited to traversal without modification. Of course, the two concepts cannot be separated as you have to perform at least one selection to enable axes.

ncreep avatar Jan 10 '12 17:01 ncreep

Of course, the two concepts cannot be separated as you have to perform at least one selection to enable axes.

Maybe we should have a no-op selector on Group that allows you to get at axis functionality without selecting? I'm not sure if this makes sense, given the fact that the reconstruction has to be done via unselect.

djspiewak avatar Jan 12 '12 17:01 djspiewak

As far as I can see there, in case of tree modifications, there is no way to avoid unselect. Ignoring tree modifications, it may be possible to add axes directly on Group, but I think that it'll only make things more complicated.

There's also an alternative direction that shouldn't yield these sort of problems, which is a path combinator API. There was a nice illustration of it by @josharnold52 in the comments somewhere, but I was unable to locate it now. We can opt out for this direction, but I think it'll be less natural to use than the axes methods directly on Zipper.

If we had examples of real use cases, that would've been nice...

ncreep avatar Jan 12 '12 18:01 ncreep

Yeah, I think we should just stick with axes on Zipper, as you suggest.

djspiewak avatar Jan 12 '12 18:01 djspiewak

My bad, @josharnold52's code I was looking for, was in my inbox. Here's the relevant code:

trait PathFunction {

  def apply(g: Group): Zipper

  def directChildren: PathFunction
  def allChildren: PathFunction
  def parents: PathFunction
  def siblings: PathFunction
  def matching(sel: Selector): PathFunction
}

object PathFunction {
  def allNodes: PathFunction
  def topNodes: PathFunction
}

//finds all siblings of  elements named "author"
val search1 = PathFunction.allNodes.matching( 'author ).siblings

//finds all books with Isaac Asimov as an author
val search2 = PathFunction
                                   .allNodes.matching( 'author )
                                   .directChildren.matching( textNode("Isaac Asimov") )
                                   .parents.parents

//Apply the search to a group, and produce a zipper
val myResults = search2(group)

ncreep avatar Jan 12 '12 18:01 ncreep

Hmm, thinking about it, I really have no objection to a PathFunction style interface, so long as the direct \ selection is still supported. There are absolutely cases where composing a PathFunction separately makes a great deal of sense. I think the bit that convinces me is the fact that we apply the function to the Group, rather than the other way around.

djspiewak avatar Jan 12 '12 20:01 djspiewak

Do you suggest using PathFunction instead of axes or in addition to it?

ncreep avatar Jan 12 '12 20:01 ncreep

Do you suggest using PathFunction instead of axes or in addition to it?

I think in addition, as long as it doesn't create code duplication. I think they're useful in very separate circumstances.

djspiewak avatar Jan 12 '12 21:01 djspiewak

What's nice about path functions is that they shouldn't require any changes to the zipper. The only complication I can think of, is that there is a class invariant regarding duplicates inside the zipper, that might be tricky to enforce externally.

Anyways, I might tackle this once we're done with axes (hopefully, during the weekend I'll have time to add the methods we discussed).

ncreep avatar Jan 13 '12 01:01 ncreep

Anyways, I might tackle this once we're done with axes (hopefully, during the weekend I'll have time to add the methods we discussed).

Awesome!

djspiewak avatar Jan 13 '12 03:01 djspiewak

The only complication I can think of, is that there is a class invariant regarding duplicates inside the zipper, that might be tricky to enforce externally.

At the time, I was thinking of a PathFunction as something that selects a subset of (deep) nodes from a Group. If you take that as the definition, I don't think there's any problem with duplicates. pf.parents would select the union of the parents selected by pf, etc. I think XPath behaves similarly.

I do wonder if there's a better name than PathFunction, though. That made sense to me when it returned a list of paths, but seems awkward if it takes groups to zippers. I'd be tempted to call it Selector if that weren't already taken.

This is all cool stuff, in any case :)

josharnold52 avatar Jan 13 '12 06:01 josharnold52

I do wonder if there's a better name than PathFunction, though. That made sense to me when it returned a list of paths, but seems awkward if it takes groups to zippers. I'd be tempted to call it Selector if that weren't already taken.

"Selection" perhaps?

djspiewak avatar Jan 14 '12 02:01 djspiewak

"Selection" perhaps?

Works for me.

josharnold52 avatar Jan 16 '12 16:01 josharnold52