alga Implement hasVertexP and hasEdgeP

This is an initial PR for hasVertexP and hasEdgeP - predicate variants of hasVertex and hasEdge (reimplemented in terms of the predicate versions). (the latter of which will be useful for some rewriting work I hope to build on top of Algebra.Graph).

I've only looked at the code so far; on my todo list is at least

[ ] check docstrings are appropriate
[ ] write tests for hasEdgeP explicitly (but probably both)
[x] double-check the SPECIALISE pragmas are (still) appropriate

Oct 26 '18 15:10 jmtd

@jmtd Cool, thank you! I like how nicely it all worked out.

There seem to be a performance regression for hasVertex and possibly hasEdge too:

hasEdge: 1.17 (bad)
hasVertex: 1.66 (bad)

While 1.17 may just be noise, 1.66 feels like a proper regression, likely due to the missing SPECIALISE pragma. @nobrakal Do you agree?

Performace regression is reported by our CI infrastructure on every commit, so if you push any fixes you should be able to see the results in the DOBENCH="Graph" Travis job. For example:

https://travis-ci.org/snowleopard/alga/jobs/446759541

Further things to do:

[ ] We probably want to return the actual matching vertices and edges from these functions. What about calling them findVertex and findEdge and modifying the type signatures slightly?
```
findVertex :: (a -> Bool) -> Graph a -> [a]
findEdge :: (a -> Bool) -> (a -> Bool) -> Graph a -> [(a, a)]
```
Then hasVertex could simply check if the list is empty. But I'm not sure if this can be implemented without sacrificing performance.
[ ] Implement this functionality in Algebra.Graph.NonEmpty.

Oct 26 '18 17:10 snowleopard

Hi,

While 1.17 may just be noise, 1.66 feels like a proper regression

The two are proper regression in this case :)

About the SPECIALIZE pragma: It is here only to optimize out a class constraint, so it is still needed for both hasVertex and hasEdge (and adding it to hasVertexP or hasEdgeP will do nothing, since these functions does not have any class constraint).

The problem is that hasVertex contains a reference to an other function (hasVertexP) which is currently not inlined, so it add a level of indirection in the code of hasVertex

The solution is to add an INLINE pragma to both hasVertexP and hasEdgeP :) (and it is anyway a good idea to inline these !).

Then hasVertex could simply check if the list is empty. But I'm not sure if this can be implemented without sacrificing performance.

Maybe lazyness can help preventing a performance drop, but I am not sure.

Oct 29 '18 08:10 nobrakal

Adding INLINE pragmas seems to have fixed the regression.

(There is still this strange result hasEdge: 0.69 (good), which we believe is a bug.)

@jmtd Could you try the following?

{-# INLINE [1] findVertices #-}
findVertices :: (a -> Bool) -> Graph a -> [a]
findVertices p = foldg [] [ x | p x ] (++) (++)

hasVertex :: Eq a => a -> Graph a -> Bool
hasVertex x = not . null $ findVertices (== x)

I'm interested to see whether this will be as fast as your implementation via hasVertexP.

Oct 31 '18 00:10 snowleopard

Hi @snowleopard

findVertices p = foldg [] [ x | p x ] (++) (++)

I think there must be a typo in the list comprehension here, my GHCI doesn't like it either, or perhaps I need to load a language extension?

Nov 01 '18 19:11 jmtd

@jmtd Oh, I'm sorry, I meant:

findVertices p = foldg [] (\x -> [ x | p x ]) (++) (++)

Nov 02 '18 02:11 snowleopard

NP. I've just pushed that, let's see what the performance is like.

I'd quite like to reproduce the performance measurement stuff locally (to avoid round-tripping github PRs to test things). I should perhaps also take a look at Stack if I'm spring-cleaning my setup.

I'm working on a findEdges now

Nov 02 '18 10:11 jmtd

As an aside, would you like PRs to add things like {-# LANGUAGE TupleSections #-} where necessary — I see this is already specified in the cabal file, but is still needed (I think) for e.g. loading the local copy into ghci w/o also specifying it on the command line

Nov 02 '18 13:11 jmtd

travis and appveyor results for the findVertices change only (ab78e059) show a regression

hasVertex: 1.55 (bad)

I'm going to push an initial findEdges now to see how that performs.

(this patch series will need some rebasing before it's mergeable)

Nov 02 '18 13:11 jmtd

Results for findEdges as of 5ed33b63284afb275d7584d0203d0b4327ecc02e

hasEdge: 1.35 (bad) hasVertex: 2.03 (bad)

Nov 02 '18 15:11 jmtd

@jmtd Thanks for giving it a try. I suggest we proceed as follows.

Let's keep the Algebra.Graph module as is, but instead add new functionality to the generic module that defines the ToGraph type class. All we need to implement both findVertices and hasVertexP is the foldg method -- we do not need to know the graph itself:

findVertices :: ToGraph g => (ToVertex g -> Bool) -> g -> [ToVertex g]
findVertices p = foldg [] (\x -> [ x | p x ]) (++) (++)

With this approach other instances of the ToGraph class can use these functions too.

Also, I'd like to understand your use case a bit better: how do you plan to use these functions? Merely checking if a vertex with the right properties exist (using hasVertexP) is probably not too useful. Same about hasEdgeP. Also, the function findVertices could probably be implemented more directly as a combination of Set.filter and vertexSet.

My understanding is that you actually want to write some sort of graph transformation function that, for example, turns merges all edges that satisfy a given property into a vertex. Could you provide some more details? Otherwise, I worry that we might end up providing too many functions with similar looking functionality, hence confusing Alga users: hasVertex, hasVertexP, findVertices, vertexSet.

Nov 04 '18 18:11 snowleopard

@snowleopard wrote:

Also, I'd like to understand your use case a bit better: how do you plan to use these functions? Merely checking if a vertex with the right properties exist (using hasVertexP) is probably not too useful. … My understanding is that you actually want to write some sort of graph transformation function that, for example, turns merges all edges that satisfy a given property into a vertex.

That's right. Here's an example of a graph rewrite rule, coded in a string, using composition to represent the flow of the stream data

streamMap f . streamMap g = streamMap (f . g)

We have stream graphs coded up using Graph and and ADT like the following

data StreamVertex = StreamVertex                                                                                                                                                   
    { vertexId   :: Int            -- necessary to stop unifying distinct nodes which are otherwise identical
    , operator   :: StreamOperator -- data StreamOperator = Map | Filter | Expand…                                                         
    , parameters :: [String]       -- strings of code representing operator arguments
    , intype     :: String         -- capturing the type flowing into this node                                                                        
    } deriving (Eq)

What I am hoping to do is encode the stream rules using a similar ADT to the above instead of strings (so probably make the parameters argument a type variable and re-use the above), then apply rules to instances of Graph StreamVertex

All the rules I am working with at the moment are derived from analysing operators pair-wise, and generally swap the order of two operators, but also change their parameters; so from a Graph POV adding and removing both edges and nodes. More complex rewrite rules could involve introducing more nodes and consequently edges. We don't currently label edges, but direction is important.

Nov 06 '18 12:11 jmtd

@jmtd Thanks! Let me think about this for a few days. I'm travelling until 11 November, so it's a bit difficult to find time to figure out a small set of graph transformation primitives that will be useful both for your problem and for other users of the library.

Nov 07 '18 04:11 snowleopard

Thanks @snowleopard. I'm going to see what PoC I can build on top of these primitives to further assess how this all might work for me.

Nov 09 '18 11:11 jmtd

alga alga copied to clipboard

Implement hasVertexP and hasEdgeP

alga
alga copied to clipboard