graphs
graphs copied to clipboard
Interest in other graph libraries
Thanks for this - graph library performance is definitely not obvious. Do you have any interest in additional graph libraries to examine? I was a bit frustrated with fgl performance a while ago and put together a library that I thought would address most of the problems I had with it. It might be of interest to you: https://github.com/travitch/haggle (unfortunately, not up on hackage yet).
It has:
- A few different data representations to accommodate different graph topologies, some of which are very space efficient
- Graphs with/without edge and node labels
- Optional mutable graph creation (in IO/ST) via a freeze and thaw interface
Do you have any interest in additional graph libraries to examine?
Yes of course, thanks for the link !
The approach is very interesting, I will add benchs for haggle asap :)
@travitch I just spent some time trying to add your library. Some questions, to ensure that I don't miss anything:
- If I want to deal with simple directed graph (like the graphs of
FGL
), isData.Graph.Haggle.SimpleBiDigraph
the best choice ? - If yes, if there a possible
NFData
instance forMSimpleBiDigraph
andSimpleBiDigraph
? Maybe the first can be made by freezing the mutable graph and then use a the one defined we can define forSimpleBeDiGraph
? - You don't provide anything like
mkGraph
, so I wrote:
mkGraph e = let maxV = extractMaxVertex e
in do
g <- newSizedMSimpleBiDigraph maxV undefined
vert <- replicateM maxV (addVertex g)
mapM_ (\(u,v) -> addEdge g (vert !! u)(vert !! v)) e
return g
where e
is a list of edges (of type [(Int,Int)]
and maV
is maxV = max $ (\x y -> x ++ y) $ unzip e
. The vertices in the graph are made from [0..(maxV-1)]
.
Is it ok ?
Thanks for taking a look at it!
- Which graph is best depends a bit on what you want to test. The most direct analog to FGL is the
PatriciaTree
, which should be almost the same performance (but with some better strictness properties).SimpleBiDigraph
will be more compact in memory and should probably be faster, while still supporting most of the same operations.Digraph
might be interesting as well - it is much more compact, but 1) doesn't provide access to predecessors for nodes, and 2) has an expensive test for edge existence. - I just added the missing
NFData
instances for the graph types. I don't think it is really possible to make instances for the mutable graphs, as they contain IO/STRefs. Freezing the graph and then using theNFData
instance for the immutable graphs is probably the best bet there. - That is a good way to construct graphs. I didn't provide that function, as I ran into a number of use cases where that wasn't quite enough for my needs (e.g., when there are nodes in the graph that don't have any edges to them, the edge list construction is awkward).
Thanks for the work :)
I started mine here: https://github.com/haskell-perf/graphs/tree/haggle (no many things are benchmarked, I will dig deeper inside the API soon).
I think the more valuable is SimpleBiDiGraph
since it is closer to others libraries but with a different implementation.
I am just stuck with a thing: How can I bench a graph modification, since it is encapsulated inside a PrimMonad
? Sadly, I have no NFData
instance available... Do you have any idea ? Ideally, I would fully evaluate the graph creation (to ensure no time is wasted after), then adding my edge, then fully evaluate the result. I don't know if this is possible...
If you are running in the IO
Monad, the modifications to the underlying data structures (which are MVector
s and IORef
s) happen exactly when you expect. The modifications to IORef
s are done using modifyRef'
, which evaluates its argument to WHNF (since they are all Int
s, that is NF). I guess my argument is that there isn't anything else to force, but I don't have a proof that there are no other thunks lying around.
For the get*
operations on MSimpleBiDigraph
, you could deepseq
the returned values, at least.
If we are sure that everything is evaluated, as you are saying, then the NFData
instance could simply be const ()
, it should do the trick... I will make some experiments ^^.
The problem of get*
operations is that they come encapsulated in the PrimMonad
so the NFData
is not simple...
So I have do some work.
Criterion provide a function to benchmark a IO action (here it apply for mutable graphs). The only problem is that it will benchmark the whole action. So I need to benchmark the creation itself with the "action" (like addEdge
) then fully evaluate the results.
Fortunately, I have implemented it for others libraries, so we can compare :)
I think my function to build a graph from a list of edges is very poor. Here are the results:
Doing:
- edgeCount
- vertexCount
- addEdge
Using [("Mesh",3),("Clique",3)] as graphs
edgeCount
Description: Count the edges of the graph
Mesh
1 | 10 | 100 | |
---|---|---|---|
Alga | 51.52 ns | 7.943 μs | 163.2 μs |
Containers | 30.87 ns | 388.1 ns | 4.460 μs |
Fgl | 36.41 ns | 5.869 μs | 202.5 μs |
Haggle | 914.9 ns | 34.31 μs | 455.7 μs |
Hash-Graph | 39.54 ns | 5.985 μs | 198.2 μs |
Clique
1 | 10 | 100 | |
---|---|---|---|
Alga | 51.51 ns | 29.03 μs | 6.486 ms |
Containers | 31.38 ns | 1.051 μs | 107.4 μs |
Fgl | 36.31 ns | 17.85 μs | 10.12 ms |
Haggle | 904.9 ns | 92.23 μs | 13.55 ms |
Hash-Graph | 39.48 ns | 19.89 μs | 7.820 ms |
SUMMARY:
- Containers was the fastest 6 times
ABSTRACT: (Based on an average of the ratio between largest benchmarks)
- Containers was 112.93 times faster than Haggle
- Alga was 2.48 times faster than Haggle
- Hash-Graph was 2.05 times faster than Haggle
- Fgl was 1.84 times faster than Haggle
vertexCount
Description: Count the vertices of the graph
Mesh
1 | 10 | 100 | |
---|---|---|---|
Alga | 35.03 ns | 910.1 ns | 14.45 μs |
Fgl | 25.93 ns | 4.446 μs | 182.4 μs |
Haggle | 919.7 ns | 33.88 μs | 456.4 μs |
Hash-Graph | 26.36 ns | 5.281 μs | 192.5 μs |
Clique
1 | 10 | 100 | |
---|---|---|---|
Alga | 34.67 ns | 3.247 μs | 798.9 μs |
Fgl | 25.66 ns | 12.37 μs | 5.190 ms |
Haggle | 899.5 ns | 93.55 μs | 13.26 ms |
Hash-Graph | 25.62 ns | 18.58 μs | 7.770 ms |
SUMMARY:
- Alga was the fastest 4 times
There was 2 ex-aequo
ABSTRACT: (Based on an average of the ratio between largest benchmarks)
- Alga was 21.76 times faster than Haggle
- Fgl was 2.54 times faster than Haggle
- Hash-Graph was 1.94 times faster than Haggle
addEdge
Description: Add an edge (not already in the graph)
Mesh
1 | 10 | 100 | |
---|---|---|---|
Alga | 50.26 ns | 747.4 ns | 9.611 μs |
Fgl | 132.0 ns | 6.361 μs | 213.3 μs |
Haggle | 1.291 μs | 35.80 μs | 461.4 μs |
Hash-Graph | 159.8 ns | 7.108 μs | 212.8 μs |
Clique
1 | 10 | 100 | |
---|---|---|---|
Alga | 50.43 ns | 2.484 μs | 257.6 μs |
Fgl | 133.2 ns | 19.71 μs | 10.41 ms |
Haggle | 1.328 μs | 95.85 μs | 13.15 ms |
Hash-Graph | 164.3 ns | 23.07 μs | 8.069 ms |
SUMMARY:
- Alga was the fastest 18 times
ABSTRACT: (Based on an average of the ratio between largest benchmarks)
- Alga was 49.48 times faster than Haggle
- Hash-Graph was 1.91 times faster than Haggle
- Fgl was 1.73 times faster than Haggle
So I think my function is so poor that it kills the results...