OldGraphs.jl
OldGraphs.jl copied to clipboard
Testing Branch (Naive edge/vertex removal)
@pozorvlak Here is the failed attempt at testing remove_edge!
. Even though testing works here, it still does not work locally.
This may explain why the build is failing:
julia> g = simple_graph(5, is_directed=false)
Undirected Graph (5 vertices, 0 edges)
julia> add_edge!(g, 1, 2)
edge [1]: 1 -- 2
julia> add_edge!(g, 1, 3)
edge [2]: 1 -- 3
julia> add_edge!(g, 1, 4)
edge [3]: 1 -- 4
julia> add_edge!(g, 1, 5)
edge [4]: 1 -- 5
julia> add_edge!(g, 2, 5)
edge [5]: 2 -- 5
julia> add_edge!(g, 4, 5)
edge [6]: 4 -- 5
julia> g
Undirected Graph (5 vertices, 6 edges)
julia> edges(g)
6-element Array{Edge{Int64},1}:
edge [1]: 1 -- 2
edge [2]: 1 -- 3
edge [3]: 1 -- 4
edge [4]: 1 -- 5
edge [5]: 2 -- 5
edge [6]: 4 -- 5
julia> remove_edge!(g, 1, 3)
A edge [2]: 3 -- 1
B edge [2]: 3 -- 1
removed edge at index 1
C edge [1]: 2 -- 1
C edge [2]: 3 -- 1
removed edge at index 2
julia> remove_edge!(g, 1, 4)
A edge [3]: 4 -- 1
B edge [3]: 4 -- 1
removed edge at index 1
C edge [1]: 2 -- 1
C edge [3]: 4 -- 1
removed edge at index 2
julia> remove_edge!(g, 2, 5)
ERROR: BoundsError()
julia> edges(g)
4-element Array{Edge{Int64},1}:
edge [1]: 1 -- 2
edge [3]: 1 -- 4
edge [5]: 2 -- 5
edge [6]: 4 -- 5
julia> remove_edge!(g, 2, 5)
ERROR: BoundsError()
julia> remove_edge!(g, 4, 5)
ERROR: BoundsError()
julia> remove_edge!(g, 1, 2)
A edge [1]: 2 -- 1
B edge [1]: 2 -- 1
removed edge at index 1
C edge [1]: 2 -- 1
removed edge at index 1
julia> g
Undirected Graph (5 vertices, 3 edges)
julia> add_edge!(g, 3, 4)
edge [4]: 3 -- 4
julia> remove_edge!(g, 3, 4)
A edge [4]: 4 -- 3
B edge [4]: 4 -- 3
removed edge at index 1
C edge [4]: 4 -- 3
removed edge at index 1
julia> remove_edge!(g, 2, 5)
ERROR: BoundsError()
julia> edges(g)
3-element Array{Edge{Int64},1}:
edge [3]: 1 -- 4
edge [5]: 2 -- 5
edge [6]: 4 -- 5
julia> edge = edges(g)[2]
edge [5]: 2 -- 5
julia> edge_index(edge, g)
5
julia> splice!(g.edges, edge_index(edge, g))
ERROR: BoundsError()
julia> length(g.edges)
3
In other words, the edge index does not change as the array of edges changes. Hence, it is possible that an edge will have an index that is greater than the length of the edge array. I think I can hack my way around this but I'm wondering if this is cause for enough concern that Graphs.jl
should change the way it indexes edges. Any thoughts?
Ah, yes, that makes sense. Huh. I'll think about it, but I'll be busy moving house for the next few days so can't devote too much attention to it right now - sorry!
No worries. I need a break from this too. :)
For future reference:
If e in edges(g) == true
, shouldn't it follow that revedge(e) in edges(g) == true
for undirected graphs?
@pozorvlak Any headway regarding edge indexing?
@tfgit, @pozorvlak, I ran across a similar issue with deletes from OrderedDicts
(in DataStructures.jl
). Just a couple of thoughts:
- If it's okay for edge numbers to change, then you could just "unset" the entry, instead of deleting it, and periodically compact and renumber (that's how
OrderedDicts
handles it). - As long as the list is maintained in sorted order,
searchsortedfirst
will give youO(n log n)
lookups of the index, or aDict
will give youO(1)
(with a large constant). Neither is as nice as direct indexing, of course, and especially for large graphs, you would really want something better. But it would work.
A graph needs additional mechanism to maintain the edge list (e.g. renumbering, internal hashing, etc) in order to support the functionality of removing edges. This would involve considerable costs, and is not something that every graph need to have.
I think a better approach is to introduce new graph types and only implement remove_edge
over those graphs (and maybe we introduce a new interface name for them).
@lindahua From the user's perspective, the ability to remove or exclude edges/vertices is an essential part of graph construction. It would be strange for the default graph object to not have that kind of functionality. Perhaps the opposite would be more appropriate: let the default graph type have edge/vertex removal and introduce a "fast graph" type without the removal abilities.
I'm still willing to help out with this (I can't use the package without edge/vertex removal), but can you (or anyone) steer me to the right direction regarding making my PRs of acceptable quality? I'm not adept at optimizing Julia.
I'd like to second @tcfuji on this. There should be (at least) a class of graph types that allows for removal of edges / vertices. Applications modeling dynamic graphs demand such functionality and it's present in other graph libraries.
If there are performance issues related to node/edge removal, then those should be explicitly documented, but there is a practical need for such behavior.
I suspect if somebody submits a pull request, others will chime in with recommendations about how to make it more efficient.
Comments appreciated: https://gist.github.com/sbromberger/1dddcaed48bf0f487ec9
This works for SimpleGraph
. It was much easier to regenerate a graph with the removed components than it was to try to reindex it on the fly.
Example:
g = simple_graph(4)
add_edge!(g,1,2); add_edge!(g,1,3); add_edge!(g,2,3); add_edge!(g,3,4)
dynamize!(g)
delete_vertex!(g,2)
reindex!(g)
(Edited to change the gist)
(Thanks to @mlubin for the idea of regenerating.)
I would call this "semi-dynamic"; in particular, if you have a work flow in which you (say) use a priority queue to make some decisions after each node/edge addition/deletion, then reindex!
will be a bottleneck, because you'll have to regenerate the whole graph after making a single change. I'm not saying there isn't room for this kind of implementation, just pointing out the main weakness.
What about this kind of organization? (Note I haven't read through the Graphs.jl code in a long time, so this is surely not idiomatic.)
immutable G # this is a "DynamicUndirectedGraph", but that was too long to type
links::Array{Set{Int},1}
deleted::Array{Bool, 1}
end
function add_edge!(g::G, i, j)
push!(g.links[i], j)
push!(g.links[j], i)
end
function delete_vertex!(g, i)
for j in g.links[i]
delete!(g.links[j], i)
end
empty!(g.links[i])
g.deleted[i] = true
end
julia> g = G([Set{Int}() for i = 1:4], [false for i = 1:4])
G([Set{Int64}(),Set{Int64}(),Set{Int64}(),Set{Int64}()],Bool[false,false,false,false])
julia> add_edge!(g,1,2); add_edge!(g,1,3); add_edge!(g,2,3); add_edge!(g,3,4)
Set([3])
julia> g
G([Set([2,3]),Set([3,1]),Set([4,2,1]),Set([3])],Bool[false,false,false,false])
julia> delete_vertex!(g, 2)
true
julia> g
G([Set([3]),Set{Int64}(),Set([4,1]),Set([3])],Bool[false,true,false,false])
Note that this also allows an easy implementation of delete_edge!(g, i, j)
.
I think one of the issues with this approach is that the graph still requires incidence lists in order to take advantage of things like dijkstra_shortest_paths
- and there's a tradeoff between ease/speed of graph modification and ease/speed of generating the incidence list (among other attributes - there's no practical way of implementing edge_index
in that example above, and that's a required interface).
Unless I'm completely wrong in my understanding of that code, in which case feel free to ignore.
ETA: I spent a fair amount of time on this yesterday, so wanted to get these thoughts down while they were still fresh: there seems to be no great way of creating dynamic graphs (those with removable vertices/edges) that doesn't have a tradeoff either in standard performance or in reindexing/removal.
In my particular case, I do one set of removals for every couple hundred thousand other operations, so it's better to have the removals (via reindex!
) be expensive relative to the other operations. @timholy - you're right that "dynamic" may be a misnomer.
There are also hidden tradeoffs in the code above (and my code) - multigraphs are specifically disallowed. That is, you can create a multigraph, but a call to delete_edge!
will remove all of them.
In my implementation the graph is undirected, so every outgoing link is also an incoming one. Thus, g.links[i]
is the incidence list for vertex i
.
To me it seems that vertex_index
and edge_index
are tightly coupled to a particular implementation (based on Vector containers); personally I'm skeptical that those should be considered first-class citizens.
All the rest of the functions in http://graphsjl-docs.readthedocs.org/en/latest/interface.html do look more generic, though.
@timholy vertex_index
is ok; but edge_index
- which is used for dijkstra_shortest_paths
, for example, is impossible with sets (unless there's some functionality I don't know about)*. @lindahua has already stated that we should be revisiting indexing, but after having tried several alternate implementations over the last couple of months, I don't know a better way of doing it (except for simplifying the graph structure entirely - see, e.g., #143.
*What made this particularly frustrating yesterday is that there was nothing in the documentation that described an additional constraint of edge_index
- namely, that each index needed to be <= num_vertices
. That undocumented limitation destroyed a clever bit of work I did using object_id
to create unique edge indices.
It seems like edge_index
is only used to retrieve edge_properties
, and (while I don't understand the Graphs.jl code in detail) I don't see why one couldn't fetch edge properties based on Pair(i,j)
.
Note also that dijkstra_shortest_paths
does not declare a dependency on the EdgeMap
interface.
Oy. You may be right there - it's begun to blur. ISTR there was some functionality that didn't work without edge_index
, but perhaps I was mistaken that it was d_s_p
. I'll try to remember correctly. d_s_p
requires the same of vertex_index
, however - namely, that each index be <= num_vertices
. That's why object_id
didn't work for vertices.