OldGraphs.jl Testing Branch (Naive edge/vertex removal)

@pozorvlak Here is the failed attempt at testing remove_edge!. Even though testing works here, it still does not work locally.

May 08 '14 21:05 tcfuji

This may explain why the build is failing:

julia> g = simple_graph(5, is_directed=false)
Undirected Graph (5 vertices, 0 edges)

julia> add_edge!(g, 1, 2)
edge [1]: 1 -- 2

julia> add_edge!(g, 1, 3)
edge [2]: 1 -- 3

julia> add_edge!(g, 1, 4)
edge [3]: 1 -- 4

julia> add_edge!(g, 1, 5)
edge [4]: 1 -- 5

julia> add_edge!(g, 2, 5)
edge [5]: 2 -- 5

julia> add_edge!(g, 4, 5)
edge [6]: 4 -- 5

julia> g
Undirected Graph (5 vertices, 6 edges)

julia> edges(g)
6-element Array{Edge{Int64},1}:
 edge [1]: 1 -- 2
 edge [2]: 1 -- 3
 edge [3]: 1 -- 4
 edge [4]: 1 -- 5
 edge [5]: 2 -- 5
 edge [6]: 4 -- 5

julia> remove_edge!(g, 1, 3)
A edge [2]: 3 -- 1
B edge [2]: 3 -- 1
removed edge at index 1
C edge [1]: 2 -- 1
C edge [2]: 3 -- 1
removed edge at index 2

julia> remove_edge!(g, 1, 4)
A edge [3]: 4 -- 1
B edge [3]: 4 -- 1
removed edge at index 1
C edge [1]: 2 -- 1
C edge [3]: 4 -- 1
removed edge at index 2

julia> remove_edge!(g, 2, 5)
ERROR: BoundsError()

julia> edges(g)
4-element Array{Edge{Int64},1}:
 edge [1]: 1 -- 2
 edge [3]: 1 -- 4
 edge [5]: 2 -- 5
 edge [6]: 4 -- 5

julia> remove_edge!(g, 2, 5)
ERROR: BoundsError()



julia> remove_edge!(g, 4, 5)
ERROR: BoundsError()

julia> remove_edge!(g, 1, 2)
A edge [1]: 2 -- 1
B edge [1]: 2 -- 1
removed edge at index 1
C edge [1]: 2 -- 1
removed edge at index 1

julia> g
Undirected Graph (5 vertices, 3 edges)

julia> add_edge!(g, 3, 4)
edge [4]: 3 -- 4

julia> remove_edge!(g, 3, 4)
A edge [4]: 4 -- 3
B edge [4]: 4 -- 3
removed edge at index 1
C edge [4]: 4 -- 3
removed edge at index 1

julia> remove_edge!(g, 2, 5)
ERROR: BoundsError()

julia> edges(g)
3-element Array{Edge{Int64},1}:
 edge [3]: 1 -- 4
 edge [5]: 2 -- 5
 edge [6]: 4 -- 5

julia> edge = edges(g)[2]
edge [5]: 2 -- 5

julia> edge_index(edge, g)
5

julia> splice!(g.edges, edge_index(edge, g))
ERROR: BoundsError()

julia> length(g.edges)
3

In other words, the edge index does not change as the array of edges changes. Hence, it is possible that an edge will have an index that is greater than the length of the edge array. I think I can hack my way around this but I'm wondering if this is cause for enough concern that Graphs.jl should change the way it indexes edges. Any thoughts?

May 09 '14 05:05 tcfuji

Ah, yes, that makes sense. Huh. I'll think about it, but I'll be busy moving house for the next few days so can't devote too much attention to it right now - sorry!

May 09 '14 08:05 pozorvlak

No worries. I need a break from this too. :)

For future reference: If e in edges(g) == true, shouldn't it follow that revedge(e) in edges(g) == true for undirected graphs?

May 09 '14 16:05 tcfuji

@pozorvlak Any headway regarding edge indexing?

May 26 '14 01:05 tcfuji

@tfgit, @pozorvlak, I ran across a similar issue with deletes from OrderedDicts (in DataStructures.jl). Just a couple of thoughts:

If it's okay for edge numbers to change, then you could just "unset" the entry, instead of deleting it, and periodically compact and renumber (that's how OrderedDicts handles it).
As long as the list is maintained in sorted order, searchsortedfirst will give you O(n log n) lookups of the index, or a Dict will give you O(1) (with a large constant). Neither is as nice as direct indexing, of course, and especially for large graphs, you would really want something better. But it would work.

May 26 '14 05:05 kmsquire

A graph needs additional mechanism to maintain the edge list (e.g. renumbering, internal hashing, etc) in order to support the functionality of removing edges. This would involve considerable costs, and is not something that every graph need to have.

I think a better approach is to introduce new graph types and only implement remove_edge over those graphs (and maybe we introduce a new interface name for them).

May 26 '14 13:05 lindahua

@lindahua From the user's perspective, the ability to remove or exclude edges/vertices is an essential part of graph construction. It would be strange for the default graph object to not have that kind of functionality. Perhaps the opposite would be more appropriate: let the default graph type have edge/vertex removal and introduce a "fast graph" type without the removal abilities.

I'm still willing to help out with this (I can't use the package without edge/vertex removal), but can you (or anyone) steer me to the right direction regarding making my PRs of acceptable quality? I'm not adept at optimizing Julia.

Aug 21 '14 09:08 tcfuji

I'd like to second @tcfuji on this. There should be (at least) a class of graph types that allows for removal of edges / vertices. Applications modeling dynamic graphs demand such functionality and it's present in other graph libraries.

If there are performance issues related to node/edge removal, then those should be explicitly documented, but there is a practical need for such behavior.

Dec 31 '14 16:12 sbromberger

I suspect if somebody submits a pull request, others will chime in with recommendations about how to make it more efficient.

Dec 31 '14 22:12 timholy

Comments appreciated: https://gist.github.com/sbromberger/1dddcaed48bf0f487ec9

This works for SimpleGraph. It was much easier to regenerate a graph with the removed components than it was to try to reindex it on the fly.

Example:

g = simple_graph(4)
add_edge!(g,1,2); add_edge!(g,1,3); add_edge!(g,2,3); add_edge!(g,3,4)
dynamize!(g)
delete_vertex!(g,2)
reindex!(g)

(Edited to change the gist)

(Thanks to @mlubin for the idea of regenerating.)

Jan 01 '15 21:01 sbromberger

I would call this "semi-dynamic"; in particular, if you have a work flow in which you (say) use a priority queue to make some decisions after each node/edge addition/deletion, then reindex! will be a bottleneck, because you'll have to regenerate the whole graph after making a single change. I'm not saying there isn't room for this kind of implementation, just pointing out the main weakness.

What about this kind of organization? (Note I haven't read through the Graphs.jl code in a long time, so this is surely not idiomatic.)

immutable G  # this is a "DynamicUndirectedGraph", but that was too long to type
    links::Array{Set{Int},1}
    deleted::Array{Bool, 1}
end

function add_edge!(g::G, i, j)
    push!(g.links[i], j)
    push!(g.links[j], i)
end

function delete_vertex!(g, i)
    for j in g.links[i]
        delete!(g.links[j], i)
    end
    empty!(g.links[i])
    g.deleted[i] = true
end

julia> g = G([Set{Int}() for i = 1:4], [false for i = 1:4])
G([Set{Int64}(),Set{Int64}(),Set{Int64}(),Set{Int64}()],Bool[false,false,false,false])

julia> add_edge!(g,1,2); add_edge!(g,1,3); add_edge!(g,2,3); add_edge!(g,3,4)
Set([3])

julia> g
G([Set([2,3]),Set([3,1]),Set([4,2,1]),Set([3])],Bool[false,false,false,false])

julia> delete_vertex!(g, 2)
true

julia> g
G([Set([3]),Set{Int64}(),Set([4,1]),Set([3])],Bool[false,true,false,false])

Jan 02 '15 16:01 timholy

Note that this also allows an easy implementation of delete_edge!(g, i, j).

Jan 02 '15 16:01 timholy

I think one of the issues with this approach is that the graph still requires incidence lists in order to take advantage of things like dijkstra_shortest_paths - and there's a tradeoff between ease/speed of graph modification and ease/speed of generating the incidence list (among other attributes - there's no practical way of implementing edge_index in that example above, and that's a required interface).

Unless I'm completely wrong in my understanding of that code, in which case feel free to ignore.

ETA: I spent a fair amount of time on this yesterday, so wanted to get these thoughts down while they were still fresh: there seems to be no great way of creating dynamic graphs (those with removable vertices/edges) that doesn't have a tradeoff either in standard performance or in reindexing/removal.

In my particular case, I do one set of removals for every couple hundred thousand other operations, so it's better to have the removals (via reindex!) be expensive relative to the other operations. @timholy - you're right that "dynamic" may be a misnomer.

There are also hidden tradeoffs in the code above (and my code) - multigraphs are specifically disallowed. That is, you can create a multigraph, but a call to delete_edge! will remove all of them.

Jan 02 '15 16:01 sbromberger

In my implementation the graph is undirected, so every outgoing link is also an incoming one. Thus, g.links[i] is the incidence list for vertex i.

To me it seems that vertex_index and edge_index are tightly coupled to a particular implementation (based on Vector containers); personally I'm skeptical that those should be considered first-class citizens.

Jan 02 '15 16:01 timholy

All the rest of the functions in http://graphsjl-docs.readthedocs.org/en/latest/interface.html do look more generic, though.

Jan 02 '15 16:01 timholy

@timholy vertex_index is ok; but edge_index - which is used for dijkstra_shortest_paths, for example, is impossible with sets (unless there's some functionality I don't know about)*. @lindahua has already stated that we should be revisiting indexing, but after having tried several alternate implementations over the last couple of months, I don't know a better way of doing it (except for simplifying the graph structure entirely - see, e.g., #143.

*What made this particularly frustrating yesterday is that there was nothing in the documentation that described an additional constraint of edge_index - namely, that each index needed to be <= num_vertices. That undocumented limitation destroyed a clever bit of work I did using object_id to create unique edge indices.

Jan 02 '15 16:01 sbromberger

It seems like edge_index is only used to retrieve edge_properties, and (while I don't understand the Graphs.jl code in detail) I don't see why one couldn't fetch edge properties based on Pair(i,j).

Jan 02 '15 17:01 timholy

Note also that dijkstra_shortest_paths does not declare a dependency on the EdgeMap interface.

Jan 02 '15 17:01 timholy

Oy. You may be right there - it's begun to blur. ISTR there was some functionality that didn't work without edge_index, but perhaps I was mistaken that it was d_s_p. I'll try to remember correctly. d_s_p requires the same of vertex_index, however - namely, that each index be <= num_vertices. That's why object_id didn't work for vertices.

Jan 02 '15 17:01 sbromberger

OldGraphs.jl OldGraphs.jl copied to clipboard

Testing Branch (Naive edge/vertex removal)

OldGraphs.jl
OldGraphs.jl copied to clipboard