get.edge.ids() does not handle multi-edges
I would expect the following to return 1, 2, 3 instead of 3:
> g <- make_graph(c(1,2, 1,2, 1,2))
> get.edge.ids(g, c(1,2), multi=T)
[1] 3
I am using the current dev branch.
The relevant function should be:
https://github.com/igraph/rigraph/blob/dev/tools/stimulus/rinterface.c.in#L8363
which dispatches to igraph_get_eids_multi() as it should. What am I misunderstanding here, and why does it not work?
Not being familiar enough with R/igraph internals yet, it is unclear to me what C_R_igraph_get_eids is. Does it map to igraph_get_eids through stimulus (which would explain the behaviour) or to R_igraph_get_eids (which is what should happen)?
C_R_igraph_get_eids is actually the same as R_igraph_get_eids. The NAMESPACE file of the igraph R package arranges for all functions listed in the R-C glue code (see init.c) to be registered with a C_ prefix in the R namespace; this is what the last line of the NAMESPACE file does:
useDynLib(igraph, .registration = TRUE, .fixes = "C_")
So, when you call get.edge.ids(), it calls C_R_igraph_get_eids (in the R namespace), which maps to R_igraph_get_eids (in the C namespace). I don't understand yet why it returns a single edge ID, though.
Apparently igraph_get_eids_multi() still needs you to define the same pair multiple times if you want all edge IDs between them; this is in the C docs:
This function handles multiple edges properly, i.e. if the same pair is given multiple times and they are indeed connected by multiple edges, then each time a different edge ID is reported.
You are right, that indeed works:
> get.edge.ids(g, c(1,2, 1,2, 1,2), multi=T)
[1] 3 2 1
> get.edge.ids(g, c(1,2, 1,2, 1,2), multi=F)
[1] 3 3 3
This is quite confusing though ...
I wonder what the best way is to:
- Get all edges between a pair of vertices. With get.edge.ids and count_multiple it is possible, but painful.
- Group parallel edges together. For example, given 1-2, 2-3, 1-2, 2-3, 2-3, one might want to get the edge ID sets ((1, 3), (2, 4, 5)). Possible based on the above, but even more painful.
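For reference, both operations can be sketched in base R on a plain two-column edge list, where the row index plays the role of the edge ID. This is only an illustration under stated assumptions: the helper names edge_key, edge_ids_between and group_parallel_edges are hypothetical (they are not igraph functions), edges are treated as undirected, and for a real graph the matrix would presumably come from something like as_edgelist(g).

```r
# Edge list from the example: 1-2, 2-3, 1-2, 2-3, 2-3 (row index = edge ID)
el <- matrix(c(1,2, 2,3, 1,2, 2,3, 2,3), ncol = 2, byrow = TRUE)

# Canonical key per edge, ignoring direction (assumes an undirected graph)
edge_key <- function(el) {
  paste(pmin(el[, 1], el[, 2]), pmax(el[, 1], el[, 2]), sep = "-")
}

# All edge IDs between one pair of vertices (hypothetical helper)
edge_ids_between <- function(el, from, to) {
  which(edge_key(el) == paste(min(from, to), max(from, to), sep = "-"))
}

# Edge IDs grouped by vertex pair, in order of first appearance
# (hypothetical helper)
group_parallel_edges <- function(el) {
  key <- edge_key(el)
  unname(split(seq_len(nrow(el)), factor(key, levels = unique(key))))
}

edge_ids_between(el, 2, 3)   # 2 4 5
group_parallel_edges(el)     # a list holding c(1, 3) and c(2, 4, 5)
```

String keys are slow for large graphs, so this says nothing about how a proper implementation in the C core should perform; it only pins down the semantics being asked for.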
I think it's also very confusing that which_multiple (and igraph_is_multiple()) does not mark the first parallel edge between any vertex pair as "true". We should review these behaviours in the context of actual practical use cases, as well as performance considerations.
Not marking the first edge as true makes sense depending on the context; in R or NumPy you can use a Boolean vector for indexing, so indexing an edge list with the result of !is_multiple() gives you a subset of edges in which exactly one edge is kept for each connected vertex pair.
Not marking the first edge as true makes sense depending on the context
This is also the semantics of the duplicated() function in R.
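The duplicated() analogy can be made concrete in base R. The vector of vertex-pair keys below is made-up illustrative data rather than the output of any igraph call; the sketch shows that the first occurrence stays unmarked, that negating the result keeps one edge per vertex pair, and that marking every member of a multi-edge group takes a second pass from the end.

```r
# Vertex-pair keys for edges 1-2, 2-3, 1-2, 2-3, 2-3, 3-4 (illustrative data)
key <- c("1-2", "2-3", "1-2", "2-3", "2-3", "3-4")

# duplicated() leaves the first occurrence of each key unmarked
is_repeat <- duplicated(key)
is_repeat            # FALSE FALSE  TRUE  TRUE  TRUE FALSE

# Boolean indexing with the negation keeps one edge per vertex pair
which(!is_repeat)    # 1 2 6

# Marking every edge that belongs to a multi-edge group, first one included
in_multi_group <- is_repeat | duplicated(key, fromLast = TRUE)
in_multi_group       # TRUE  TRUE  TRUE  TRUE  TRUE FALSE
```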
As for get.edge.ids(), the contract is that we return a single edge id for each node-pair queried. See the example in the manual.
Btw. changing these semantics would be breaking changes, so if you want new semantics I suggest that you introduce new functions, and prefer them over the current ones, e.g. in the documentation.
Let's keep this open for a little while. When I get the time, I'll think through whether it's necessary to have functions with different semantics, and will write up a couple of example use cases in a new issue, then close this one. Until then, let this issue serve as a reminder for me (this is why I assigned it to myself).
Note: in igraph 0.10 we now have igraph_get_all_eids_between(), which will probably solve the problem that is outlined here. We "just" need to transition the C core of the R interface from 0.9 to 0.10 without breaking too many things :)