rigraph icon indicating copy to clipboard operation
rigraph copied to clipboard

Disturbed layout in subgraph

Open clpippel opened this issue 2 years ago • 15 comments

  • Consider the code below.
## bug / undesired / unexpected behaviour.
g        <- make_ring(10)
g$layout <- layout_in_circle(g)
sg       <- subgraph(g, c(1,2))
## delete_graph_attr(sg, "layout")
plot(sg)

Note that vertices 1 and 2 are plotted multiple times. This is because the layout vector of the original graph is copied to the sub graph.

Version information

  • R/igraph version: [1] igraph_1.4.1

subgraph-layout-bug

clpippel avatar Apr 11 '23 18:04 clpippel

Probably a consequence of R's vector recycling behaviour?

The graph attribute named layout is special in the sense that you can assign a matrix there and igraph's layout_nicely() layout algorithm (which is the default) uses that matrix to decide where the vertices should be. We should adjust the plot.igraph method to make sure that only the first few rows of the matrix are used if the matrix has more rows than the number of vertices in the graph.

ntamas avatar Apr 11 '23 20:04 ntamas

@krlmlr Feel free to re-assign it to one of your teammates if needed.

ntamas avatar Apr 11 '23 20:04 ntamas

Related: https://github.com/igraph/rigraph/wiki/Visualization-in-igraph-2

szhorvat avatar Apr 11 '23 20:04 szhorvat

The problem is that the layout matrix of the subgraph is inherited from the original graph. To avoid this, the graph$layout attribute should not be copied to the subgraph. Or else the coordinates of the missing vertices should be removed (nicer solution).

clpippel avatar Apr 11 '23 20:04 clpippel

graph$layout can contain different types of values. Some of them are safe to copy (layout algorithm names) and some aren't (coordinate matrix). IMO storing the coordinates as a graph attribute (and not as vertex attributes) was a mistake. This is what I tried to describe in the wiki page that summarizes potential improvements to visualization in igraph 2.

@krlmlr @vtraag These sorts of issues are why I think that visualization can't (or shouldn't) be handed off wholesale to another package in igraph 2. Part of it will always be igraph's business. The storage of coordinates or visualization methods, and the expected behaviour with various stored visualization information is just one example.

szhorvat avatar Apr 11 '23 20:04 szhorvat

What are action points here then according to you @szhorvat? A big overhaul of how the coordinates are stored?

maelle avatar Feb 26 '24 14:02 maelle

A big overhaul of how the coordinates are stored?

That's the only thing I think we can do ... will there ever be enough resources for this? I don't know.

See the old igraph 2 planning document: https://github.com/igraph/rigraph/wiki/igraph-2-planning Ambitions got deflated quite a bit since then, as it became here how much all of this would take.

szhorvat avatar Feb 26 '24 15:02 szhorvat

Should this be closed as non planned then, as the idea is tracked in that document?

maelle avatar Feb 26 '24 15:02 maelle

That document does not replace the issue tracker, and is in need of an update. It's up to Kirill to judge how big a project we can take on. There were several ideas floating around, including offloading visualization to other packages entirely (like ggraph). Personally I don't favour that, but I admit there are good reasons to go that route.

I wouldn't close this issue because it describes a very real problem that occurs frequently when working with igraph.

szhorvat avatar Feb 26 '24 15:02 szhorvat

Layouting belongs in igraph, not so sure about plotting.

Reprex:

options(conflicts.policy = list(warn = FALSE))
library(igraph)

g <- make_ring(10)
g$layout <- layout_in_circle(g)
sg <- subgraph(g, c(1, 2))

gs <- make_star(2)
gs$layout <- layout_in_circle(gs)

waldo::compare(sg$layout[, 1], gs$layout[, 1])
#>      old                | new    
#>  [1] 1                  | 1   [1]
#>  [2] 0.809016994374947  - -1  [2]
#>  [3] 0.309016994374947  -        
#>  [4] -0.309016994374947 -        
#>  [5] -0.809016994374947 -        
#>  [6] -1                 -        
#>  [7] -0.809016994374947 -        
#>  [8] -0.309016994374948 -        
#>  [9] 0.309016994374947  -        
#> [10] 0.809016994374947  -
waldo::compare(sg$layout[, 2], gs$layout[, 2])
#>      old                  | new                     
#>  [1] 0                    | 0                    [1]
#>  [2] 0.587785252292473    - 1.22464679914735e-16 [2]
#>  [3] 0.951056516295154    -                         
#>  [4] 0.951056516295154    -                         
#>  [5] 0.587785252292473    -                         
#>  [6] 1.22464679914735e-16 -                         
#>  [7] -0.587785252292473   -                         
#>  [8] -0.951056516295154   -                         
#>  [9] -0.951056516295154   -                         
#> [10] -0.587785252292473   -

Created on 2024-03-03 with reprex v2.1.0

The problem is that layout is stored as a graph attribute where it should probably be a vertex attribute instead. Matrices with m rows and n columns are treated as vectors of length m by the vctrs framework, it would be a good fit. We now only need to teach functions to preferentially look for a vertex attribute "layout", and perhaps to warn when "layout" is assigned as a graph attribute.

krlmlr avatar Mar 03 '24 19:03 krlmlr

Vertex attributes x and y are already treated this way if a graph-level layout attribute is not present. See the source code of layout_nicely -- if there is no graph attribute called layout, cbind(V(graph)$x, V(graph)$y) is used to create a layout matrix.

ntamas avatar Mar 03 '24 23:03 ntamas

A vertex attribute layout could be added to that logic then. We'd have to see if matrices can be used as vectors in Arrow though, for moving attributes to the core. Currently:

m <- matrix(1:6, ncol = 3)

data <- tibble::tibble(id = 1:2, m)
data
#> # A tibble: 2 × 2
#>      id m[,1]  [,2]  [,3]
#>   <int> <int> <int> <int>
#> 1     1     1     3     5
#> 2     2     2     4     6

arrow::as_arrow_table(data)
#> Error: Invalid: All columns must have the same length
arrow::as_arrow_array(data)
#> Error: Invalid: All arrays must have the same length

Created on 2024-03-04 with reprex v2.1.0

@paleolimbot: Does Arrow support matrix columns in principle? If yes, what could the support in R look like? What about Python and other systems?

krlmlr avatar Mar 04 '24 03:03 krlmlr

Can't we simply keep on using $x and $y instead of adding support for $layout on the vertex level? (Note that there's also an optional $z vertex attribute that is used for 3D layouts).

ntamas avatar Mar 04 '24 07:03 ntamas

Does Arrow support matrix columns in principle? If yes, what could the support in R look like? What about Python and other systems?

Yes, there are "fixed-size tensor" and "variable-size tensor" extension types: https://arrow.apache.org/docs/format/CanonicalExtensions.html#fixed-shape-tensor , although neither are supported in the R bindings or nanoarrow (or: there is no built-in conversion to R objects). I prototyped it here when the discussion was ongoing: https://gist.github.com/paleolimbot/c42f068c2b8b98255dbfbe379d905607 .

I am pretty sure that in Python/pyarrow it has built-in/zero-copy convert to numpy. It could in theory be wired up in nanoarrow...I'm in the middle of a refactor to make the conversion process understandable here: https://github.com/apache/arrow-nanoarrow/pull/392 , and after that's done it should be more straightforward how to contribute that.

paleolimbot avatar Mar 04 '24 15:03 paleolimbot

Thanks, @paleolimbot. Looks like we'll stay with plain vector attributes for now.

krlmlr avatar Mar 12 '24 14:03 krlmlr

fixed in #1880

schochastics avatar Jul 08 '25 19:07 schochastics