Feature request: arrow-shaped flows?

Open ccotroneo opened this issue 3 years ago • 1 comment

Hi Cory,

thanks a lot for this package, it really looks great! Just a quick question: is there already a way or will there be a way to shape the different flow lines as arrows? :)

ccotroneo avatar Oct 23 '20 12:10 ccotroneo

Hi @ccotroneo — i'm glad the package appeals, and thanks for the suggestion! Please pardon the long-winded response.

I can see the appeal of this feature, given that the typical alluvial plot is only directional insofar as the horizontal axis has an obvious meaning (like a unit of time or the stage of a process). So arrows would help indicate directionality in cases where the axis itself doesn't.

why i'm not sure

I'm resistant in part because i have trouble coming up with a natural use case, in which arrows would add meaning to the plot that cannot be added by the axes. Do you have a specific one in mind? (One possibility is the vaccination survey data installed with ggalluvial, which includes data from three surveys conducted at different times but does not include their dates.)

My guiding principle here is that the statistical graphic should be completely determined by the data. For example, a natural use case for arrows is a network model of trade; visualized as a graph, two nodes (representing, say, countries) may have two, one, or no arrows connecting them, depending on which imports from the other.

Another reason is that i've tried to distinguish alluvial plots from the much larger species of Sankey diagrams. These usually represent directed flows and often have much freer forms, which may render horizontal and vertical axes meaningless and make arrows more appropriate and essential (and traditional). I don't know of any such ggplot2 extensions, unfortunately.

how it might work

An important question to ask about this feature would be how the directions of arrows are determined. Presumably one wouldn't just want them to always point rightward (or, more generally, in the direction of the axis[0-9]+ aesthetics or in the direction of the variable passed to x). (Otherwise, the arrows wouldn't communicate anything that the horizontal axis didn't already.) But it would also be infeasible (not to mention ungrammatical) for the user to manually specify the direction of each flow or alluvium element in the plot.

This might be done using an additional aesthetic variable, at least for stat_alluvium(). For data in alluvia/wide form, a single binary variable could indicate the direction of each alluvium (leftward versus rightward). If the data are in lodes/long form, then this variable could still be summarized (throwing an error in case of incompatible values) in order to determine the direction of each alluvium.
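A minimal sketch of that summarizing step for lodes/long data, in base R. The helper `alluvium_direction()` and the column names `alluvium` and `direction` are hypothetical and are not part of ggalluvial; the sketch only shows collapsing a lode-level variable to one direction per alluvium, with an error when the values within an alluvium disagree.

```r
# Hypothetical helper (not a ggalluvial function): collapse a lode-level
# direction variable to a single value per alluvium.
alluvium_direction <- function(alluvium, direction) {
  # group the direction values by alluvium identifier
  by_alluvium <- split(direction, alluvium)
  # count the distinct direction values within each alluvium
  n_distinct <- vapply(by_alluvium, function(d) length(unique(d)), integer(1))
  # incompatible values within an alluvium are an error
  if (any(n_distinct != 1L)) {
    stop("Incompatible directions within alluvia: ",
         paste(names(by_alluvium)[n_distinct != 1L], collapse = ", "))
  }
  # one direction per alluvium, named by alluvium identifier
  vapply(by_alluvium, function(d) as.character(d[1]), character(1))
}

# toy lodes-form data (assumed layout): two alluvia observed at three axes
lodes <- data.frame(
  alluvium  = c(1, 1, 1, 2, 2, 2),
  x         = c("a", "b", "c", "a", "b", "c"),
  direction = c("rightward", "rightward", "rightward",
                "leftward", "leftward", "leftward")
)
dirs <- alluvium_direction(lodes$alluvium, lodes$direction)
```

Here `dirs` holds one direction per alluvium, which a stat layer could then pass along to the geom; how that would be wired into `stat_alluvium()` is left open.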

Flows, using stat_flow(), would not be as easy. For data in alluvia/wide form, the layer could just direct each flow as it would the alluvium it's a part of. For data in lodes/long form, though, how could the user use a lode-level variable, say data$var, to control the direction of flow between pairs of lodes? One way would be to direct the flow between lodes i and j according to sort(data$var[c(i, j)]): from i to j if all(sort(data$var[c(i, j)]) == data$var[c(i, j)]) and from j to i if all(sort(data$var[c(i, j)]) == data$var[c(j, i)]). (When the two values are tied, both conditions hold and the direction is undetermined.) But this would impose a transitive ordering on the lodes that the user might not want: if flows run from i to j and from j to k, then any flow between i and k must run from i to k.
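The sorting rule described above can be sketched in a few lines of base R. The helper `flow_direction()` is hypothetical (not a ggalluvial function), and returning `NA` for tied values is an assumption of this sketch, since the rule itself does not settle ties.

```r
# Hypothetical helper: direct the flow between lodes i and j by the order of
# their values of a lode-level variable `var`.
flow_direction <- function(var, i, j) {
  pair <- var[c(i, j)]
  # assumption of this sketch: a tie leaves the direction undetermined
  if (pair[1] == pair[2]) return(c(from = NA_integer_, to = NA_integer_))
  # already sorted means the flow runs from i to j; otherwise from j to i
  if (all(sort(pair) == pair)) c(from = i, to = j) else c(from = j, to = i)
}

# three lodes with values 3, 1, 2
var <- c(3, 1, 2)
d12 <- flow_direction(var, 1L, 2L)  # from lode 2 to lode 1
d23 <- flow_direction(var, 2L, 3L)  # from lode 2 to lode 3
d13 <- flow_direction(var, 1L, 3L)  # from lode 3 to lode 1
```

The three directions follow the single ordering 2, 3, 1, so a cycle (2 to 3, 3 to 1, 1 to 2) cannot be expressed with one lode-level variable. This is the transitivity constraint in concrete form.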

how it might look

Finally, it might be tricky to draw the heads and tails of flow arrows in a way that does not cause graphical elements to overlap. For example, here's a very basic idea of how a directed flow could look, drawn below a standard flow for comparison:

library(grid)
grid.newpage()
# open 2-plot layout
pushViewport(viewport(width = 1, x = 0, just = "left",
                      layout = grid.layout(2, 1, respect = TRUE)))
# standard flow
pushViewport(viewport(layout.pos.row = 1, layout.pos.col = 1))
grid.xspline(
  x = c(0, .25, .75, 1, 1, .75, .25, 0),
  y = c(0, 0, .8, .8, 1, 1, .2, .2),
  shape = c(0, 1, 1, 0, 0, 1, 1, 0),
  open = FALSE, gp = gpar(fill = "lightgrey")
)
popViewport()
# directed flow
pushViewport(viewport(layout.pos.row = 2, layout.pos.col = 1))
grid.xspline(
  x = c(0, .1, .275, .725, .9, 1, .9, .725, .275, .1, 0, .1),
  y = c(0, 0, 0, .8, .8, .9, 1, 1, .2, .2, .2, .1),
  shape = c(0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0),
  open = FALSE, gp = gpar(fill = "lightgrey")
)
popViewport()
# close 2-plot layout
popViewport()

Created on 2020-10-23 by the reprex package (v0.3.0)

The potential problem with this graphical element is that, in order to prevent the arrow head from overlapping with the lode (rectangle) incident to it (which could cause miscoloration among other problems), the two ends of the basic ribbon are contracted to make room for the head and tail to protrude outward. This isn't a huge problem necessarily, but it would make an alluvial plot somewhat more crowded.

what to do next

If you (or anyone reading this) have suggestions on how to better resolve some of these issues, i'd be interested to hear them! At the moment, though, i don't feel ready to implement such a feature myself.

corybrunson avatar Oct 23 '20 20:10 corybrunson