Gadfly.jl
Gadfly.jl copied to clipboard
hexbin misalignment
Spoilers: I'm opening this issue even though I think it's partly a bug upstream in https://github.com/GiovineItalia/Hexagons.jl In fact, I'm hoping to open a PR there, though it may take me a while as I have no idea what I'm doing! If someone wants to beat me to it, please do. I'm also not sure about the etiquette of the upstream change, though luckily Gadfly is Hexagon's only direct dependent according to JuliaHub.
Apparent bug: rendered hexbins are misaligned with input data. Here's a small reproducing example:
plot(x = [-10,0,10], y = [-10,0,10], Geom.hexbin(xbincount=4, ybincount=4))
Notice how no hexagon covers (-10,-10)
or (10,10)
, and the bottom left hexagon covers no data points at all! It appears the hexagons have been shifted down and to the left from their proper positions.
I believe the issue begins here: https://github.com/GiovineItalia/Gadfly.jl/blob/master/src/statistics.jl#L1249-L1251
h = convert(HexagonOffsetOddR, cube_round(x - xmin + xspan/2,
y - ymin + yspan/2,
xsize, ysize))
An input (x,y)
is mapped to cubic hexagon coordinates (of the containing hexagon) using cube_round
from Hexagons.jl
. Unfortunately, cube_round
returns the cubic coordinates of the hexagon that contains an input (x,y)
assuming the zero-cubic-coordinate hexagon has xy-center (1,1)
. This is nasty, and exactly what my future upstream PR will hopefully fix. See this old issue: https://github.com/GiovineItalia/Hexagons.jl/issues/13
julia> Hexagons.cube_round(0,0)
HexagonCubic(0, 1, -1)
julia> Hexagons.cube_round(1,1)
HexagonCubic(0, 0, 0)
But when the hexagon-space coordinates are translated back to (x,y)
coordinates a few lines later:
x, y = center(HexagonOffsetOddR(idx[1], idx[2]), xsize, ysize,
xmin - xspan/2, ymin - yspan/2)
The (x,y)
offset corresponding to the center of the coordinate system does not correct for the implicit (1,1)
xy-center of the hexagon coordinate system used in cube_round
.
So the upstream fix is to make cube_round
(and center
) from Hexagons ditch the (1,1) alignment, i.e. the origin of the cubic hexagonal grid system should, by default, coincide with the origin of the xy Cartesian plane. That upstream fix will, I weakly believe, fix this misalignment.
As a final note, I think the Gadfly statistics.jl
code could be made a bit more readable as follows, though (if I'm understanding correctly) the final results won't change at all:
- The call to
cube_round
should have first argumentx - (xmin + xspin/2)
, and likewise fory
. Note the additional parentheses relative to the current code. I.e. a pointx
is mapped to its relative offset from the x-axis midpoint of the data. - The call to
center
should have its last argumentsxmin + xspan/2
and likewise fory
; note the change in sign.
This suggested code and the current code perform equivalent coordinate conversions xy -> hex -> xy; the current code just has an unintuitive-to-me translation by xspan
and yspan
and then undoes that translation.
Also worth noting that I tried locally changing the Hexagons.jl cube_round
behavior to be (0,0)
aligned. Hexagons.jl tests break, but the Gadfly result seems much better:
plot(x = [-10,0,10], y = [-10,0,10], Geom.hexbin(xbincount=4, ybincount=4))
But maybe not all the way better, seems like the xy point (0,0)
, but virtue of being at the mid point of the data, should ideally also correspond to the center of HexagonCubic(0,0,0)
. If the recommended changes to statistics.jl
in Gadfly are made, flipping the signs of the xspan/2
and yspan/2
terms:
@Mattriks @tlnagy thoughts?
Ah, I forgot to comment in this issue after making the upstream PR. The linked Hexagons.jl PR (still awaiting review) seems to fix this issue.