Gadfly.jl

Default settings with white borders can hide points in dense plots

Open AlexRobson opened this issue 5 years ago • 5 comments

Having just started experimenting with Gadfly, I was confused by the apparent decrease in opacity of the plots as the points became denser.

Essentially, this would produce something expected: plot(y=sort(rand(100,)))

whereas this would produce a barely visible image:

plot(y=sort(rand(1000,)))

This appears to be a related issue: https://github.com/GiovineItalia/Gadfly.jl/issues/618

After understanding the issue, something like the above can be plotted like this: plot(y=sort(rand(1000,)), Geom.point, Theme(highlight_width=0.0mm))

However, ideally this, or some equivalent, would be the default behaviour, to avoid confusion in cases like this one.
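The two cases and the workaround above can be reproduced with a short script. This is a minimal sketch, assuming Gadfly is installed; the explicit `x=1:n` index and the variable names are illustrative additions, not part of the original report.

```julia
using Gadfly

n = 1000
y = sort(rand(n))

# Default theme: every point is drawn with a white outline, so densely
# overlapping points can look washed out.
p_default = plot(x=1:n, y=y, Geom.point)

# Workaround from this report: set the outline (highlight) width to zero.
p_no_outline = plot(x=1:n, y=y, Geom.point, Theme(highlight_width=0.0mm))
```

Displaying `p_default` and `p_no_outline` one after the other should make the difference visible.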

AlexRobson, Mar 20 '19 15:03

In your plot above, the default for Theme(discrete_highlight_color=...) is a function that returns white, so each point gets a white outline: the changing opacity is an illusion. Try adding Theme(discrete_highlight_color=identity). And see this example.

The alpha functionality in that example is new: http://gadflyjl.org/dev/gallery/scales/. You can use it now by doing add Gadfly#master.

Mattriks, Mar 20 '19 21:03

making a change like this would be breaking and would require a major version number bump. not that that should stop us, but we do want to be careful about breaking people's plots.

also, usually when the number of points plotted gets so high that this is a problem, it's an indication that one should be using a different geometry. like a density or contour plot for example.
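As a rough illustration of that suggestion, the sketch below draws the same dense point cloud as a hexagonal-bin plot and as a 2D density contour plot. The synthetic data and variable names are made up for the example, and which geometry suits a given dataset is a judgment call.

```julia
using Gadfly

# Synthetic, densely overlapping data (illustrative only).
x = randn(5000)
y = x .+ randn(5000)

# Hexagonal binning: colour encodes how many points fall in each cell.
p_hex = plot(x=x, y=y, Geom.hexbin)

# Contours of a 2D kernel density estimate.
p_density = plot(x=x, y=y, Geom.density2d)
```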

i've added the 2.0 milestone so we can think about this further as the time approaches. thanks for the feedback!

bjarthur, Mar 22 '19 12:03

AFAIK, we're one of the few plotting libraries that use the white border approach by default.

> also, usually when the number of points plotted gets so high that this is a problem, it's an indication that one should be using a different geometry. like a density or contour plot for example.

This is my thought as well.

tlnagy, Mar 22 '19 20:03

Perhaps a community poll? Here are some suggestions (which users can test):

  1. Theme(discrete_highlight_color=c->"white")
  2. Theme(discrete_highlight_color=c->nothing)
  3. Theme(discrete_highlight_color=identity)
  4. Other suggestions?

Probably better to do a poll via Julia discourse. We should agree here on the possible choices first!
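To try the three suggestions side by side, something like the following sketch may help. It assumes Gadfly is installed; the labels, the `options` list, and the test data are illustrative and not part of the proposal itself.

```julia
using Gadfly

n = 1000
y = sort(rand(n))

# Candidate outline-colour functions from the list above. Each one maps a
# point's fill colour to the colour used for its outline.
options = [
    ("1. white outline", c -> "white"),
    ("2. no outline", c -> nothing),
    ("3. outline same as fill", identity),
]

plots = [plot(x=1:n, y=y, Geom.point, Guide.title(label),
              Theme(discrete_highlight_color=f))
         for (label, f) in options]

# Stack the three plots vertically so they can be compared directly.
comparison = vstack(plots...)
```

Changing `n` shows how each option behaves as the points become denser.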

Mattriks, Mar 22 '19 23:03

For context, in my case I was exploring some time-series data of length O(1000). Regarding the comment on using different geometries: I don't doubt that in my case there may be better representations of the data (I've switched to Gadfly pretty recently, and I'm still learning its semantics). My issue came down to some brief initial confusion about why this behaviour was occurring. Still, it seems that others similarly new to the package could run into this, especially with time-series data. So, fwiw, I'd welcome a change to the defaults (or at least a discussion of one for a future milestone), potentially bringing them more in line with other plotting libraries, so that the behaviour is closer to what people expect when starting out with the package.

Thanks for your responses! :)

AlexRobson, Mar 23 '19 12:03