matplotlib
[Bug]: Hist2d is less sensitive to narrow bins in new version, resulting in different plots
Bug summary
I was redoing an old analysis, which previously showed narrow bands in the histogram:
(matplotlib 3.3.4)
and when I tried it again, with exactly the same code and the same data, the bands were gone:
(matplotlib 3.5.1)
Code for reproduction
axe.hist2d(px, py,range=[[0,2047],[0,30]], bins=[2048,30],cmin=1)
Actual outcome
(matplotlib 3.5.1)
Expected outcome
(matplotlib 3.3.4)
Additional information
No response
Operating system
No response
Matplotlib Version
3.5.1
Matplotlib Backend
No response
Python version
No response
Jupyter version
No response
Installation
No response
Please make a reproducible minimal example. We can't test this as-is.
If you zoom in, do the bands come back? Can you also check that np.histogram2d behaves the same with both numpy versions?
My knee-jerk diagnosis is that either something changed with how we are doing anti-aliasing on pcolormesh, or we have shifted something in the rendering by a few pixels so that it is hitting the AA code differently. If I am reading that right, there are something like 2-3x as many bins in the x-direction as there are pixels available.
@jklymak
minimal code sample:
import matplotlib.pyplot as plt
import numpy as np

size = 2048
px = []
py = []
std = 3
cases = 100
for i in range(size):
    for j in range(cases):
        px.append(i)
        m = 10 * i / size
        if i % 32 == 0:
            py.append(np.random.normal(1.5 * m, std * 2))
        else:
            py.append(np.random.normal(m, std))

fig, axe = plt.subplots(1, 1, figsize=(15, 7))
axe.hist2d(px, py, range=[[0, 2047], [0, 30]], bins=[2048, 30], cmin=1)
matplotlib 3.3.4
matplotlib 3.5.1
@tacaswell
If you zoom in, do the bands come back? Can you also check that np.histogram2d behaves the same with both numpy versions?
No, they do not, at least not with PNG output or in a notebook. Checking np.histogram2d would actually be hard for me to do right now, but I think it can be excluded because:
My knee-jerk diagnosis is that either something changed with how we are doing anti-aliasing on pcolormesh, or we have shifted something in the rendering by a few pixels so that it is hitting the AA code differently. If I am reading that right, there are something like 2-3x as many bins in the x-direction as there are pixels available.
It seems it must be the rendering, since when written to PDF both plots are roughly the same.
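For anyone who wants to rule out the binning step directly, the np.histogram2d check asked for above could look roughly like the sketch below. It uses a seeded, vectorised stand-in for the data in the minimal example (the seed and the variable names are assumptions, not part of the original report) and compares the counts returned by hist2d with those from np.histogram2d. Running it in both environments and comparing the printed values would show whether the counts themselves differ.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Seeded stand-in for the reporter's data (assumption: same shape of data,
# generated with NumPy's Generator instead of the original Python loops).
rng = np.random.default_rng(0)
size, cases, std = 2048, 100, 3
i = np.repeat(np.arange(size), cases)
m = 10 * i / size
banded = i % 32 == 0
px = i.astype(float)
py = np.where(banded, rng.normal(1.5 * m, 2 * std), rng.normal(m, std))

bins = [2048, 30]
xy_range = [[0, 2047], [0, 30]]

# Pure NumPy binning, no rendering involved.
counts, xedges, yedges = np.histogram2d(px, py, bins=bins, range=xy_range)

# The same binning through Matplotlib.
fig, axe = plt.subplots(1, 1, figsize=(15, 7))
h, hx, hy, mesh = axe.hist2d(px, py, bins=bins, range=xy_range, cmin=1)

# hist2d replaces counts below cmin with NaN, so mask the NumPy result the
# same way before comparing.
expected = np.where(counts < 1, np.nan, counts)
same_counts = np.array_equal(h, expected, equal_nan=True)

print("hist2d matches np.histogram2d:", same_counts)
print("total counted samples:", int(counts.sum()))
```

If the counts agree, any visible difference between versions has to come from how the mesh is drawn rather than from the histogram.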
I think we just need a FAQ entry (https://matplotlib.org/stable/users/faq/index.html) for these sorts of issues. Basically, if you have more data cells than pixels, we (or any rendering engine) have to decide which ones to show. We don't anti-alias pcolormesh cells like we do imshow.
@jklymak I literally wouldn't have published a few papers if I had seen the version from the new matplotlib. I don't know how this can be solved within the code, but I would strongly suggest going back to the previous rendering, since:
- The new way hurts reproducibility of plots with matplotlib
- Very crucial things on the plot can be missed with the way the new plots are made.
- AFAIK anti-aliasing is there to make plots look nice, and if I cared about that I would opt in to it, but a good representation of the underlying data should take precedence.
It doesn't help matplotlib as a serious plotting library if a plot loses information because of a rendering issue (which wasn't there before).
The contention is you had a rendering issue in both versions, just that it was different.
For instance, in the above, at least on my screen, the 3.5.1 version has more detail than the 3.3.4 version. Maybe you got them backwards, but regardless, you ideally want two pixels for each bin, which for your plot means roughly 275 dpi (4096 dots / 15 inches). And practically you need more because your axes don't reach both sides of the figure, so round up to at least 300 dpi. Anything less and you will get aliasing.
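The arithmetic above can be written out as a small helper. This is only a sketch: min_dpi is a hypothetical name, and the second estimate assumes Matplotlib's default subplot margins (left=0.125, right=0.9), which the report does not state.

```python
import math


def min_dpi(n_bins, axes_width_in, pixels_per_bin=2):
    """Smallest integer dpi giving each bin at least `pixels_per_bin` pixels."""
    return math.ceil(n_bins * pixels_per_bin / axes_width_in)


n_bins = 2048       # bins along x in the report
fig_width_in = 15   # figsize=(15, 7)

# Lower bound if the axes spanned the full figure width: 4096 dots / 15 inches.
full_width_dpi = min_dpi(n_bins, fig_width_in)

# Assumption: default subplot margins, so the axes cover 77.5% of the width.
axes_width_in = fig_width_in * (0.9 - 0.125)
default_margin_dpi = min_dpi(n_bins, axes_width_in)

print(full_width_dpi, default_margin_dpi)
```

The narrower the axes are relative to the figure, the higher the dpi needed, so it is worth computing the number for the actual layout instead of relying on a rule of thumb.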
If you use a pdf viewer it will anti-alias for you and do a better job of dealing with the singletons by averaging visual pixels. We do a version of that for imshow (https://matplotlib.org/stable/gallery/images_contours_and_fields/image_antialiasing.html) but we do not do that for pcolormesh rasters.
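As a rough illustration of the imshow route, one could bin with np.histogram2d and hand the counts to imshow, which goes through the image resampling described at the link above when the default interpolation setting is in effect. The data here is a seeded stand-in (an assumption), and unlike cmin=1 this sketch draws empty bins as zero instead of leaving them blank.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Stand-in data (assumption): any px/py arrays would do here.
rng = np.random.default_rng(0)
px = rng.uniform(0, 2047, 200_000)
py = rng.normal(10, 3, 200_000)

counts, xedges, yedges = np.histogram2d(
    px, py, bins=[2048, 30], range=[[0, 2047], [0, 30]]
)

fig, ax = plt.subplots(figsize=(15, 7))
im = ax.imshow(
    counts.T,          # imshow expects rows along y and columns along x
    origin="lower",    # put the first y bin at the bottom, like hist2d
    extent=[xedges[0], xedges[-1], yedges[0], yedges[-1]],
    aspect="auto",     # let the image fill the axes instead of square pixels
    # interpolation is deliberately left at the rcParams default
)
fig.colorbar(im, ax=ax)
```

This only works because the bins are uniform; for irregular bin edges pcolormesh is still the right tool.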
Did we change something about drawing edges by default on pcolormesh?
Maybe? All I remember us working on for pcolormesh was the shading, which shouldn't affect this...
Snapping was changed, so maybe something related to that? https://github.com/matplotlib/matplotlib/pull/16090
@mmajewsk try setting snap=False to get the old (still aliased) behaviour.
I can confirm that snap=False gives the 3.3.4 behaviour.
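For reference, a minimal sketch of the workaround: hist2d forwards extra keyword arguments to pcolormesh, so snap=False can be passed straight through. The data below is a seeded stand-in (an assumption), not the reporter's.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Stand-in data (assumption).
rng = np.random.default_rng(0)
px = rng.integers(0, 2048, 200_000)
py = rng.normal(10, 3, 200_000)

fig, axe = plt.subplots(1, 1, figsize=(15, 7))
h, xedges, yedges, mesh = axe.hist2d(
    px, py,
    range=[[0, 2047], [0, 30]],
    bins=[2048, 30],
    cmin=1,
    snap=False,  # forwarded to pcolormesh; turns off pixel snapping of cell edges
)

# The returned QuadMesh reports the setting that was applied.
print(mesh.get_snap())
```

The output is still aliased whenever there are more bins than pixels; this only changes which cells end up visible.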
The fundamental issue still remains that pcolormesh (or hist2d) with more bins than pixels is going to be aliased. I strongly recommend that folks who want to do raster outputs save in a quite high dpi and then reduce visually using an external package. Theoretically Matplotlib could do that, but currently we do not. For it to work properly you need the whole image to be composed in high definition and no matter what high definition you use, it will be too low for someone's plot.
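One possible way to follow that recommendation is sketched below: render at a multiple of the target dpi, then box-average down by the same factor. Pillow is used here purely as an example of an "external package" (an assumption; any image tool that averages pixels would do), and the target dpi of 100 and the factor of 3 are illustrative values.

```python
import io

import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image  # assumption: Pillow as the external downsampling package

# Stand-in data (assumption).
rng = np.random.default_rng(0)
px = rng.integers(0, 2048, 200_000)
py = rng.normal(10, 3, 200_000)

fig, axe = plt.subplots(1, 1, figsize=(15, 7))
axe.hist2d(px, py, range=[[0, 2047], [0, 30]], bins=[2048, 30], cmin=1)

target_dpi = 100  # resolution wanted for the final image (illustrative)
factor = 3        # oversampling factor (illustrative)

# Render the whole figure at factor x the target resolution.
buf = io.BytesIO()
fig.savefig(buf, format="png", dpi=target_dpi * factor)
buf.seek(0)

big = Image.open(buf)
# Image.reduce averages factor x factor pixel blocks, so a narrow bin
# contributes to the output pixel instead of being dropped.
small = big.reduce(factor)

out = io.BytesIO()
small.save(out, format="PNG")  # or small.save("hist2d_small.png")
```

Text and lines get averaged too, so they look slightly softer than in a direct render at the target dpi.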
I'm going to close this, because I don't think there is much we can do about this issue...