mpldatacursor
Very hard to pick individual pixels on seaborn heatmaps (pcolormesh)
Here's my test code:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from mpldatacursor import datacursor
fig, ax = plt.subplots(figsize=(15, 10))
arr = np.random.rand(200, 200)
sns.heatmap(np.flip(arr.transpose(), axis=0), cmap="cubehelix", ax=ax)
datacursor()
plt.show()
It seems to work OK at the default zoom, but when zoomed in, the graph almost never responds to mouse clicks, so it's really hard to pick the point you want. Also, the arrow is inaccurate: as you can see in the screenshot, the arrow is pointing to a white pixel, but the z value is showing 0.5. A pixel picked at the default zoom shows the wrong z value, too. You can see this when zoomed out, but it's especially obvious once you zoom in.
This is on a 16 core AMD Threadripper machine with 64GB RAM.
Are you using matplotlib inside a notebook, by chance?
The matplotlib notebook backend has quite a few performance drawbacks and can have odd rendering in some browsers. There's not really much that can be done about that from the mpldatacursor side, unfortunately.
If you're not in a notebook, do you know which matplotlib backend you're using?
What's happening (notebook or not) is that things are being refreshed too frequently. Hover mode in matplotlib normally checks if it's drawn recently and only updates if it hasn't, but that can break in some backends. I can't directly reproduce the issue locally, but that certainly doesn't mean it's not there, just that it's not there with the matplotlib backends I can test.
My script is exactly as written, run from command line in python 3.6. So I assume I'm using the default backend. The performance is the same in notebook with %matplotlib notebook.
You can't repro the accuracy issue, either?
I can't test in python 3.8 because I get an exception:
Traceback (most recent call last):
File "C:\Users\Steve\AppData\Local\Programs\Python\Python38\lib\tkinter\__init__.py", line 1883, in __call__
return self.func(*args)
File "C:\Users\Steve\AppData\Local\Programs\Python\Python38\lib\site-packages\matplotlib\backends\_backend_tk.py", line 293, in button_press_event
FigureCanvasBase.button_press_event(
File "C:\Users\Steve\AppData\Local\Programs\Python\Python38\lib\site-packages\matplotlib\backend_bases.py", line 1854, in button_press_event
self.callbacks.process(s, mouseevent)
File "C:\Users\Steve\AppData\Local\Programs\Python\Python38\lib\site-packages\matplotlib\cbook\__init__.py", line 229, in process
self.exception_handler(exc)
File "C:\Users\Steve\AppData\Local\Programs\Python\Python38\lib\site-packages\matplotlib\cbook\__init__.py", line 81, in _exception_printer
raise exc
File "C:\Users\Steve\AppData\Local\Programs\Python\Python38\lib\site-packages\matplotlib\cbook\__init__.py", line 224, in process
func(*args, **kwargs)
File "C:\Users\Steve\AppData\Local\Programs\Python\Python38\lib\site-packages\mpldatacursor\datacursor.py", line 718, in _select
self(new_event)
File "C:\Users\Steve\AppData\Local\Programs\Python\Python38\lib\site-packages\mpldatacursor\datacursor.py", line 235, in __call__
self._show_annotation_box(event)
File "C:\Users\Steve\AppData\Local\Programs\Python\Python38\lib\site-packages\mpldatacursor\datacursor.py", line 275, in _show_annotation_box
self.update(event, annotation)
File "C:\Users\Steve\AppData\Local\Programs\Python\Python38\lib\site-packages\mpldatacursor\datacursor.py", line 575, in update
annotation.set_text(self.formatter(**info))
File "C:\Users\Steve\AppData\Local\Programs\Python\Python38\lib\site-packages\mpldatacursor\datacursor.py", line 348, in _formatter
x = self._format_coord(x, ax.xaxis)
File "C:\Users\Steve\AppData\Local\Programs\Python\Python38\lib\site-packages\mpldatacursor\datacursor.py", line 413, in _format_coord
return formatter.pprint_val(x)
AttributeError: 'ScalarFormatter' object has no attribute 'pprint_val'
There is no "default backend", for what it's worth. If nothing is specified, matplotlib chooses based on the OS and what packages are available.
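A quick way to see which backend matplotlib actually picked is to ask it directly. This is a minimal sketch using matplotlib.get_backend(); the printed name depends on the OS and on which GUI toolkits are installed, so no particular output is assumed here.

```python
import matplotlib

# Ask matplotlib which backend it resolved to for this session.
# The result varies by machine (for example "TkAgg", "QtAgg", "MacOSX" or "agg").
backend = matplotlib.get_backend()
print("matplotlib backend in use:", backend)
```

If you want to rule out backend differences, one option is to select one explicitly before importing pyplot, for example with matplotlib.use("TkAgg") or the MPLBACKEND environment variable, assuming that toolkit is installed.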
From the stacktraces, it looks like you're using TkAgg. That's what I'm using locally, as well, but I can't reproduce the issue. The same code works perfectly with no lag, zoomed in or not. However, I don't have any way to access a Windows machine, so there could be some OS-specific things at play. I'll try to see if I can find a friend with access to Windows to try to reproduce.
FWIW, the error you posted is due to changes in the most recent version of matplotlib. It has been fixed in master.
Can't test on Mac due to the same bug. When will the fix be available through pip install?
I did pip uninstall and used setup.py install from master. The behavior is still the same as reported, on Windows with Python 3.8 and on Mac with Python 3.7. It's almost impossible to get a click to register once zoomed in, and the z values don't always reflect what the arrow is pointing at. Sometimes it gets into a state where it doesn't respond to any left clicks. Right click will make the box go away, but no amount of left clicks will bring up the box again at any zoom level.
@joferkington Have you been able to repro the responsiveness and the accuracy problems?
I still haven't been able to reproduce the issues you're seeing, unfortunately. I've gotten access to a Windows laptop and tested things there. Everything seems extremely responsive for me. However, for various reasons (not my machine and didn't want to install anything invasive), I haven't tested on Windows or MacOS with the latest version of matplotlib, which could explain the difference.
Windows is not necessary, as my repro is exactly the same on Mac. You're sure you're running my script as written from the command line and zooming to about that zoom level? And you get accurate values? I'm on matplotlib 3.3.2. Can any other packages affect it? Here are some package versions:
Requirement already satisfied: seaborn in c:\users\steve\appdata\local\programs\python\python38\lib\site-packages (0.11.0)
Requirement already satisfied: matplotlib>=2.2 in c:\users\steve\appdata\local\programs\python\python38\lib\site-packages (from seaborn) (3.3.2)
Requirement already satisfied: numpy>=1.15 in c:\users\steve\appdata\local\programs\python\python38\lib\site-packages (from seaborn) (1.19.1)
Requirement already satisfied: scipy>=1.0 in c:\users\steve\appdata\local\programs\python\python38\lib\site-packages (from seaborn) (1.5.2)
Requirement already satisfied: pandas>=0.23 in c:\users\steve\appdata\local\programs\python\python38\lib\site-packages (from seaborn) (1.1.2)
Requirement already satisfied: pillow>=6.2.0 in c:\users\steve\appdata\local\programs\python\python38\lib\site-packages (from matplotlib>=2.2->seaborn) (7.2.0)
Requirement already satisfied: python-dateutil>=2.1 in c:\users\steve\appdata\local\programs\python\python38\lib\site-packages (from matplotlib>=2.2->seaborn) (2.8.1)
Requirement already satisfied: certifi>=2020.06.20 in c:\users\steve\appdata\local\programs\python\python38\lib\site-packages (from matplotlib>=2.2->seaborn) (2020.6.20)
Requirement already satisfied: kiwisolver>=1.0.1 in c:\users\steve\appdata\local\programs\python\python38\lib\site-packages (from matplotlib>=2.2->seaborn) (1.2.0)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.3 in c:\users\steve\appdata\local\programs\python\python38\lib\site-packages (from matplotlib>=2.2->seaborn) (2.4.7)
Requirement already satisfied: cycler>=0.10 in c:\users\steve\appdata\local\programs\python\python38\lib\site-packages (from matplotlib>=2.2->seaborn) (0.10.0)
Requirement already satisfied: pytz>=2017.2 in c:\users\steve\appdata\local\programs\python\python38\lib\site-packages (from pandas>=0.23->seaborn) (2020.1)
Requirement already satisfied: six>=1.5 in c:\users\steve\appdata\local\programs\python\python38\lib\site-packages (from python-dateutil>=2.1->matplotlib>=2.2->seaborn) (1.15.0)
Yes, it's perfectly responsive for me with exactly that code, especially when zoomed in.
However, note that sns.heatmap plots things with pcolormesh, which means that matplotlib only considers clicks on the edges of cells as valid. That's an underlying matplotlib restriction.
As a result, if you click the center of a cell, it won't register.
If you plot with imshow (which is much better suited to your use case) instead of pcolormesh (which isn't meant for large regular arrays that can be displayed with imshow), clicking the centers of the cells would work fine.
To get a sense of what I'm talking about, try using imshow. Notice that it will trigger anywhere within the cell, not just at the edges:
import numpy as np
import matplotlib.pyplot as plt
from mpldatacursor import datacursor
fig, ax = plt.subplots(figsize=(15, 10))
arr = np.random.rand(200, 200)
ax.imshow(arr, cmap='cubehelix', interpolation='none')
datacursor()
plt.show()
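For anyone who wants the same orientation as the original sns.heatmap call, here is a rough sketch of an imshow equivalent. It assumes the goal is only to match the layout of np.flip(arr.transpose(), axis=0); the tick labels and colorbar styling that seaborn adds are not reproduced, and aspect="auto" is my own choice to fill the axes the way the heatmap does.

```python
import numpy as np
import matplotlib.pyplot as plt

arr = np.random.rand(200, 200)

# What the original script passed to sns.heatmap: transpose, then flip rows.
flipped = np.flip(arr.transpose(), axis=0)

fig, ax = plt.subplots(figsize=(15, 10))

# origin="lower" draws row 0 at the bottom, so the last row of arr.T ends up
# at the top, which is the same layout as drawing `flipped` from the top down.
im = ax.imshow(arr.T, origin="lower", cmap="cubehelix",
               interpolation="none", aspect="auto")
fig.colorbar(im, ax=ax)
```

Calling datacursor() and plt.show() afterwards, as in the sample above, should then pick anywhere inside a cell.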
OK, well, that's the issue. Your sample works well. sns "works" if I select very close to a pixel edge, but you have to have secret knowledge to know which edge pertains to which pixel (even if the arrow points to the right side of an edge, it will give you the left pixel's value). It seems dumb that matplotlib can't pick inside a mesh polygon.
Anyways, problem solved. Thanks.
@steel3d - For what it's worth, I've considered hacking around the limitations of pcolormesh (or, more accurately, the QuadMesh artist) interactions in the past so that logical cells (i.e. what you see) can be selected instead of edges.
It's not impossible, but it's better addressed with changes to matplotlib, rather than changes to mpldatacursor.
The last time I dug into it, it was easy to do for simple rectangular cases, but harder to do for the generic cases that QuadMesh supports. I don't know that I'll have time in the near future to look back into the issue, but I could give you or someone else an overview of what the changes might look like, if you wanted to tackle it or open a feature request ticket for matplotlib.
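To illustrate the simple rectangular case only, here is a hedged sketch of a click handler that works outside mpldatacursor. It assumes the layout that sns.heatmap produces, where cell (row, col) spans x in [col, col + 1) and y in [row, row + 1) in data coordinates. The names heatmap_cell_at and attach_cell_picker are hypothetical helpers made up for this example, not part of matplotlib, seaborn or mpldatacursor, and none of this handles the general QuadMesh case.

```python
import math


def heatmap_cell_at(x, y, shape):
    """Return (row, col) of the unit cell containing (x, y), or None.

    Assumes a regular grid with unit-sized cells starting at the origin,
    as drawn by sns.heatmap (hypothetical helper for illustration).
    """
    if x is None or y is None:
        return None  # click landed outside any axes
    n_rows, n_cols = shape
    row, col = math.floor(y), math.floor(x)
    if 0 <= row < n_rows and 0 <= col < n_cols:
        return row, col
    return None


def attach_cell_picker(ax, data):
    """Print the cell under each mouse click on `ax` (hypothetical helper)."""
    def on_click(event):
        if event.inaxes is not ax:
            return
        cell = heatmap_cell_at(event.xdata, event.ydata, data.shape)
        if cell is not None:
            row, col = cell
            print(f"row={row}, col={col}, z={data[row, col]:.3f}")

    # mpl_connect returns a connection id that can be passed to mpl_disconnect.
    return ax.figure.canvas.mpl_connect("button_press_event", on_click)
```

With the original test script, this would be used as attach_cell_picker(ax, np.flip(arr.transpose(), axis=0)) before plt.show(), passing the same array that was given to sns.heatmap.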
Sorry, I don't even have time to work on my own stuff, not to mention stuff like this :) Hopefully at least this will save others some time, now that it's a known issue.