DataFrame.plot doesn't handle axis titles for sharex=row correctly

Open Austrianguy opened this issue 7 years ago • 3 comments

Code Sample, a copy-pastable example if possible

import matplotlib.pyplot as plt
import pandas as pd

_, axs = plt.subplots(2, 3, sharex='row')

for ax in axs.flatten():
    ax.plot(range(5))
    ax.set_xlabel('x-axis title')

ax = axs[1, 0]
data = [2]*5
pd.DataFrame(data).plot(ax=ax) # removes xaxis and yaxis labels except those on the sides of the grid.

plt.tight_layout() # so xaxis label can't hide beneath second plot

Problem description

When axes are shared across subplots only by row or only by column, the pandas DataFrame plotting method hides every axis title that is not in the leftmost column or the bottom row. This is especially confusing because a single pandas plot call changes the labels across the entire subplot grid, not just on the Axes it draws to. In contrast, the Matplotlib (Axes.plot) and Pyplot (plt.plot) plotting methods don't try to be smart with axis labels at all and leave them where they are.

The problem occurs for all four combinations of sharex, sharey, 'row' and 'col'.
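To check this on a particular install, a script along the following lines can report which axis labels end up hidden for each combination. It is only a sketch: the helper name hidden_labels and the 2x3 grid are illustrative choices, not part of pandas or Matplotlib, and the result depends on the installed versions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def hidden_labels(share_kwarg, mode):
    """Return (row, col, axis) entries whose axis label is hidden
    after a single DataFrame.plot call on one Axes of a 2x3 grid.

    hidden_labels is a hypothetical helper written for this report.
    """
    fig, axs = plt.subplots(2, 3, **{share_kwarg: mode})
    for ax in axs.flat:
        ax.plot(range(5))
        ax.set_xlabel("x-axis title")
        ax.set_ylabel("y-axis title")

    # One pandas plot call on the lower-left Axes, as in the code sample.
    pd.DataFrame([2] * 5).plot(ax=axs[1, 0])

    hidden = []
    for (row, col), ax in np.ndenumerate(axs):
        if not ax.xaxis.label.get_visible():
            hidden.append((row, col, "x"))
        if not ax.yaxis.label.get_visible():
            hidden.append((row, col, "y"))
    plt.close(fig)
    return hidden


# Cover the four combinations of sharex/sharey with 'row'/'col'.
report = {
    (kwarg, mode): hidden_labels(kwarg, mode)
    for kwarg in ("sharex", "sharey")
    for mode in ("row", "col")
}
for combo, hidden in report.items():
    print(combo, hidden)
```

An empty list for a combination would mean that no labels were hidden for it on that install.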

A workaround is to iterate through the axes and call ax.xaxis.label.set_visible(True) on each one (and likewise ax.yaxis.label.set_visible(True) for y-axis titles). This has to be done after the last pandas plotting call on any Axes contained in the Figure.
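A minimal sketch of that workaround, applied to the same 2x3 grid as the code sample, might look like this. The y-axis titles and the Agg backend are additions for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd

fig, axs = plt.subplots(2, 3, sharex="row")

for ax in axs.flat:
    ax.plot(range(5))
    ax.set_xlabel("x-axis title")
    ax.set_ylabel("y-axis title")

# The pandas call that may hide axis titles elsewhere in the grid.
pd.DataFrame([2] * 5).plot(ax=axs[1, 0])

# Workaround: after the LAST pandas plotting call on this Figure,
# make every axis title visible again.
for ax in axs.flat:
    ax.xaxis.label.set_visible(True)
    ax.yaxis.label.set_visible(True)

fig.tight_layout()
```

If a later pandas plot call touches any Axes in the same Figure, the loop has to be run again afterwards.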

Expected Output

Three options:

  1. Do not modify axis labels, to maintain consistency with matplotlib.axes.Axes.plot() and matplotlib.pyplot.plot(), i.e. keep axis labels on all plots.

  2. Only remove x-axis labels on axes shared with an axes beneath. Only remove y-axis labels on axes shared with an axes to the left.
     a) Do this for all axes in the subplot.
     b) Only modify axis labels on axes that pandas actually plots to. (Sounds like a nightmare to implement.)
     c) Include a keyword argument in pandas.DataFrame.plot that allows the user to control whether or not axis labels are modified.
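To make option 2 a) concrete, here is a rough sketch of the rule written against plain Matplotlib. It is not pandas code: the helper name hide_labels_shared_with_neighbor is hypothetical, and it assumes a 2-D array of Axes as returned by plt.subplots.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt


def hide_labels_shared_with_neighbor(axs):
    """Hide an x-axis title only if the Axes shares x with the Axes beneath it,
    and a y-axis title only if it shares y with the Axes to its left.

    Hypothetical helper illustrating option 2 a); axs must be a 2-D array.
    """
    nrows, ncols = axs.shape
    for row in range(nrows):
        for col in range(ncols):
            ax = axs[row, col]
            if row + 1 < nrows:
                below = axs[row + 1, col]
                if ax.get_shared_x_axes().joined(ax, below):
                    ax.xaxis.label.set_visible(False)
            if col > 0:
                left = axs[row, col - 1]
                if ax.get_shared_y_axes().joined(ax, left):
                    ax.yaxis.label.set_visible(False)


# sharex='row': no Axes shares x with the one beneath, so nothing is hidden.
fig_row, axs_row = plt.subplots(2, 3, sharex="row")
hide_labels_shared_with_neighbor(axs_row)

# sharex='col': each top-row Axes shares x with the one beneath, so only
# the top-row x-axis titles are hidden.
fig_col, axs_col = plt.subplots(2, 3, sharex="col")
hide_labels_shared_with_neighbor(axs_col)
```

Under this rule, the sharex='row' grid from the code sample would keep all of its x-axis titles.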

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.14.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 45 Stepping 7, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en
LOCALE: None.None

pandas: 0.23.0
pytest: 3.5.0
pip: 9.0.3
setuptools: 39.0.1
Cython: 0.28.2
numpy: 1.14.2
scipy: 1.0.1
pyarrow: None
xarray: None
IPython: 5.6.0
sphinx: 1.7.2
patsy: 0.5.0
dateutil: 2.7.2
pytz: 2018.4
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.4
feather: None
matplotlib: 2.2.2
openpyxl: 2.5.2
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.4
lxml: 4.2.1
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.2.6
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

Austrianguy, Jun 01 '18 23:06

An earlier change to this feature didn't take into account that 'row' and 'col' can be passed for sharex and sharey. https://github.com/pandas-dev/pandas/issues/9737

Austrianguy, Jun 02 '18 00:06

IMO, option 1 seems best.

cc @TomAugspurger

gfyoung, Jun 06 '18 07:06

Specifically, looping through the axes after all other operations on them have finished seems to work:

# assuming `ax` is a 2-D array of Axes with 3 columns and 6 rows
for a in ax.flat:
    # turn the bottom tick labels back on (labelsize=7 also sets their font size)
    a.tick_params(axis='both', which='both', labelsize=7, labelbottom=True)
    # make the x-axis title visible again
    a.xaxis.label.set_visible(True)

davidshumway, Sep 18 '22 21:09