BUG: `to_latex()` does not handle braces in new headers gracefully

Open GregersSR opened this issue 1 year ago • 1 comments

Pandas version checks

  • [X] I have checked that this issue has not already been reported.

  • [X] I have confirmed this bug exists on the latest version of pandas.

  • [X] I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd

df = pd.DataFrame({"a": [1, 2]})
result = df.to_latex(header=[r"$\bar{y}$"])

Issue Description

This gives a KeyError. I looked into the reason, and it happens because at some point r"$\bar{y}$".format(x) is called, and the braces are interpreted as string interpolation.
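The failure can be reproduced without pandas. This is a minimal sketch that assumes, as described above, that the header string ends up as the receiver of a `str.format` call with one positional argument. The argument value `"a"` is only a stand-in for whatever pandas actually passes.

```python
# The header from the reproducible example, with single braces.
label = r"$\bar{y}$"

try:
    # str.format parses "{y}" as a replacement field named "y".
    # No keyword argument "y" is supplied, so the lookup fails.
    label.format("a")  # "a" is a placeholder argument (assumption)
    missing_field = None
except KeyError as exc:
    missing_field = exc.args[0]

print(f"KeyError raised for field: {missing_field!r}")
```

The key in the exception is the text between the braces, which is why the message gives little hint that the header formatting is the cause.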

Expected Behavior

I expect one of the following:

  1. An error message saying that single braces should be escaped using {{
  2. Documentation stating the same thing
  3. That the single brace is not interpreted as string interpolation

Installed Versions

pd.show_versions()

INSTALLED VERSIONS

commit : 2a10e04a099d5f1633abcdfbb2dd9fdf09142f8d
python : 3.10.15
python-bits : 64
OS : Linux
OS-release : 6.8.0-45-generic
Version : #45-Ubuntu SMP PREEMPT_DYNAMIC Fri Aug 30 12:02:04 UTC 2024
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 3.0.0.dev0+1579.g2a10e04a09
numpy : 1.26.4
dateutil : 2.9.0
pip : 24.2
Cython : 3.0.11
sphinx : 8.1.3
IPython : 8.28.0
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.12.3
blosc : None
bottleneck : 1.4.1
fastparquet : 2024.5.0
fsspec : 2024.9.0
html5lib : 1.1
hypothesis : 6.115.2
gcsfs : 2024.9.0post1
jinja2 : 3.1.4
lxml.etree : 5.3.0
matplotlib : 3.9.2
numba : 0.60.0
numexpr : 2.10.1
odfpy : None
openpyxl : 3.1.5
psycopg2 : 2.9.9
pymysql : 1.4.6
pyarrow : 17.0.0
pyreadstat : 1.2.7
pytest : 8.3.3
python-calamine : None
pytz : 2024.2
pyxlsb : 1.0.10
s3fs : 2024.9.0
scipy : 1.14.1
sqlalchemy : 2.0.36
tables : 3.10.1
tabulate : 0.9.0
xarray : 2024.9.0
xlrd : 2.0.1
xlsxwriter : 3.2.0
zstandard : 0.23.0
tzdata : 2024.2
qtpy : None
pyqt5 : None

GregersSR, Oct 17 '24 09:10

The versions included above are from a development environment. I would like to contribute a PR for this, but I want to know if you prefer a change of behavior or a change of documentation. The issue is also reproducible in a Conda-packaged version, which is where I first encountered it. Let me know if you would like version details for that environment.

GregersSR, Oct 17 '24 09:10

take

timapage, Oct 20 '24 19:10

This should fix your problem:

import pandas as pd

df = pd.DataFrame({"a": [1, 2]})
s = df.style
# Double the braces so they come out as literal { and } in the label.
s.relabel_index([r"$\bar{{y}}$"], axis="columns")
print(s.to_latex())


\begin{tabular}{lr}
 & $\bar{y}$ \\
0 & 1 \\
1 & 2 \\
\end{tabular}

I agree the error message could be better, but unescaped braces are unlikely to be the only cause of this error. Changing the message could therefore be erroneous in its own right, because it may mislead users whose error has a different underlying cause.

Documenting the issue is probably the best route, I suspect.

attack68, Oct 21 '24 06:10

That makes sense. I had also figured out that using df.rename(...) was a workaround. I will open a documentation MR and add that any use of { or } must be escaped by {{ or }}.
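For headers built programmatically, the escaping rule can be applied with a small helper. This is only a sketch: `escape_braces` is a hypothetical name, not a pandas function, and it assumes every brace in the label is meant literally.

```python
def escape_braces(label: str) -> str:
    """Double every brace so str.format treats it as a literal character.

    Hypothetical helper for illustration; not part of pandas.
    """
    return label.replace("{", "{{").replace("}", "}}")


raw_header = r"$\bar{y}$"
safe_header = escape_braces(raw_header)

# Simulate the formatting step the header goes through (assumed to be
# a str.format call with one positional argument, per the issue text).
rendered = safe_header.format("a")

print(safe_header)
print(rendered)
```

After the formatting step the doubled braces collapse back to single ones, so the rendered label matches the intended LaTeX. The helper should not be used on labels that deliberately contain a `{}` placeholder, since it would escape that as well.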

GregersSR, Oct 21 '24 08:10

I've encountered a similar issue using Styler.relabel_index. The documentation actually mentions "using curly brackets (or double curly brackets if the string if pre-formatted)", but I agree it could be more clear.
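The distinction in that sentence of the documentation can be illustrated with plain Python. The sketch below assumes the label is passed through `str.format` with the existing label as the single positional argument. The variable names and the `"a"` value are illustrative only.

```python
existing_label = "a"  # stand-in for the label being replaced (assumption)

# Plain string: "{}" is a placeholder filled in by the later format call.
template = "col_{}"
from_template = template.format(existing_label)

# Pre-formatted string: the f-string consumes one level of braces first,
# so the placeholder must be written as "{{}}" to survive as "{}".
prefix = "col"
pre_formatted = f"{prefix}_{{}}"
from_pre_formatted = pre_formatted.format(existing_label)

print(from_template)
print(pre_formatted)
print(from_pre_formatted)
```

Each formatting pass removes one level of braces. A literal brace therefore has to be doubled once in a plain label and twice in an f-string.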

yuanx749, Oct 21 '24 10:10