LaTeX in pandas DataFrame does not render

Open kyleniemeyer opened this issue 4 years ago • 6 comments

Describe the problem

Pandas DataFrames support LaTeX formatting in column labels, and Jupyter Notebook renders it properly, but Jupyter Book does not seem to render it in the final HTML.

I've tried using various string formats, including $$f$$, $f$, r$f$, but none of those seem to work. Is there perhaps some additional option/argument I am missing?

Link to your repository or website

https://kyleniemeyer.github.io/gas-dynamics-notes/compressible-flows/isentropic.html#isentropic-relations

Steps to reproduce

  1. Create a Pandas DataFrame with LaTeX formatting in a column label
  2. Build the Jupyter Book
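A minimal notebook cell for step 1 might look like the sketch below. The column labels are borrowed from the isentropic-relations table on the linked page; the single data row is only illustrative, and `table_html` is a name introduced here so the raw markup can be inspected.

```python
import pandas as pd

# Column labels copied from the linked isentropic-relations table;
# the single row of values is illustrative only.
df = pd.DataFrame(
    [[0.0, 1.0, 1.0, float("inf")]],
    columns=["$M$", "$p/p_t$", "$T/T_t$", "$A/A^*$"],
)

# The dollar-delimited labels pass through to the HTML unchanged,
# so rendering them is left entirely to the page's math setup.
table_html = df.to_html()
print(table_html)
```

In a notebook, ending the cell with `df` instead of the `print` call shows the rendered table.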

The version of Python you're using

No response

Your operating system

No response

Versions of your packages

Jupyter Book      : 0.11.3
External ToC      : 0.2.3
MyST-Parser       : 0.13.7
MyST-NB           : 0.12.3
Sphinx Book Theme : 0.1.5
Jupyter-Cache     : 0.4.3
NbClient          : 0.5.4

Additional context

No response

kyleniemeyer, Oct 12 '21 18:10

Thanks for reporting this, @kyleniemeyer.

@AakashGfude @choldgraf @chrisjsewell might this be a theme issue? Looking at the HTML from the example site above, should MathJax be parsing anything surrounded by $?

<div class="output text_html"><style type="text/css">
</style>
<table id="T_49793_">
  <thead>
    <tr>
      <th class="col_heading level0 col0">$M$</th>
      <th class="col_heading level0 col1">$p/p_t$</th>
      <th class="col_heading level0 col2">$T/T_t$</th>
      <th class="col_heading level0 col3">$A/A^*$</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td id="T_49793_row0_col0" class="data row0 col0">0.00</td>
      <td id="T_49793_row0_col1" class="data row0 col1">1.000000</td>
      <td id="T_49793_row0_col2" class="data row0 col2">1.000000</td>
      <td id="T_49793_row0_col3" class="data row0 col3">inf</td>
    </tr>
...
</table>
</div>
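One way to check a built page for this symptom is to scan its table cells for leftover dollar-delimited text. The sketch below uses only the standard library; `CellMathFinder` and `find_unrendered_math` are hypothetical names introduced here. It assumes that correctly processed math would not appear as bare `$...$` inside a `<th>` or `<td>` cell.

```python
import re
from html.parser import HTMLParser

# A dollar-delimited span with no further dollar sign inside it.
DOLLAR_MATH = re.compile(r"\$[^$]+\$")


class CellMathFinder(HTMLParser):
    """Collect the text of <th>/<td> cells that still contain $...$."""

    def __init__(self):
        super().__init__()
        self._in_cell = False
        self._text = []
        self.unrendered = []

    def handle_starttag(self, tag, attrs):
        if tag in ("th", "td"):
            self._in_cell = True
            self._text = []

    def handle_data(self, data):
        if self._in_cell:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._in_cell:
            text = "".join(self._text).strip()
            if DOLLAR_MATH.search(text):
                self.unrendered.append(text)
            self._in_cell = False


def find_unrendered_math(page_html):
    """Return the cell texts in page_html that still hold raw $...$ math."""
    finder = CellMathFinder()
    finder.feed(page_html)
    finder.close()
    return finder.unrendered
```

Feeding it the saved HTML of a built page returns a list of the affected cell texts, which is empty when no cell contains raw dollar math.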

mmcky, Oct 15 '21 04:10

In case someone else runs into this, I did come up with a workaround, although it isn’t especially pretty. The idea is to force MyST-Parser to convert all the dollar-delimited math into pure-HTML-formatted math. The result isn’t as nice as MathJax formatting, but it is already much better than leaving TeX code in my tables.

It should be possible to tweak this to convert to MathJax math, although that might require more complex configuration of the parser.

Anyway, usage is pretty simple. If your dataframe is named df, then calling

display_dataframe_with_math(df)

in a Jupyter Notebook will create an output cell with a rendered dataframe, which also displays correctly when you build the Jupyter Book.

To use this you will need to stick the following definitions somewhere:

def display_dataframe_with_math(df, raw=False):
    import re
    from IPython.display import HTML
    
    html = df.to_html()    
    raw_html = re.sub(r"\$.*?\$", lambda m: convert_tex_to_html(m[0], raw=True), html)
    return raw_html if raw else HTML(raw_html)

def convert_tex_to_html(html, raw=False):
    import io
    import os
    import re
    import pandas as pd
    from textwrap import dedent
    from tempfile import NamedTemporaryFile
    from contextlib import redirect_stdout
    from IPython.display import HTML
    from myst_parser.parsers.docutils_ import cli_html
    
    # Manually apply the MyST parser to convert $...$ math into HTML-formatted math
    frontmatter="""
    ---
    myst:
      enable_extensions: [dollarmath, amsmath]
    ---
    """
    with NamedTemporaryFile('w', delete=False) as f:
        f.write(dedent(frontmatter).strip())
        f.write("\n\n")
        f.write(html)
    with redirect_stdout(io.StringIO()) as sf:
        cli_html([f.name])
    fullhtml = sf.getvalue()  # Returns a large-ish HTML with the full styling header
    os.remove(f.name)
    # Strip HTML headers to keep only the body with converted math
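    m = re.search(r'<body>\n<div class="document">([\s\S]*)</div>\n</body>', fullhtml)
    if m is None:  # Unexpected writer output: fail with a clear message
        raise ValueError("Could not locate the document body in the MyST HTML output")
    raw_html = m[1].strip()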
    # Special case: if we provided a snippet with no HTML markup at all, don’t wrap the result
    # in <p> tags
    if "\n" not in html and "<" not in html:
        m = re.match(r"<p>(.*)</p>", raw_html)
        if m:  # Match was a success: there are <p> tags we can remove
            raw_html = m[1]
    # Manually display the result as HTML
    return raw_html if raw else HTML(raw_html)
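For the MathJax variant mentioned above, one possible starting point is to skip the parser and wrap each dollar-delimited span in the markup Sphinx normally emits for inline math. This is a sketch under assumptions, not a confirmed fix. It assumes the built page's MathJax configuration processes elements with the `math` class and recognises `\(...\)` delimiters. The name `wrap_dollar_math` is hypothetical.

```python
import re

# One inline $...$ span with no dollar sign or tag boundary inside it.
DOLLAR_MATH = re.compile(r"\$([^$<>]+)\$")


def wrap_dollar_math(table_html):
    """Wrap each $...$ span in Sphinx-style inline-math markup.

    Assumption: MathJax on the built page picks up elements with the
    "math" class and treats \\(...\\) as inline delimiters.
    """
    def repl(match):
        tex = match.group(1)
        # A function replacement keeps the backslashes literal.
        return '<span class="math notranslate nohighlight">\\(' + tex + '\\)</span>'

    return DOLLAR_MATH.sub(repl, table_html)
```

In a notebook this would be used as `HTML(wrap_dollar_math(df.to_html()))`. Note that the simple pattern also matches non-math text lying between two dollar signs, such as currency amounts, so it only suits tables where `$` is used for math alone.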

alcrene, Sep 15 '23 14:09

@mmcky do you know if there has been any progress on this issue? We encountered it recently and were wondering if there was a fix, as we would very much like to use pandas DataFrames.

RemDelaporteMathurin, Jul 12 '24 11:07

We have run into a similar problem. Are there any updates on this issue? Thanks.

shenvitor, Jul 25 '24 09:07

> @mmcky do you know if there has been any progress on this issue? We encountered it recently and were wondering if there was a fix, as we would very much like to use pandas DataFrames.

No fix to this that I know of yet. I've been snowed under with other work. @rowanc1 do you know if this is an issue in the mystmd ecosystem?

mmcky, Jul 26 '24 06:07

This seems to be an upstream problem, in myst-nb or even higher up. One hack is to use DataFrame.to_markdown() and Markdown:

from IPython.display import Markdown

Markdown(df.to_markdown())

You will need to set nb_render_markdown_format as well, though, either in _config.yml:

sphinx:
  config:
    nb_render_markdown_format: myst

or in conf.py:

myst_enable_extensions = [
    "dollarmath",
]
nb_render_markdown_format = "myst"

A minimal test repo can be found here: https://github.com/redeboer/test-sphinx-pandas-dataframe-rendering

redeboer, Aug 21 '24 09:08