nbsphinx
Convert directly to docutils without intermediate RST
Currently there are some markdown formatting difficulties that can arise when converting to HTML. For example, a link like [`nbsphinx`](https://github.com/spatialaudio/nbsphinx), which should render as the code-formatted link text nbsphinx (cf. nbsphinx), won't be displayed properly because this text can't easily be represented in the intermediate RST.
Since sphinx can easily produce documents directly from markdown (e.g. using recommonmark), it would be nice if it were possible to skip the RST altogether, as this would more reliably reproduce the desired formatting from the notebook. Currently I am using nbconvert to get markdown output and then producing HTML with Sphinx instead of using nbsphinx because of such formatting issues.
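For readers unfamiliar with that workaround, here is a rough sketch of what it might look like. The notebook is first exported with `jupyter nbconvert --to markdown notebook.ipynb` (the file name is a placeholder), and Sphinx is then told to read `.md` files through recommonmark. The settings below are an assumption about a typical setup (recommonmark 0.5 or later with Sphinx 1.8 or later), not the configuration actually used in this report; older versions were configured differently.

```python
# conf.py (Sphinx configuration): a minimal sketch, not the reporter's actual file.
# Assumes recommonmark >= 0.5 and Sphinx >= 1.8.

# Register recommonmark so that Sphinx can parse Markdown sources.
extensions = ["recommonmark"]

# Map file suffixes to parsers: .rst stays reStructuredText, .md becomes Markdown.
source_suffix = {
    ".rst": "restructuredtext",
    ".md": "markdown",
}
```

With this in place, the Markdown files produced by nbconvert can be listed in a `toctree` like any other source file.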
cc: @fperez @takluyver
Thanks @bnaul. Also pinging @willingc, who works quite a bit on our own docs.
Thanks @takluyver. @bnaul If you have a sample notebook, I would be happy to look at it. In general, there are pros and cons with different conversion workflows. FWIW, we've been using recommonmark and nbsphinx with good success for the past few months on Jupyter projects. Also pinging @mgeier for input.
@bnaul Another thing to look at is the exact versions that you are using in Sphinx, nbsphinx, and recommonmark. Certain combinations have more issues than others.
To clarify, Fernando and I were just talking to Brett in person, and asked him to open this issue. While there might be workarounds for any specific case, that's a game of whack-a-mole: the non-code parts of notebooks are markdown, so translating to rst before loading notebooks into Sphinx always runs the risk of messing something up.
We suggested that maybe nbsphinx should depend on recommonmark or something similar, and cut the rst stage out of the loading process. I may be able to spare some time to look into this, though I don't know much (anything?) about the Sphinx API.
Ahh...thanks for the context @takluyver. I would be happy to have a direct conversion from markdown in notebook cells to HTML. The fewer conversions, the better.
Thanks @bnaul for bringing this up and thanks all for the comments!
Quite some time ago, I read that nested inline markup is not possible with docutils. I thought that this was a general limitation of docutils, but as it turns out, it is actually only a limitation of the reST parser!
A few weeks ago, @takluyver already mentioned the possibility of using recommonmark: https://github.com/jupyter/nbconvert/pull/222#issuecomment-184665682. At that time I thought this would only be an implementation detail, but as it turns out, using recommonmark may potentially overcome some fundamental problems found in the current implementation.
I started with the intermediate rst stage because it was the only way I could get started at all, but from my standpoint today, I think it would be better to drop the intermediate rst and parse directly into the docutils document structure.
Experience may show that switching to recommonmark does more harm than good, but we'll have to try it to find out if that's true.
This will be a lot of work. @takluyver, if you are eager to tackle that, feel free to do so. If not, I can give it a try, but it will take a considerable amount of time ...
I wouldn't quite say I'm eager, but I will look into it ;-)
I had another look at the title of this Issue: "Allow conversion via markdown instead of RST" ...
@bnaul If that's literally what you mean, I'm strongly against it!
I'd rather want something like "Convert directly to docutils without intermediate RST".
In this case, nbsphinx would walk the notebook cells, "manually" convert code cells to the docutils representation and use a library like recommonmark to convert Markdown cells.
@takluyver Was that how you understood it?
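The cell walk proposed above can be sketched in a few lines. This is only an illustration of the dispatch idea, using the standard library alone: the `Node` class is a hypothetical stand-in for docutils nodes, `plain_markdown` is a placeholder where a real Markdown parser such as recommonmark would plug in, and none of the names below come from nbsphinx.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """Hypothetical stand-in for a docutils node (not the docutils API)."""
    kind: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def plain_markdown(source: str) -> List[Node]:
    """Placeholder converter: one paragraph node per blank-line-separated block.

    A real implementation would hand the text to a Markdown parser here.
    """
    blocks = [block.strip() for block in source.split("\n\n")]
    return [Node("paragraph", block) for block in blocks if block]


def cell_source(cell: dict) -> str:
    """Notebook JSON stores a cell's source as a string or a list of lines."""
    source = cell.get("source", "")
    return "".join(source) if isinstance(source, list) else source


def notebook_to_tree(
    notebook: dict,
    convert_markdown: Callable[[str], List[Node]] = plain_markdown,
) -> Node:
    """Walk the cells and build a document tree without an intermediate RST string."""
    document = Node("document")
    for cell in notebook.get("cells", []):
        source = cell_source(cell)
        cell_type = cell.get("cell_type")
        if cell_type == "code":
            # Code cells are converted "manually" into a literal block.
            document.children.append(Node("literal_block", source))
        elif cell_type == "markdown":
            # Markdown cells are delegated to the pluggable converter.
            document.children.extend(convert_markdown(source))
        # Other cell types (e.g. raw cells) are skipped in this sketch.
    return document


# A tiny in-memory notebook, shaped like the "cells" list of an .ipynb file.
example_notebook = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Title\n", "\n", "Some *text*."]},
        {"cell_type": "code", "source": "print(1 + 1)"},
    ]
}
example_tree = notebook_to_tree(example_notebook)
```

Keeping the Markdown converter as a parameter means the parser can be swapped out later without touching the cell walk itself.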
@mgeier sure, that's a good description; the main issue in my mind is just allowing markdown to be rendered as faithfully as possible.
Yep, loading it as directly as possible into the docutils representation would clearly be ideal.
I noticed today that including HTML links (<a href="url">txt</a>) in Markdown fails to render the link, as Pandoc doesn't preserve HTML links when converting Markdown to RST. This could be considered a bug in Pandoc, but it's also a +1 in favor of this issue :)
Getting tables and math to work might be hard.
There's at least a rudimentary math extension for recommonmark, but I didn't find any documentation about tables.
It's probably less nerve-racking to write a new (and extensible) CommonMark parser than to try to fumble all this into recommonmark ...