nbsphinx Convert directly to docutils without intermediate RST

Currently there are some markdown formatting difficulties that can arise when converting to HTML. For example, a link like [`nbsphinx`](https://github.com/spatialaudio/nbsphinx), which should render as a link with code-formatted text (cf. the plain link nbsphinx), won't be displayed properly because this text can't easily be represented in the intermediate RST.

Since Sphinx can easily produce documents directly from markdown (e.g. using recommonmark), it would be nice to be able to skip the RST altogether, as this would more reliably reproduce the desired formatting from the notebook. Because of such formatting issues, I am currently using nbconvert to get markdown output and then producing HTML with Sphinx instead of using nbsphinx.
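For reference, a minimal sketch of what such a workaround setup might look like in `conf.py`. It assumes Sphinx and recommonmark versions that still support the `source_parsers` setting and that accept a fully qualified class name as a string; the notebook file name in the comment is only a placeholder.

```python
# conf.py (sketch, not a complete Sphinx configuration)

# Notebooks are exported to Markdown outside of Sphinx first, e.g.:
#   jupyter nbconvert --to markdown notebook.ipynb
# ("notebook.ipynb" is a placeholder name.)

# Map the .md suffix to recommonmark's CommonMark parser.
# Assumption: the installed Sphinx still honors `source_parsers`.
source_parsers = {
    '.md': 'recommonmark.parser.CommonMarkParser',
}

# Let Sphinx pick up both reST and Markdown source files.
source_suffix = ['.rst', '.md']
```

With this in place the Markdown goes straight to the docutils tree through recommonmark, so no RST is generated along the way.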

cc: @fperez @takluyver

Apr 05 '16 23:04 bnaul

Thanks @bnaul. Also pinging @willingc, who works quite a bit on our own docs.

Apr 06 '16 00:04 takluyver

Thanks @takluyver. @bnaul If you have a sample notebook, I would be happy to look at it. In general, there are pros and cons with different conversion workflows. FWIW, we've been using recommonmark and nbsphinx with good success for the past few months on Jupyter projects. Also pinging @mgeier for input.

Apr 06 '16 00:04 willingc

@bnaul Another thing to look at is the exact versions that you are using in Sphinx, nbsphinx, and recommonmark. Certain combinations have more issues than others.

Apr 06 '16 00:04 willingc

To clarify, Fernando and I were just talking to Brett in person, and asked him to open this issue. While there might be workarounds for any specific case, that's a game of whack-a-mole: the non-code parts of notebooks are markdown, so translating to rst before loading notebooks into Sphinx always runs the risk of messing something up.

We suggested that maybe nbsphinx should depend on recommonmark or something similar, and cut the rst stage out of the loading process. I may be able to spare some time to look into this, though I don't know much (anything?) about the Sphinx API.

Apr 06 '16 00:04 takluyver

Ahh...thanks for the context @takluyver. I would be happy to have a direct conversion from markdown in notebook cells to HTML. The fewer conversions the better.

Apr 06 '16 01:04 willingc

Thanks @bnaul for bringing this up and thanks all for the comments!

Quite some time ago, I read that nested inline markup is not possible with docutils. I thought that this was a general limitation of docutils, but as it turns out, it is actually only a limitation of the reST parser!

A few weeks ago, @takluyver already mentioned the possibility of using recommonmark: https://github.com/jupyter/nbconvert/pull/222#issuecomment-184665682. At that time I thought this would only be an implementation detail, but as it turns out, using recommonmark may potentially overcome some fundamental problems found in the current implementation.

I started with the intermediate rst stage because it was the only way I could get started at all, but from my standpoint today, I think it would be better to drop the intermediate rst and parse directly into the docutils document structure.

Experience may show that switching to recommonmark does more harm than good, but we'll have to try it to find out whether that's true.

This will be a lot of work. @takluyver, if you are eager to tackle that, feel free to do so. If not, I can give it a try, but it will take a considerable amount of time ...

Apr 06 '16 18:04 mgeier

I wouldn't quite say I'm eager, but I will look into it ;-)

Apr 06 '16 18:04 takluyver

I had another look at the title of this Issue: "Allow conversion via markdown instead of RST" ...

@bnaul If that's literally what you mean, I'm strongly against it!

I'd rather have something like "Convert directly to docutils without intermediate RST".

In this case, nbsphinx would walk the notebook cells, "manually" convert code cells to the docutils representation and use a library like recommonmark to convert Markdown cells.
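A rough sketch of that cell-walking idea, using only the standard library. The `Node` class is a hypothetical stand-in for docutils nodes (it is not the docutils API), and `placeholder_markdown_parser` marks the spot where a library like recommonmark would be called; neither reflects how nbsphinx is actually implemented.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """Hypothetical stand-in for a docutils node (not the real docutils API)."""
    kind: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def cell_source(cell: dict) -> str:
    # In the notebook JSON format, "source" is a string or a list of lines.
    src = cell.get("source", "")
    return "".join(src) if isinstance(src, list) else src


def placeholder_markdown_parser(text: str) -> List[Node]:
    # Assumption: a real implementation would hand the text to a
    # Markdown-to-docutils parser here; this stub just keeps the raw text.
    return [Node("markdown", text)]


def notebook_to_tree(
    nb: dict,
    parse_markdown: Callable[[str], List[Node]] = placeholder_markdown_parser,
) -> Node:
    root = Node("document")
    for cell in nb.get("cells", []):
        source = cell_source(cell)
        cell_type = cell.get("cell_type")
        if cell_type == "code":
            # Code cells are converted "manually": no markup parsing needed.
            root.children.append(Node("literal_block", source))
        elif cell_type == "markdown":
            # Markdown cells are delegated to the pluggable parser.
            root.children.extend(parse_markdown(source))
        # Other cell types (e.g. raw cells) are ignored in this sketch.
    return root


example = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Title\n", "Some *text*."]},
        {"cell_type": "code", "source": "print(1)"},
    ]
}
tree = notebook_to_tree(example)
print([child.kind for child in tree.children])
```

The point of the split is that no cell content passes through reST at any stage: code cells never need a markup parser, and Markdown cells are parsed only once.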

@takluyver Was that how you understood it?

Apr 07 '16 12:04 mgeier

@mgeier sure, that's a good description; the main issue in my mind is just allowing markdown to be rendered as faithfully as possible.

Apr 07 '16 16:04 bnaul

Yep, loading it as directly as possible into the docutils representation would clearly be ideal.

Apr 07 '16 16:04 takluyver

I noticed today that HTML links (`<a href="url">txt</a>`) in Markdown fail to render, as Pandoc doesn't preserve HTML links when converting Markdown to RST. This could be considered a bug in Pandoc, but it's also a +1 in favor of this issue :)

Apr 12 '16 22:04 tbekolay

Getting tables and math to work might be hard.

There's at least a rudimentary math extension for recommonmark, but I didn't find any documentation about tables.

It's probably less nerve-racking to write a new (and extensible) CommonMark parser than to try to fumble all this into recommonmark ...

Apr 27 '16 15:04 mgeier
