"Content sourcemaps"
The user story isn't crystal clear yet but something I think would be super engaging is to be able to refer to the source when you're reading something.
Suppose you're reading the "Usage notes" on the <video> page and you spot a typo, or you found a bug in a code example, then it would be great to know exactly where the content came from. Some people might want just the file path of the .md file and some might just prefer to go straight to the HTML URL in GitHub.
I'm referring to what Readthedocs does with its "Edit in GitHub" link, for example.
It's an opportunity to engage new contributors but it also needs to be powerful enough for core contributors.
At the moment, technically the renderer has everything it needs. The CLI knows the path of the file it picks up, but that's the path of the .json file. So if the filepath is stumptown/packaged/html/reference/elements/caption.json it can replace stumptown/packaged/ with https://github.com/mdn/stumptown-content/tree/master/content/ and turn /caption.json into /caption so you eventually end up with a URL like:
https://github.com/mdn/stumptown-content/tree/master/content/html/reference/elements/caption
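A minimal sketch of that path rewrite, assuming the packaged path always starts with stumptown/packaged/ and ends in .json. The function name packagedPathToGitHubUrl and the two constants are made up for illustration; nothing like them necessarily exists in the renderer.

```javascript
// Hypothetical helper: map a packaged JSON path to its content folder on GitHub.
const PACKAGED_PREFIX = "stumptown/packaged/";
const GITHUB_CONTENT_ROOT =
  "https://github.com/mdn/stumptown-content/tree/master/content/";

function packagedPathToGitHubUrl(filepath) {
  // Refuse anything that doesn't look like a packaged JSON file.
  if (!filepath.startsWith(PACKAGED_PREFIX) || !filepath.endsWith(".json")) {
    throw new Error(`Not a packaged JSON path: ${filepath}`);
  }
  // Drop the local prefix and the ".json" suffix, keep the folder structure.
  const relative = filepath.slice(PACKAGED_PREFIX.length, -".json".length);
  return GITHUB_CONTENT_ROOT + relative;
}
```

Calling it with stumptown/packaged/html/reference/elements/caption.json should give the URL shown above.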
However, it would be nice to do more. For example, next to an example it would be cool to have a link directly to the relevant example in stumptown-content. E.g. this
Truth be told, we could technically do a git ref lookup on the stumptown-content git submodule and from there figure out which exact version we're using. E.g. this but I'm not sure how helpful that is.
So, perhaps we can figure out all the relevant GitHub URLs based on the packaged JSON and present these links in the rendered output somehow. In other words, let's try to do something purely in the renderer first and see if it works out.
Forking https://github.com/mdn/stumptown-content/pull/207#issuecomment-548508388 by @ddbeck
Ideal would be something like this in the built .json files:
{
"type": "prose",
"source": {
"file": "/Users/peterbe/dev/MOZILLA/MDN/stumptown-content/content/html/reference/elements/video/video.md",
"line": 1234,
"column": 23
},
"value": {
"title": "See also",
"id": "see_also",
"content": "<ul>\n<li><a href=\"https://developer.mozilla.org/en-US/docs...
}
}
The renderer would need to be upgraded to handle both cases, because when doing local dev you probably want to open that file on your own machine, but in a production build you might want to convert the path to a github.com URL.
Ping @ddbeck
We just landed this: https://github.com/mdn/stumptown-renderer/pull/220
Now, if you've bothered to set up the EDITOR environment variable, the little debug information about which file you edited is now clickable and opens in your editor.
Let's now do something more with this. To start with, if we could add the absolute filepath to the prose sections in build-json, we could incorporate that into the dev server in stumptown-renderer and make any paragraph a "link" that opens the relevant file. We can worry about line number and column number later.
Perhaps we wait with examples because, in some other issue, we're discussing maybe moving them from being their own files to instead being inline in the Markdown.
Also, for development it can use this $EDITOR thing and for production, some day, it can be a way to link to the github.com repo.
Yeah, I can do this. I'll try to get ahead on my sprint tasks early next week and bring my attention back to this. Theoretically, it shouldn't be hard to use the unified API to get this into build-json.
Don't prioritize this work but don't put it all the way on the bottom of your todo list. The kind of amazing DX (or should I call it WX (Writer Experience)) this can unlock is genuinely exciting for all parts involved.
Naturally, this is a little harder than I first thought. Right now, build-json converts Markdown to HTML and then uses JSDOM to slice up the prose sections. This loses the source information. It's fixable though. Some notes on how we might do that (I still want to come back to this, but you're free to take this on, if you like—I'll max out on time this week soon).
We need to change the sequence of operations a bit:
- Process Markdown into a remark tree.
- In that processor, you'll have a little plugin that will slice up the prose into subtrees for each section. I think mdast-util-heading-range might make this relatively easy. Set aside these trees.
- For each section subtree, set aside the position (i.e., the line number etc. of the section heading in the original Markdown) and convert the subtree to HTML (with the remark-rehype processor). The HTML becomes the packaged value property and you add a new property (e.g., source) with the position object of the first node in the section subtree.

The built JSON looks like this, perhaps:
{
"type": "prose",
"value": "<!-- A blob of HTML -->",
"source": {
"start": { "line": 2, "column": 3, "offset": 4 },
"end": "…"
}
}
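To make the slicing step a bit more concrete, here is a rough sketch that works on an mdast-shaped tree, i.e. the kind of tree remark produces, where every node carries a position. It doesn't call remark or mdast-util-heading-range at all. The names sliceSections and headingText are made up, and treating depth-2 headings as section boundaries is an assumption. The conversion to HTML with remark-rehype is left out.

```javascript
// Sketch only: `tree` is assumed to be an mdast root node, as produced by
// the remark parser. Nothing here is an existing build-json function.

// Concatenate the text of a heading's inline children (text, inlineCode, ...).
function headingText(node) {
  return (node.children || [])
    .map((child) =>
      typeof child.value === "string" ? child.value : headingText(child)
    )
    .join("");
}

// Split the top-level nodes into sections, one per heading of `depth`.
// Each section keeps the heading's position object as its `source`.
// Nodes that appear before the first matching heading are ignored.
function sliceSections(tree, depth = 2) {
  const sections = [];
  let current = null;
  for (const node of tree.children) {
    if (node.type === "heading" && node.depth === depth) {
      current = { title: headingText(node), source: node.position, nodes: [] };
      sections.push(current);
    } else if (current) {
      current.nodes.push(node);
    }
  }
  return sections;
}
```

Each returned section has the heading's position in source, which is the object that would end up in the built JSON, and the nodes array is what would be handed to the HTML conversion.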
This gets us 99% of the way there. The good news is that I think all of this can be done in slice-prose.js alone, without changing the signature of packageProse. The bad news is that this lacks the path to the file: packageProse only gets a string of Markdown content and it doesn't know anything about the origin file. I haven't thought through how to address that at all.
Don't kill yourself over it. If it could become:
{
"type": "prose",
"value": "<!-- A blob of HTML -->",
"source": {
"url": "https://github.com/mdn/stumptown-content/blob/master/content/en-US/html/reference/elements/address/address.md"
}
}
Or, if process.env.NODE_ENV === 'development' this:
{
"type": "prose",
"value": "<!-- A blob of HTML -->",
"source": {
"file": "/Users/peterbe/dev/MOZILLA/MDN/stumptown-renderer/stumptown/content/en-US/html/reference/elements/address/address.md"
}
}
that in itself would be a big deal and lead to some really interesting ideas.
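Roughly, the switch between the two shapes could look like the sketch below. buildSource and its contentRoot parameter are hypothetical, and it assumes POSIX-style paths and that the local content folder mirrors the content/ folder of stumptown-content on master.

```javascript
const GITHUB_BLOB_ROOT =
  "https://github.com/mdn/stumptown-content/blob/master/content/";

// Hypothetical helper: `contentRoot` is the local folder that mirrors the
// repository's content/ directory. Assumes "/"-separated absolute paths.
function buildSource(absoluteFile, contentRoot, env = process.env.NODE_ENV) {
  if (env === "development") {
    // Local dev: keep the absolute path so the dev server can open the file.
    return { file: absoluteFile };
  }
  const root = contentRoot.endsWith("/") ? contentRoot : contentRoot + "/";
  if (!absoluteFile.startsWith(root)) {
    throw new Error(`${absoluteFile} is not inside ${contentRoot}`);
  }
  // Everything else: point at the same file on github.com.
  return { url: GITHUB_BLOB_ROOT + absoluteFile.slice(root.length) };
}
```

Taking env as a parameter, with process.env.NODE_ENV only as the default, keeps the function easy to test in both modes.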
Hmm, on closer inspection, I need to figure out a different way to handle the Markdown source in build-json. Right now, the Markdown gets cleaved from the frontmatter early on—which would break my proposed line numbers suggestion—and I'd need to add a path argument up and down build-json to pass around a file path. It's a heavy approach and doesn't really generalize well (what if I want additional metadata later? Add more arguments?).
I think a prerequisite here is to switch to using unified to process frontmatter—then I could pass around a vfile or syntax tree with all the metadata. I think I can reuse some code from the linter—I'll revisit this next week.
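To illustrate why carrying one file object around helps, here is a dependency-free sketch. It is not the vfile or unified API, just a plain object standing in for the idea. It keeps the path, the raw frontmatter and the line where the body starts, so line numbers found in the body can be mapped back to the original file. readSourceFile, originalLine and the property names are all made up.

```javascript
// Sketch only: a plain object standing in for a vfile-like container.
// Assumes frontmatter is delimited by "---" lines at the top of the file.
function readSourceFile(filePath, raw) {
  const lines = raw.split("\n");
  const file = { path: filePath, data: {}, body: raw, bodyStartLine: 1 };
  if (lines[0].trim() !== "---") return file; // no frontmatter at all
  const close = lines.findIndex((line, i) => i > 0 && line.trim() === "---");
  if (close === -1) return file; // unterminated frontmatter: leave untouched
  file.data.frontmatter = lines.slice(1, close).join("\n");
  file.body = lines.slice(close + 1).join("\n");
  file.bodyStartLine = close + 2; // 1-based line of the first body line
  return file;
}

// Translate a 1-based line number within the body back to the original file.
function originalLine(file, bodyLine) {
  return file.bodyStartLine + bodyLine - 1;
}
```

Extra metadata can then go on file.data without adding more arguments to every function in build-json.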