change transcript info string format
Right now transcripts use an info string that looks like language:flag:other-flag filename, roughly – e.g. ```unison:added-by-ucm scratch.u.
The first word of the info string is typically used to specify the language of the code sample, and rendered in the class attribute of the code tag. However, this spec does not mandate any particular treatment of the info string. —CommonMark Spec §4.5
The GFM spec says the same thing. Neither seems to define “word”, unfortunately, so it’s hard to say what it means. But, in practice it seems to be basically space-delimited. What that means is that with the current syntax we use in transcripts, we end up with
```unison:added-by-ucm
foo = 2 + 2
```
becoming
<pre>
<code class="language-unison:added-by-ucm">
foo = 2 + 2
</code>
</pre>
when it should contain <code class="language-unison">.
Despite GitHub not yet recognizing Unison, other places (e.g., my editor) can, but don’t highlight unison:added-by-ucm blocks, but my editor does highlight ```unison added-by-ucm blocks.
I propose that we change the syntax. The minimal change would be to replace the first : with a space, so ```unison added-by-ucm scratch.u … but there are already cases where we have the file name, but no flags provided, which would be ambiguous. So I wonder if we can come up with a more consistent set of flags that are amenable to Unison adding more in the future.
There are related changes suggested in #5199, in terms of what flags should exist.
It seems like there’s a light consensus toward something like
```ucm {hide=all}
scratch/main> builtins.merge
```
```unison {error=intentional file="skritch.u"}
foo = bar
```
A starting point for discussing the set of options:
resultcan be applied to any code block (currentlyapidoesn’t support it) and has the valuessuccess(the default)errorfailure(for #5350)ignored(calledskipin https://github.com/unisonweb/unison/issues/5199#issuecomment-2223658861)
hidecan be applied to any code block (currentlyapidoesn’t support it) and has the valuesnone(the default)outputall
filecan only be applied tounisonand is an arbitrary string (the quotes are optional, but necessary if there are spaces)generatedhas a boolean value (default isfalse) and corresponds to:added-by-ucm
In #5394, I added both :failure and :incorrect to the set of info string tags. It made me wonder if perhaps result in the proposal above should be split into two othogonal options:
expectError:bool(default isfalse)isFailing:bool(default isfalse)
So,
{isFailing=true}– stanza currently errors, but shouldn’t be{expectError=true}– stanza is intentionally erroring{expectError=true, isFailing=true}– stanza is expected to error, but currently isn’t{}(or no info attributes) – this is a standard successful stanza
This doesn’t quite mesh with the ignored result type, though – these attributes are just meaningless in that context.
To get back to basics, all that is really required for this is inserting a space, so that
``` unison:hide:error file
becomes
``` unison :hide:error file
We can make the space optional to start, which is a trivial change to the parser.
I still think a richer syntax would be good, but not necessary to fix the CommonMark issue.
(This is also another place we could add a warning – that the old syntax is deprecated.)
🤦🏼 I just realized that putting a space there is already allowed, just not shown or documented. The docs should definitely prfer the version with a space.
There are still some inconsistencies – like you can have a space in :hide :error in unison, but not in ucm. And you can never have a space in :hide:all. Plus the order is fixed – :hide has to come before :error.
How about
``` unison {shouldError} fails intentionally
``` unison {broken} fails but shouldn't fail
``` unison {shouldError, broken} should fail but doesn't fail
?
or even error / broken
Reading #5199 I wasn't clear on when you'd want ignore (and depending on the answer, whether that could be handled differently)
@sellout This is done, right?
The “necessary” part is done – allowing (and preferring) a space between the language and other tags.
What still is incomplete is
- allowing tags to be specified in any order and
- possibly changing the format.
The first one probably should just be a separate ticket.
For the second, the motivation is much weaker now than it was before, so probably worth closing and waiting for a better reason to change it (e.g., if some info string structure gets standardized or some tooling has built-in support for some other format).
Ok, #5494 now covers allowing tags to be in any order.
And since we don’t have real motivation to change the tag format for now, this can be closed.