unison change transcript info string format

Right now transcripts use an info string that looks like language:flag:other-flag filename, roughly – e.g. ```unison:added-by-ucm scratch.u.

The first word of the info string is typically used to specify the language of the code sample, and rendered in the class attribute of the code tag. However, this spec does not mandate any particular treatment of the info string. —CommonMark Spec §4.5

The GFM spec says the same thing. Neither seems to define “word”, unfortunately, so it’s hard to say what it means. But, in practice it seems to be basically space-delimited. What that means is that with the current syntax we use in transcripts, we end up with

```unison:added-by-ucm
foo = 2 + 2
```

becoming

<pre>
  <code class="language-unison:added-by-ucm">
    foo = 2 + 2
  </code>
</pre>

when it should contain <code class="language-unison">.

Despite GitHub not yet recognizing Unison, other places (e.g., my editor) can, but don’t highlight unison:added-by-ucm blocks, but my editor does highlight ```unison added-by-ucm blocks.

I propose that we change the syntax. The minimal change would be to replace the first : with a space, so ```unison added-by-ucm scratch.u … but there are already cases where we have the file name, but no flags provided, which would be ambiguous. So I wonder if we can come up with a more consistent set of flags that are amenable to Unison adding more in the future.

Jul 11 '24 20:07 sellout

There are related changes suggested in #5199, in terms of what flags should exist.

Jul 11 '24 20:07 sellout

It seems like there’s a light consensus toward something like

```ucm {hide=all}
scratch/main> builtins.merge
```

```unison {error=intentional file="skritch.u"}
foo = bar
```

A starting point for discussing the set of options:

result can be applied to any code block (currently api doesn’t support it) and has the values
- success (the default)
- error
- failure (for #5350)
- ignored (called skip in https://github.com/unisonweb/unison/issues/5199#issuecomment-2223658861)
hide can be applied to any code block (currently api doesn’t support it) and has the values
- none (the default)
- output
- all
file can only be applied to unison and is an arbitrary string (the quotes are optional, but necessary if there are spaces)
generated has a boolean value (default is false) and corresponds to :added-by-ucm

Oct 03 '24 05:10 sellout

In #5394, I added both :failure and :incorrect to the set of info string tags. It made me wonder if perhaps result in the proposal above should be split into two othogonal options:

expectError: bool (default is false)
isFailing: bool (default is false)

So,

{isFailing=true} – stanza currently errors, but shouldn’t be
{expectError=true} – stanza is intentionally erroring
{expectError=true, isFailing=true} – stanza is expected to error, but currently isn’t
{} (or no info attributes) – this is a standard successful stanza

This doesn’t quite mesh with the ignored result type, though – these attributes are just meaningless in that context.

Oct 07 '24 16:10 sellout

To get back to basics, all that is really required for this is inserting a space, so that

``` unison:hide:error file

becomes

``` unison :hide:error file

We can make the space optional to start, which is a trivial change to the parser.

I still think a richer syntax would be good, but not necessary to fix the CommonMark issue.

(This is also another place we could add a warning – that the old syntax is deprecated.)

Oct 07 '24 20:10 sellout

🤦🏼 I just realized that putting a space there is already allowed, just not shown or documented. The docs should definitely prfer the version with a space.

There are still some inconsistencies – like you can have a space in :hide :error in unison, but not in ucm. And you can never have a space in :hide:all. Plus the order is fixed – :hide has to come before :error.

Oct 07 '24 21:10 sellout

How about ``` unison {shouldError} fails intentionally

``` unison {broken} fails but shouldn't fail

``` unison {shouldError, broken} should fail but doesn't fail ? or even error / broken

Reading #5199 I wasn't clear on when you'd want ignore (and depending on the answer, whether that could be handled differently)

Oct 11 '24 16:10 aryairani

@sellout This is done, right?

Dec 07 '24 18:12 aryairani

The “necessary” part is done – allowing (and preferring) a space between the language and other tags.

What still is incomplete is

allowing tags to be specified in any order and
possibly changing the format.

The first one probably should just be a separate ticket.

For the second, the motivation is much weaker now than it was before, so probably worth closing and waiting for a better reason to change it (e.g., if some info string structure gets standardized or some tooling has built-in support for some other format).

Dec 07 '24 19:12 sellout

Ok, #5494 now covers allowing tags to be in any order.

And since we don’t have real motivation to change the tag format for now, this can be closed.

Dec 10 '24 00:12 sellout