rfcs icon indicating copy to clipboard operation
rfcs copied to clipboard

RFC: Add descriptive names to doctests

Open casey opened this issue 3 years ago • 23 comments

I wasn't sure if this needed an RFC or not. It's a rather minor feature, however, it is a user-visible change, and since I've already written one, I thought I'd open a PR. Feel free to close it if it doesn't require an RFC.

I did some supporting work on this in https://github.com/rust-lang/rust/pull/78429, which made code block info string parsing only split on , and whitespace, so that an attribute of the form name=foo would be possible, but never followed up with the actual RFC.

This RFC proposes the ability to annotate documentation test code blocks with metadata of the form name=NAME. The given name will be used to identify the documentation test when it is run by the test runner, in addition to the current information of test binary, module, and line number.

For example, this documention test:

```name=arithmetic
assert_eq!(2 + 2, 4);
```

Assuming that it is in crate foo, in file src/lib.rs, in module bar, on line 37, will produce this output when run:

   Doc-tests foo

running 1 test
test src/lib.rs - bar::arithmetic (line 37) ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

Rendered

casey avatar Sep 04 '22 19:09 casey

Updated and addressed comments. Thank you for the feedback!

  • Mentioned test filtering
  • Remove stray note that NAME be a valid identifier
  • Rename IDENT to NAME

A previous version of this RFC required that doctest names be valid rust identifiers. I think this might be a good idea, but someone wasn't convinced. I think it's probably a reasonable restriction, and means that you can use the identifiers as names for the generated test binaries/functions/etc, so it's something that I'd like feedback on. However, I wanted the text of the RFC to be consistent, so I removed references to that restriction for now.

casey avatar Sep 22 '22 04:09 casey

Thanks for the feedback! I added the requirement that names be valid rust identifiers, and used spaces to separate attributes in the examples instead of commas.

casey avatar Jan 25 '23 05:01 casey

Friendly ping!

casey avatar Apr 30 '24 20:04 casey

This probably needs re-thought through to match up with https://github.com/rust-lang/rust/pull/110800. I'll start FCP if that's done, since this change otherwise looks good.

notriddle avatar Apr 30 '24 20:04 notriddle

cc @GuillaumeGomez

Are commas the preferred way (or only?) way to separate annotations on a code block? I used spaces originally, since they were accepted by the parser, but I'm not sure if that changed.

I think that's the only change here that needs to be made.

casey avatar Apr 30 '24 22:04 casey

Oh damn, completely forgot to stabilize this feature. Gonna send a PR for that.

Commas are the only character to separate annotations currently. To add your feature, I think you just need to check for name in here. Once done and ready, please write a comment here so we review it one last time. 👍

GuillaumeGomez avatar May 01 '24 12:05 GuillaumeGomez

@GuillaumeGomez Nice! Okay, updated the RFC to use commas in the example about separating attribute names. I think this is ready to go.

casey avatar May 15 '24 00:05 casey

Considering https://github.com/rust-lang/rust/pull/124577 is about to get stabilized, I'm not sure if name is good enough now (is name supposed to be an attribute or a doctest name? Can't know unless you know very well how it works).

Another thing to note is that you will be able to add attributes to generated codeblocks with this syntax: custom,{name=something}, meaning that in your case, name here would not name the doctest but instead be an attribute generated in the HTML. So better underline this case in the RFC.

In any case, I think renaming name into doctest_name would be better. What do you think?

GuillaumeGomez avatar May 15 '24 14:05 GuillaumeGomez

Another thing to note is that you will be able to add attributes to generated codeblocks with this syntax: custom,{name=something}, meaning that in your case, name here would not name the doctest but instead be an attribute generated in the HTML. So better underline this case in the RFC.

The docs don't say that, and the code seems to warn on "unsupported attributes" without storing them anywhere, so I don't think that's true.

If that PR is supposed to add this feature, it'll need fixed and should probably be pointed out to everyone else.

(Also, I don't see a problem with name being an attribute and a doctest name at the same time. But that's not really relevant.)

In any case, I think renaming name into doctest_name would be better. What do you think?

I think that's a bad idea. The custom classes should be verbose, because it's a feature intended to cover a corner case. This feature, however, is something we want people to use all the time. It should be terse and simple.

notriddle avatar May 15 '24 15:05 notriddle

Good catch, you can see that key-values only only certain values here.

GuillaumeGomez avatar May 15 '24 15:05 GuillaumeGomez

I personally think name is better, just because it's short, and the people who use this feature will likely name every doctest, so they'll be writing out the attribute a bunch of times. Another option would be test=NAME, which is also short.

casey avatar May 15 '24 21:05 casey

So, if we're essentially on board with calling it name, I'll start the FCP on that one.

@rfcbot fcp merge

notriddle avatar May 15 '24 23:05 notriddle

Team member @notriddle has proposed to merge this. The next step is review by the rest of the tagged team members:

  • [x] @GuillaumeGomez
  • [x] @Manishearth
  • [x] @Nemo157
  • [ ] @aDotInTheVoid
  • [ ] @camelid
  • [x] @fmease
  • [ ] @jsha
  • [x] @notriddle

Concerns:

  • braced-unbraced (https://github.com/rust-lang/rfcs/pull/3311#issuecomment-2119175785)
  • hard-errors (https://github.com/rust-lang/rfcs/pull/3311#issuecomment-2119175785)
  • interoperability (https://github.com/rust-lang/rfcs/pull/3311#issuecomment-2121408221)
  • ~~raw-idents~~ resolved by https://github.com/rust-lang/rfcs/pull/3311#issuecomment-2119238580

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

rfcbot avatar May 15 '24 23:05 rfcbot

One thing to note is that the eBNF doesn't currently allow to have key=value outside of curly braces since we followed pandoc syntax. I suppose it's ok since @notriddle approved. ;)

GuillaumeGomez avatar May 16 '24 09:05 GuillaumeGomez

Or put it inside the braces. I just didn’t catch that.

notriddle avatar May 16 '24 12:05 notriddle

Then name is a valid element attribute, so I'm not sure we should keep using name for this case as it would also modify the generated HTML documentation.

GuillaumeGomez avatar May 16 '24 12:05 GuillaumeGomez

This is a very nice feature!

@rfcbot concern braced-unbraced

As mentioned above, we need to decide if we'd like extend the code block attr grammar to support bare name=IDENT or if we would like to use/extend the existing[^1] pandoc syntax {...,name=IDENT,...}.

@rfcbot concern raw-idents

I guess it's implied by the use of the word Rust identifier but I'd like to raise awareness that this includes raw identifiers (prefix: r#) which is good. I think we should explicitly mention this in the RFC text.

@rfcbot concern hard-errors

Multiple name=IDENT words may not appear in a single code block's info string.

I'd like to see this clarified. I'm pretty sure we don't want to emit a hard error for duplicate names but merely a warn-by-default rustdoc lint for backward compatibility.

Edit: The same applies to other syntax errors (e.g., RHS not being a valid ident).

@rfcbot reviewed

[^1]: Currently feature-gated under custom_code_classes_in_docs

fmease avatar May 19 '24 09:05 fmease

Thank you for the review!

@rfcbot concern braced-unbraced

As mentioned above, we need to decide if we'd like extend the code block attr grammar to support bare name=IDENT or if we would like to use/extend the existing[^1] pandoc syntax {...,name=IDENT,...}.

I personally would prefer the unbraced name=IDENT syntax. Correct me if I'm wrong, but the pandoc braced syntax is only used for adding classes to the generated HTML, and it's also rather verbose.

@rfcbot concern raw-idents

I guess it's implied by the use of the word Rust identifier but I'd like to raise awareness that this includes raw identifiers (prefix: r#) which is good. I think we should call this out explicitly mention this in the RFC text.

I noted that raw identifiers are allowed.

@rfcbot concern hard-errors

Multiple name=IDENT words may not appear in a single code block's info string.

I'd like to see this clarified. I'm pretty sure we don't want to emit a hard error for duplicate names but merely a warn-by-default rustdoc lint for backward compatibility.

In the same vein, should name=NOT_AN_IDENT warn-by-default as well, and not be used as a test name?

casey avatar May 19 '24 10:05 casey

In the same vein, should name=NOT_AN_IDENT warn-by-default as well, […]?

Yes indeed, I forgot to mention that.

https://github.com/rust-lang/rfcs/pull/3311/commits/7f90cc4781a421d5bc5a26b486c1909a593b8602

@rfcbot resolve raw-idents

fmease avatar May 19 '24 13:05 fmease

should [IDENT in name=IDENT] not be used as a test name?

That's a separate point. And I honestly don't know if it's feasible or worth impl'ing this, namely checking that doctest names are unique within the "#[test] namespace" of the current module (including other doctest names). Yeah, maybe we should actually check this and emit a warn-by-default lint on name clashes.

fmease avatar May 19 '24 13:05 fmease

That's a separate point. And I honestly don't know if it's feasible or worth impl'ing this, namely checking that doctest names are unique within the "#[test] namespace" of the current module (including other doctest names). Yeah, maybe we should actually check this and emit a warn-by-default lint on name clashes.

Just updated the RFC. If the idea is that name=IDENT doctests will potentially be instantiated as #[test] fn test() { … } items, then they should both be unique across doctests, but also unique across all items in the current namespace.

casey avatar May 20 '24 20:05 casey

Wait a minute. It reverted back to commas?

@rfcbot concern interoperability

We seem to be going in circles.

The original commit, https://github.com/rust-lang/rfcs/pull/3311/commits/7641de1ba91ab75369c4b0c616032e84959165df, from 2022, used the syntax rust,name=foo.

I objected to this syntax, citing a Babelmark run that showed most CommonMark engines interpreting this as adding <pre><code class="rust,name=foo">, when it's supposed to be class="rust". This matters because people use Rustdoc to execute doctests from external markdown files, even when they're using mdBook or GitHub to view them.

In https://github.com/rust-lang/rfcs/pull/3311/commits/cc3c07a46e8f330f4d843ee6c6ece24f544649eb, this was resolved by separating the name= attribute from the class with a space (which achieves roughly the same interoperability as Pandoc's syntax).

This syntax needs to be designed so that most markdown renderers treat it as separate from the language tag. Allowing commas is fine, but using a space to separate the name from the language needs to work.

notriddle avatar May 20 '24 23:05 notriddle

Commas are the only character to separate annotations currently.

That doesn't seem to be true.

https://github.com/rust-lang/rust/blob/b92758a9aef1cef7b79e2b72c3d8ba113e547f89/src/librustdoc/html/markdown/tests.rs#L286

notriddle avatar May 20 '24 23:05 notriddle