icu4x icon indicating copy to clipboard operation
icu4x copied to clipboard

Decide and implement formatting of generated code

Open sffc opened this issue 8 months ago • 11 comments
trafficstars

Follow-up to https://github.com/unicode-org/icu4x/pull/6297#discussion_r2000629914

We should figure out a solution for the formatting of generated code files, such as the datetime FFI code files.

There are a few general approaches:

  1. Format in cargo-make (status quo as of posting this issue)
  2. Format in the tool itself
  3. Do not format and add #[rustfmt::skip] to the generated files

These all have pros and cons.

@Manishearth @robertbastian

sffc avatar Mar 20 '25 02:03 sffc

We don't generate diplomat-generated code. I think it was @sffc who opposed running dart format and said that it should just try to be readable.

Files should be marked as //@generated regardless. Then we can decide whether to invoke rustfmt with format_generated_files.

robertbastian avatar Mar 20 '25 10:03 robertbastian

So generally the principle I follow is that generated code should not be unreadable. Running a formatter is an easy way to fix certain types of problems, for example Rust code generated via quote/syn is ugly unless you use prettyplease or rustfmt, which is what we do in datagen for baked data.

The generated code is not unreadable here: our templates are clean enough. So I think we do not have to format. In fact, formatting generated code like this might be counterproductive, having foo.bar(.......) in one situation and foo.bar(\n\t\t....,\n\t\t...,\n\t) in another is not great.

My preference is not formatting these, with skips on the module. If we do decide to format these, it should be handled by the tool itself.

Manishearth avatar Mar 20 '25 15:03 Manishearth

We don't need skips, just // @generated in the first line of each generated file.

robertbastian avatar Mar 20 '25 17:03 robertbastian

We don't generate diplomat-generated code. I think it was @sffc who opposed running dart format and said that it should just try to be readable.

My objection to dart format was not the formatting, but that I didn't want diplomat-gen to require dart to be installed (at a particular version, etc).

And, we do require dart code to be formatted, it seems, but this is enforced in test-dart. Whenever I run that command with unformatted code, it formats it and fails.

Files should be marked as //@generated regardless. Then we can decide whether to invoke rustfmt with format_generated_files.

This seems fine if rustfmt obeys // @generated.

sffc avatar Mar 20 '25 17:03 sffc

My objection to dart format was not the formatting, but that I didn't want diplomat-gen to require dart to be installed (at a particular version, etc).

That's the same reason why nightly and beta CI are currently failing, rustfmt is not installed.

And, we do require dart code to be formatted, it seems, but this is enforced in test-dart. Whenever I run that command with unformatted code, it formats it and fails.

That formats non-generated Dart code, i.e icu_test.dart.

robertbastian avatar Mar 20 '25 18:03 robertbastian

Files should be marked as //@generated regardless. Then we can decide whether to invoke rustfmt with format_generated_files.

That doesn't currently work

https://github.com/rust-lang/rustfmt/issues/5080

I get the following warning printed hundreds of times when I run cargo fmt with format_generated_files = false in my rustfmt.toml

Warning: can't set `format_generated_files = false`, unstable features are only available in nightly channel.

My ideal outcome is that we apply some, but not all, formatting to these files. For example, lines with trailing whitespace are just, like, pure filesystem bloat, and some editors render them in scary colors.

sffc avatar Mar 21 '25 05:03 sffc

The format_generated_files flag is unstable, but the stable behaviour is to not format generated files. This is what is already happening for baked data.

My ideal outcome is that we apply some, but not all, formatting to these files. For example, lines with trailing whitespace are just, like, pure filesystem bloat, and some editors render them in scary colors.

It's fairly easy to not generate code with trailing whitespace.

robertbastian avatar Mar 21 '25 08:03 robertbastian

The format_generated_files flag is unstable, but the stable behaviour is to not format generated files. This is what is already happening for baked data.

https://github.com/rust-lang/rustfmt/blob/master/Configurations.md says Default value: true

And when I tried it locally, the presence of @generated didn't cause the file to not be formatted in ffi/capi/src.

I thought baked data was solved because we store everything in .rs.data files.

My ideal outcome is that we apply some, but not all, formatting to these files. For example, lines with trailing whitespace are just, like, pure filesystem bloat, and some editors render them in scary colors.

It's fairly easy to not generate code with trailing whitespace.

Sorry. "we" was intended to mean "the authors of the jinja templates". My ideal outcome is that rustfmt can run on the generated code with certain settings and not create a diff.

sffc avatar Mar 21 '25 08:03 sffc

Ultimately I don't care much about trailing whitespace in jinja-generated files.

Manishearth avatar Mar 21 '25 20:03 Manishearth

The next steps here:

  1. Fix the templates to not emit trailing whitespace
  2. Figure out how to enforce this in CI

sffc avatar Apr 25 '25 00:04 sffc

Note for implementors: https://github.com/rust-diplomat/diplomat/pull/849 contains some techniques for making this happen, and a test.

Manishearth avatar Apr 25 '25 00:04 Manishearth