graph-prototype icon indicating copy to clipboard operation
graph-prototype copied to clipboard

clang-format-18 settings discussion

Open drslebedev opened this issue 10 months ago • 9 comments

In this issue, we aim to evaluate and decide on the code formatting style to be used for GNURadio-4.0.

I have tested all the preconfigured formats: LLVM, GNU, Google, Chromium, Microsoft, Mozilla, and WebKit. Additionally, I have prepared a custom format. The tested files are attached to the issue.

The primary goal was to create vertically dense code. If it's feasible to express something as a one-liner, we should do so, rather than spreading the code across multiple lines.

Also if possible use standard formats with minimum modifications.

This is the .clang-format file for predefined formats:

BasedOnStyle: LLVM # Options: LLVM, GNU, Google, Chromium, Microsoft, Mozilla, WebKit

.clang-format file for custom format :

BasedOnStyle: LLVM
AllowShortCaseLabelsOnASingleLine: true
ColumnLimit: 2000
AlignAfterOpenBracket: DontAlign
BreakBeforeBraces: Attach
IndentWidth: 4
AccessModifierOffset: -4
IndentRequiresClause: false

Let's discuss which style best fits our needs for GNURadio-4.0.

https://gist.github.com/drslebedev/c9456aba6a3a29caaf33bcf8d1da6768

drslebedev avatar Apr 25 '24 14:04 drslebedev

I don't have any problems with this. I looked at the link above to compare the different formats. I think I prefer the chromium format the most. The custom format is more compact than I would like. I don't have a strong opinion since an automated tool will enforce compliance.

jsallay avatar Apr 29 '24 15:04 jsallay

The custom formats have some very long lines. In the Gist you shared I need to use the horizontal scroll bar to see the end of some lines, but the horizontal scroll bar is at the bottom of the file, so it's rather inconvenient. This is just an example of why long lines can be problematic, but I imagine that other editors/tools/people might find their own problems with them.

daniestevez avatar Apr 29 '24 15:04 daniestevez

Thank you very much for your feedback.

I would like to clarify an important point concerning the length of code lines. Our goal is not to deliberately write long lines of code! Rather we aim to keep code lines short. We strive to adopt a formatting style that is sensitive to the context of the code, allowing for longer lines when they enhance clarity and fit the context appropriately.

It's important to note that reviewers can always request line breaks if they find a line excessively long. This flexibility is a key part of our review process.

Additionally, we have a practical method for breaking lines when needed. By inserting a comment marker // at the end of a line, we can effectively indicate where a line should be broken. On the other hand, setting a rigid column limit restricts our ability to prevent automatic line breaks by clang-format, unless we explicitly disable and re-enable formatting with clang-format off/on commands.

Additionally, a very long line of code often points to underlying problems, like a complex function or one with too many parameters. It's usually helpful to review and possibly simplify or refactor such cases. Whenever possible try to keep lines short which makes it easier to maintain and read!

drslebedev avatar Apr 30 '24 10:04 drslebedev

Same as @daniestevez here. _custom line 10 is too long to be readable, an autoformatting should absolutely have broken that line:

template <typename TInput, typename TOutput = std::conditional<gr::meta::complex_like<TInput>, TInput, std::complex<typename TInput::value_type>>>

Same for L 48

I'd advocate for sticking with chromium, LLVM or Mozilla, just to avoid these overly long lines, and to not be "unusual" for experienced external readers.

Overall, I think having to mentally combine multiple lines into one expression is less problematic than having to scroll horizontally, or having to insert manual breaks using comment lines. That's a cure worse than the disease.

Additionally, a very long line of code often points to underlying problems,

Wait, are you now arguing for or against "vertically dense code"? ;-)

Honestly, I don't care too much, but I don't think "dense" is an important goal at modern screen sizes. "Easy to read, even for people who might not be that familiar with ultra-modern C++ and nested templates, as employed by the graph-proto" on the other hand ranks very high – so, if anything, I'd advocate against vertically dense.

marcusmueller avatar Apr 30 '24 21:04 marcusmueller

Thank you for the robust discussion on our code formatting approach for GNURadio-4.0. Staying clear of bike-shedding, I’d like to underscore the importance of maintaining “reasonably” dense vertical code and advocate for a flexible, context-driven approach to line lengths rather than rigid, automated limits.

Here are the main arguments for this approach:

  1. Strategic Line Breaks and High-Level Intent: our code should balance cognitive load and emphasise architectural and algorithmic intent. Compact, horizontally aligned code leverages our natural reading habits for intuitive understanding (System 1 thinking), while strategic vertical breaks act as cues that signal important transitions or logical sections that should prompt more attention (System 2 thinking). For example, constructs on the right-hand side, such as trailing return types and complex template expressions (mentioned above), are often less critical for immediate comprehension and should thus be de-emphasised in favour of maintaining logical continuity. This focus should be directed towards the big picture of the high-level algorithmic intent in code reviews, rather than getting bogged down in syntactic precision that compilers can handle better than humans.

  2. Context Sensitivity and Logical Flow: not all coding guidelines are binary decisions, especially those concerning line lengths. Clarity and readability depend on the specific context of the code, which automated tools like clang-format fail to capture. These tools often enforce line breaks that disrupt logical flow, obscuring the code’s intent or introducing additional complications, where human reviewers would easily recognise these nuances. Instead of merely breaking long lines, the goal should be to refactor/simplify these through well-named constexpr variables, lambdas, or functions — particularly -- if there are already STL algorithms for that purpose.

  3. Leveraging Tools for Syntax, Not Semantics: modern development environments can easily handle long lines visually, without altering the underlying code. However, the opposite does not hold true, and IDEs usually cannot inline simple expressions that do not merit a multi-line statement. Moreover, while CI, unit tests, and static analysis tools manage syntactic minutiae effectively, they lack the capability to assess complex logical constructs.

A formatting approach that prioritises readability, context, and logical consistency is more beneficial than one strictly dictated by automated tools. Let’s aim to use these tools as aids not as arbiters.

RalphSteinhagen avatar May 01 '24 09:05 RalphSteinhagen

In addition to what @drslebedev already provided: I propose to also column align types, field/variable names, and value assignments within blocked declarations.

This column alignment formatting style greatly enhances readability by making common structures more apparent and simplifying the visual scanning of code. It not only simplifies the scanning and understanding of relationships between variables but also facilitates the quick recognition of patterns and related variables. This is particularly useful in sections with numerous declarations and helps in the easier detection of anomalies such as missing/wrong types or values, thus reducing errors and streamlining code maintenance.

Proposed clang-format Settings:

Here are the suggested adjustments to our .clang-format configuration to support this style:

AlignConsecutiveAssignments: true
AlignConsecutiveDeclarations: true
AlignConsecutiveMacros: true
MaxEmptyLinesToKeep: 1

Example Comparison:

N.B. the Annotated<...> wrapper is needed to have compile-time definitions and localised documentation (code being the single source of truth) that is/can be used, for example, for additional type-, range-, and unit-checks, and also in the UI.

without column alignment:

using base_t = Block<Derived, Arguments...>;
using derived_t = Derived;
using ArgumentsTypeList = typename gr::meta::typelist<Arguments...>;
using block_template_parameters = meta::typelist<Arguments...>;
using Resampling = ArgumentsTypeList::template find_or_default<is_resampling_ratio, ResamplingRatio<1UL, 1UL, true>>;
using StrideControl = ArgumentsTypeList::template find_or_default<is_stride, Stride<0UL, true>>;
using AllowIncompleteFinalUpdate = ArgumentsTypeList::template find_or_default<is_incompleteFinalUpdatePolicy, IncompleteFinalUpdatePolicy<IncompleteFinalUpdateEnum::DROP>>;
using DrawableControl = ArgumentsTypeList::template find_or_default<is_drawable, Drawable<UICategory::None, "">>;
constexpr static bool blockingIO = std::disjunction_v<std::is_same<BlockingIO<true>, Arguments>...> || std::disjunction_v<std::is_same<BlockingIO<false>, Arguments>...>;

template<typename U, gr::meta::fixed_string description = "", typename... Arguments>
using A = Annotated<U, description, Arguments...>;

A<std::string, "filter", Visible, Doc<"syntax: '[<start trigger name>/<ctx1>, <stop trigger name>/<ctx2>]'">> filter;
A<gr::Size_t, "n samples pre", Visible, Doc<"number of pre-trigger samples">> n_pre = 30U;
A<gr::Size_t, "n samples post", Visible, Doc<"number of post-trigger samples">> n_post = 30U;
A<gr::Size_t, "n samples max", Doc<"maximum number of samples (0: infinite)">> n_max = 0U;

with column alignment:

using base_t                     = Block<Derived, Arguments...>;
using derived_t                  = Derived;
using ArgumentsTypeList          = typename gr::meta::typelist<Arguments...>;
using block_template_parameters  = meta::typelist<Arguments...>;
using Resampling                 = ArgumentsTypeList::template find_or_default<is_resampling_ratio, ResamplingRatio<1UL, 1UL, true>>;
using StrideControl              = ArgumentsTypeList::template find_or_default<is_stride, Stride<0UL, true>>;
using AllowIncompleteFinalUpdate = ArgumentsTypeList::template find_or_default<is_incompleteFinalUpdatePolicy, IncompleteFinalUpdatePolicy<IncompleteFinalUpdateEnum::DROP>>;
using DrawableControl            = ArgumentsTypeList::template find_or_default<is_drawable, Drawable<UICategory::None, "">>;
constexpr static bool blockingIO = std::disjunction_v<std::is_same<BlockingIO<true>, Arguments>...> || std::disjunction_v<std::is_same<BlockingIO<false>, Arguments>...>;

template<typename U, gr::meta::fixed_string description = "", typename... Arguments>
using A = Annotated<U, description, Arguments...>;

A<std::string, "filter", Visible, Doc<"syntax: '[<start trigger name>/<ctx1>, <stop trigger name>/<ctx2>]'">> filter;
A<gr::Size_t, "n samples pre", Visible, Doc<"number of pre-trigger samples">>                                 n_pre  = 30U;
A<gr::Size_t, "n samples post", Visible, Doc<"number of post-trigger samples">>                               n_post = 30U;
A<gr::Size_t, "n samples max", Doc<"maximum number of samples (0: infinite)">>                                n_max  = 0U;

RalphSteinhagen avatar May 01 '24 09:05 RalphSteinhagen

Just a personal preference here, but I like without column alignment better. With column alignment, in using base_t [...] = Block <Derived, Arguments...> my eyes have to scan forward a significant amount to read the value and also hope that they don't jump one line up or down by mistake (which could happen when skipping a significant amount of spaces). I've seen column alignment more commonly used in hardware design languages, but never really understood why people insist on doing it (specially since sometimes this formatting is done by hand instead of with automated tools such as clang-format). To me the code formatted with column alignment definitely looks prettier, but seems less readable in practice.

daniestevez avatar May 01 '24 09:05 daniestevez

I agree with @daniestevez about finding the without column alignment easier to read.

Also I think a compromise position may be in order for the max line lengths. I agree that having a max that is too short (such as 80 chars) can greatly inhibit readability, but having no line length can also be difficult. I copy and pasted the custom example into VS Code, which is a by all measures a modern IDE and I don't see it doing anything to help me process the really long lines. I still have to scroll horizontally back and forth.

I think we can pick a reasonable value, that balances both concerns. I would throw out something between 120-150 characters per line.

jsallay avatar May 01 '24 12:05 jsallay

@jsallay the argument is not about having 2000 character long lines but that this should not be automatically enforced by tools such as clang-format but based on the code context and human review.

For example, if one chooses 120 characters (as an arbitrary enforced limit), there are and will be cases where a line (due to nested indentation etc.) will be 121 characters with its trailing ;, code (or CI/sanitizer) comments, or trailing templated return types. In practice, many developers opt for using auto to simplify code, particularly in user-oriented scripts, where the type isn't important for the human algorithmic understanding, however, cascaded/constraint template names are important for applying type-safety in the core lib-code. Thus trailing return types and constraints ... the unimportant bits are at the back.

The rationale behind the 'column alignment' is that one can more easily distinguish type, variable name, and assigned value.

For example: in the real-world 'without column alignment' example above ... my bet is ... that anybody will find it hard to identify the variable along the annotated meta-info, which becomes even harder if one uses templated class constructors instead of assignment operators (or both).

The goal should be to adapt our tools and practices to the context of the code and its nuanced intended readability and maintainability, rather than adhering strictly to arbitrary rules.

RalphSteinhagen avatar May 01 '24 19:05 RalphSteinhagen