
Respecting OCaml programming guidelines

Open gpetiot opened this issue 7 years ago • 4 comments

Link: https://ocaml.org/learn/tutorials/guidelines.html

  • [x] Delimiters:

    • [x] A space should always follow a delimiter symbol
      • Ofmt: by default; in addition, the field-space option can add a space to the left of a delimiter symbol (disabled by default)
    • [x] Spaces should surround operator symbols.
      • Ofmt: by default
  • [x] Tuples:

    • [x] A tuple is parenthesized and the commas therein (delimiters) are each followed by a space.
      • Ofmt: the parens-tuple option takes care of this; its default value is always
    • [x] Not adding parentheses can be accepted for pairs.
      • Ofmt: new option parens-tuple-patterns (#498)
    • [x] Also accepted when matching on several values simultaneously.
      • Ofmt: new option parens-tuple-patterns (#498)
  • [x] Lists:

    • [x] Write x :: l with spaces around the :: (since :: is an infix operator, hence surrounded by spaces)
      • Ofmt: by default
    • [x] Write [1; 2; 3] (since ; is a delimiter, hence followed by a space).
      • Ofmt: by default
  • [x] Symbol operators: Be careful to keep operator symbols well separated by spaces: not only will your formulas be more readable, but you will avoid confusion with multi-character operators. (Obvious exceptions to this rule: the symbols ! and . are not separated from their arguments.) Example: write x + 1 or x + !y.

    • Ofmt: by default
  • [x] Long character strings: Indent long character strings with the convention in force at that line plus an indication of string continuation at the end of each line (a \ character at the end of the line that omits white spaces on the beginning of next line).

    • Ofmt: the break-string-literals option takes care of this; it wraps by default and can be disabled
  • [ ] Indentation: The change in indentation between successive lines of the program is generally 1 or 2 spaces. Pick an amount to indent and stick with it throughout the program.

    • Ofmt: Not always true (issue #505)
  • [ ] Indentation of global let bindings:

    • [x] regular indentation of 1 or 2 spaces
      • Ofmt: 2 by default, could be an option
    • [ ] the body can be left-justified in the case of pattern-matching
      • Ofmt: could be an option: issue #265
    • [ ] the body can be justified just under the name of the defined function
      • Ofmt: could be an option, equivalent to the base option of ocp-indent (issue #476)
  • [x] Indentation of let ... in bindings:

    • [x] The expression following a definition introduced by let is indented to the same level as the keyword let, and the keyword in which introduces it is written at the end of the line. In the case of a series of let definitions, the preceding rule implies that these definitions should be placed at the same indentation level.
      • Ofmt: by default
    • [x] Variation: some write the keyword in alone on one line to set apart the final expression of the computation.
      • Ofmt: by default when the definition spans multiple lines; could be an option (issue #506)
  • [ ] Indentation of if ... then ... else ... (multiple branches):

    • [ ] Write conditions with multiple branches at the same level of indentation. If the sizes of the conditions and the expressions allow, the keyword else can end the line.
      • Ofmt: not especially readable but could be an option (issue #504)
    • [ ] If expressions in the branches of multiple conditions have to be enclosed (when they include statements for instance), the blocks start with then begin and end with end else.
      • Ofmt: done with parens by default, could be an option to use begin/end instead
    • [x] Some suggest another method for multiple conditionals, starting each line by the keyword else.
      • Ofmt: the if-then-else option defaults to compact; the value keyword-first starts each line with the keyword
  • [x] Indentation of if ... then ... else ... (single branches):

    • [x] In the case of delimiting the branches of a conditional, several styles are used: ( at the end of the line, or begin at the beginning of the line.
      • Ofmt: by default, ( at the end of the line, could be an option to use begin instead
    • [x] If cond, e1 and e2 are small, simply write them on one line.
      • Ofmt: if-then-else=compact by default
  • [x] Pattern-matching:

    • [x] All the pattern-matching clauses are introduced by a vertical bar, including the first one.
      • Ofmt: by default break-cases=fit but can be set to all
    • [x] Align all the pattern-matching clauses at the level of the vertical bar which begins each clause, including the first one.
      • Ofmt: by default
    • [x] If an expression in a clause is too large to fit on the line, you must break the line immediately after the arrow of the corresponding clause. Then indent normally, starting from the beginning of the pattern of the clause.
      • Ofmt: by default
    • [x] Arrows of pattern matching clauses should not be aligned.
      • Ofmt: not aligned at the moment, but could be an option in the future (https://github.com/ocaml-ppx/ocamlformat/issues/360)
  • [ ] match/try:

    • [x] For a match or a try align the clauses with the beginning of the construct
      • Ofmt: by default
    • [ ] Put the keyword with at the end of the line. If the preceding expression extends beyond one line, put with on a line by itself.
      • Ofmt: if the expressions following with fit on the line, with is not on a line by itself, could be an option (issue #507)
  • [x] Expressions inside clauses:

    • [x] If the expression on the right of the pattern matching arrow is too large, cut the line after the arrow.
      • Ofmt: by default
    • [x] Alternatively, the line can be cut after the arrow no matter the length of the pattern.
      • Ofmt: option break-cases=all (but also breaks or-cases)
    • [x] Careful alignment of the arrows of a pattern matching is considered bad practice.
      • Ofmt: not done
  • [x] Pattern matching in anonymous functions: Similarly to match or try, pattern matching of anonymous functions, starting by function, are indented with respect to the function keyword.

    • Ofmt: by default
  • [x] Pattern matching in named functions:

    • [x] Pattern-matching in functions defined by let or let rec gives rise to several reasonable styles which obey the preceding rules for pattern matching (the one for anonymous functions being evidently excepted).
      • Ofmt: by default
    • [x] Don't indent under the keyword match or function which has previously been pushed to the right, indent the line under the let keyword.
      • Ofmt: by default
  • [x] Indentation of the function's name:

    • [x] You must indent the expressions with respect to the name of the function (1 or 2 spaces according to the chosen convention).
      • Ofmt: by default
    • [x] Write small arguments on the same line, and change lines at the start of an argument.
      • Ofmt: by default
  • [x] Indentation of the operations: When an operator takes complex arguments, or in the presence of multiple calls to the same operator, start the next line with the operator, and don't indent the rest of the operation.

    • Ofmt: by default
  • [x] How to delimit constructs in programs: When it is necessary to delimit syntactic constructs in programs, use as delimiters the keywords begin and end rather than parentheses. However using parentheses is acceptable if you do it in a consistent, that is, systematic, way. This explicit delimiting of constructs essentially concerns pattern-matching constructs or sequences embedded within if then else constructs.

    • Ofmt: parentheses are used instead of begin/end; using begin/end could be an option (issue #491)
  • [x] match construct in a match construct: When a match ... with or try ... with construct appears in a pattern-matching clause, it is absolutely necessary to delimit this embedded construct (otherwise subsequent clauses of the enclosing pattern-matching construct will automatically be associated with the enclosed pattern-matching construct).

    • Ofmt: by default
  • [x] Sequences inside branches of if: A sequence which appears in the then or else part of a conditional must be delimited.

    • Ofmt: by default (with parentheses)
  • [x] Opening modules: Avoid open directives, using instead the qualified identifier notation.

    • Ofmt: by default the let-open option preserves the initial style, but its short value forces the qualified identifier notation
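
To make a few of the checklist items concrete, here is a small sketch in OCaml. It is hand-formatted to follow the guidelines as quoted above and is not actual ocamlformat output; every name in it (pair, numbers, counter, sum_of_squares, length, add_options, log_if_positive) is hypothetical and chosen only for illustration.

```ocaml
(* Hand-formatted illustrations of a few checklist items.
   All names are hypothetical; this is not ocamlformat output. *)

(* Delimiters, tuples and lists: a space follows each delimiter,
   and spaces surround infix operators such as [::]. *)
let pair = (1, 2)
let numbers = 0 :: [1; 2; 3]

(* Symbol operators: spaces around [+], none after [!]. *)
let counter = ref 41
let next = 1 + !counter

(* let ... in: successive definitions share one indentation level
   and the keyword [in] ends the line. *)
let sum_of_squares x y =
  let x2 = x * x in
  let y2 = y * y in
  x2 + y2

(* Pattern matching: every clause starts with a vertical bar, clauses
   are aligned on the bar, and the arrows are not aligned. *)
let rec length = function
  | [] -> 0
  | _ :: rest -> 1 + length rest

(* A match nested in a clause is delimited; otherwise the final
   [None] clause would attach to the inner match. *)
let add_options outer inner =
  match outer with
  | Some x ->
    begin match inner with
    | Some y -> x + y
    | None -> x
    end
  | None -> 0

(* A sequence in a branch of [if] is delimited. *)
let log_if_positive log n =
  if n > 0 then begin
    log := "positive" :: !log;
    true
  end else false
```

The nested match and the if branch use begin/end, as the guidelines prefer; per the notes above, ocamlformat would use parentheses in both places.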

gpetiot, Nov 12 '18 10:11

One thing that needs to be kept in mind here is that these guidelines were written in the context of "guidelines" for human authors, where bad/confusing corner cases could be avoided by selectively not following the guidelines where necessary. Options to implement rules that follow these guidelines will still need to be evaluated on a lot of code to see what corner cases come up.

jberdine, Nov 15 '18 09:11

Another thing to bear in mind is that these guidelines are not set in stone. If there is some corner case that greatly simplifies ocamlformat (or just a genuine departure from them due to an evolution of the language), then we can submit recommended changes upstream to ocaml.org.

avsm, Nov 20 '18 10:11

Agreed.

Also, there are cases where formatting code in a particular way is very onerous and tedious to do manually, but can still be more clearly legible / informative of how the parser interprets the code. One of the benefits of an auto-formatter is the ability to depart from guidelines for manual formatting in such cases.
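
One possible illustration of layout that follows the parse (an example of my own choosing, not one taken from this thread): an undelimited sequence after then. The names trace, record, misleading and explicit are hypothetical.

```ocaml
(* Hypothetical example: how the parser reads an undelimited
   sequence after [then]. *)
let trace = ref []

let record msg = trace := msg :: !trace

(* On one line, it is easy to misread [record "b"] as part of the
   [then] branch. *)
let misleading c =
  if c then record "a"; record "b"

(* The same program laid out to follow the parse: the conditional
   ends at the semicolon, so [record "b"] always runs. *)
let explicit c =
  (if c then record "a");
  record "b"
```

Both functions behave identically; only the second layout makes the grouping visible, and adding those parentheses and line breaks by hand everywhere is the kind of tedious work a formatter can take over.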

jberdine, Nov 20 '18 11:11

Could we get examples and counter-examples for each bullet point? Please note that I landed here from https://discuss.ocaml.org/t/ocamlformat-and-the-ocaml-org-coding-guidelines/2922 and I'm only here to get a quick glance.

For example, in the List section:

do:
  [1; 2; 3]
don't:
  [ 1; 2; 3 ]
  [1;2;3]

So if there's a weird formatting rule, it will stand out. Thanks!

mjambon, Nov 20 '18 17:11