tuareg icon indicating copy to clipboard operation
tuareg copied to clipboard

Insane SMIE indentation

Open drvink opened this issue 6 years ago • 10 comments

This has been driving me berserk ever since the change to the SMIE-based parser (I was using the old one because it didn't have this issue, but after it was removed, I didn't feel like maintaining yet another personal fork of some mode):

wtf-tuareg

Why does tuareg insist on typing on the wrong side of the road? sml-mode is similarly broken (in fact I can't think of a mode that hasn't gotten worse after adopting SMIE). I'm using Emacs 25.3 and tuareg-mode 20180325.47.

drvink avatar Apr 22 '18 05:04 drvink

This is just because it “thinks” you will continue the previous expression. For example, in your second case:

let x = 1
        + 1

Chris00 avatar Apr 22 '18 08:04 Chris00

That is a technically accurate interpretation, but I don't know a single indentation-aware Emacs major mode--any editor, actually--that behaves this way. The default "smart" indentation behavior that is typical is:

(* when the user hits return/enter... *)
match cursor_state with
| Block_starting -> indent ()
| Incomplete_expr -> maybe_indent () (* or just use previous indent *)
| Block_ending -> unindent ()
| _ -> ()

Post-SMIE Tuareg is actually even worse than this:

tuareg-busted

Here's what I'd expect, using F# and C as examples:

emacs-fsharp emacs-c

drvink avatar Apr 22 '18 08:04 drvink

Here's another example where the indentation is quite strange:

let plus x y = x + y<CR>
             | <-- this is where the cursor ends up

I kind of understand that it's possible the continue the expression in this case, but really it would be quite unlikely.

One option to fix this would be to have tab toggle between the candidates.

Also, I would expect a subsequent newline would definitely reset the indentation level to the last line.

rgrinberg avatar Dec 06 '19 02:12 rgrinberg

It would indeed probably be better that, after a newline, the cursor be positioned at the level of the parent expression (until one type something so that SMIE can decide). @monnier Is a modification of SMIE required or can it be fixed in this repo at reasonable cost?

Chris00 avatar Dec 06 '19 08:12 Chris00

It would indeed probably be better that, after a newline, the cursor be positioned at the level of the parent expression (until one type something so that SMIE can decide).

That's pretty much what SMIE does ;-) The problem is to define what one means by "parent expression".

@monnier Is a modification of SMIE required or can it be fixed in this repo at reasonable cost?

No need to change SMIE. The behavior is currently controlled in tuareg-smie-rules by:

           ;; The default tends to indent much too deep.
           ((eq token 'empty-line-token) ";")))

So maybe you want to use a token which binds even less tightly than ";".

Another option is to add our own function to smie-indent-functions:

...(add-hook 'smie-indent-functions #'tuareg-empty-line-indent nil t)

(defun tuareg-empty-line-indent ()
  (and (looking-at "[ \t]*$")
       (smie-rule-bolp)
       ;; Don't change behavior in strings and comments
       (not (nth 8 (syntax-ppss)))
       ...compute and return the column to indent to...))

-- Stefan

monnier avatar Dec 06 '19 13:12 monnier

@monnier Thanks for the explanations. Modifying ((eq token 'empty-line-token) ";"))) works except at the end of file (where the token 'basic is given to the rules function). Not sure to understand the reason for the difference. Should I add a new function to 'smie-indent-functions to handle this case or is there an alternative?

Chris00 avatar Dec 06 '19 17:12 Chris00

@monnier Thanks for the explanations. Modifying ((eq token 'empty-line-token) ";"))) works except at the end of file (where the token 'basic is given to the rules function). Not sure to understand the reason for the difference. Should I add a new function to 'smie-indent-functions to handle this case or is there an alternative?

I don't see such a difference at EOB, but I do see that the empty-line-token case is almost never used because of tuareg-smie--args. I think we need a patch like the one below.

    Stefan

diff --git a/tuareg.el b/tuareg.el index 8c4833a..717876e 100644 --- a/tuareg.el +++ b/tuareg.el @@ -571,8 +571,8 @@ do not perturb in essential ways the alignment are used. See `(("fun" . ?λ) ("not" . ?¬) ;;("or" . ?∨); should not be used as ||

  • ("[|" . ?〚) ;; 〚
  • ("|]" . ?〛) ;; 〛
  • ("[|" . ?〚)
  • ("|]" . ?〛) ("->" . ?→) (":=" . ?⇐) ("::" . ?∷))) @@ -2290,6 +2290,7 @@ whereas with a nil value you get (unless (or tuareg-indent-align-with-first-arg (nth 8 (syntax-ppss)) (looking-at comment-start-skip)
  •          (looking-at "[ \t]*$")  ;; bug#179
             (numberp (nth 1 (save-excursion (smie-indent-forward-token))))
             (numberp (nth 2 (save-excursion (smie-indent-backward-token)))))
    
    (save-excursion

monnier avatar Dec 06 '19 19:12 monnier

Thanks—spot on. The only remaining problem is comments at the end of a buffer (or before end in module ... end). The rule function is also triggered with :elem, 'basic resulting in the same incorrect indentation.

Chris00 avatar Dec 08 '19 18:12 Chris00

Thanks—spot on. The only remaining problem is comments at the end of a buffer

I thought the "spot on" above meant that the EOB case was indeed not special. So, I have no idea what is wrong when indenting at EOB.

(or before end in module ... end).

Same, here, I don't know what you find to be wrong here.

The rule function is also triggered with :elem, 'basic resulting in the same incorrect indentation.

That's not how it works: the rules function can be called several times per indentation, and that's normal. Please read the docstring of smie-rules-function to see what each call is for. There's no a priori reason to believe that the reason for the wrong indentation is that the rules function was called with (:elem . basic). It may here be a symptom (e.g. because smie-indent-empty-line decided not to do anything, so we fall back on the next handler which happens to call the rules function with (:elem . basic) as part of its work).

    Stefan

monnier avatar Dec 08 '19 18:12 monnier

On 8 December 2019 at 19:45 CET, monnier [email protected] wrote:

Thanks—spot on. The only remaining problem is comments at the end of a buffer

I thought the "spot on" above meant that the EOB case was indeed not special. So, I have no idea what is wrong when indenting at EOB.

It meant that, applying your patch to tuareg-smie--args, fixed the EOB problem. Sorry for expressing unclearly.

(or before end in module ... end).

Same, here, I don't know what you find to be wrong here.

I meant that the comment indents as

module X = struct let x = 1 (* comment *) end

while one would expect it to indent as

module X = struct let x = 1 (* comment *) end

That's not how it works: the rules function can be called several times per indentation, and that's normal. Please read the docstring of smie-rules-function to see what each call is for. There's no a priori reason to believe that the reason for the wrong indentation is that the rules function was called with (:elem . basic). It may here be a symptom (e.g. because smie-indent-empty-line decided not to do anything, so we fall back on the next handler which happens to call the rules function with (:elem . basic) as part of its work).

Thanks. I'll have more learning to do as the relationship between the syntax constructs and the indentation rules is not always clear to me. For the example above, tuareg-smie-rules is indeed called with

kind: :list-intro, token: d= kind: :elem, token: args kind: :elem, token: basic

It seems that comments are not treated in any special way. So I guess I have no choice but to add a function to smie-indent-functions for them.

Chris00 avatar Dec 08 '19 21:12 Chris00