elm-format: Mixing spaces and tabs results in a confusing error message

E.g., given the following snippet of code:


import Element
import Html


type Book
    = Book String


main : Html a
main =
    Element.layout [] (Element.wrappedRow [] [])

... if there is a tab before "Element.layout" and spaces before "= Book String", you get this error message:

-- SYNTAX PROBLEM ------------------------------------------------------

I ran into something unexpected when parsing your code!

13│ 	Element.layout [] (Element.wrappedRow [] [])
    ^
I am looking for one of the following things:

    an expression
    whitespace

Not only is it incorrect (a tab is whitespace!), it could also be much clearer. For example, it could say something like this:

-- SYNTAX PROBLEM ------------------------------------------------------ 

I ran into something unexpected when parsing your code!

13│ 	Element.layout [] (Element.wrappedRow [] [])
    ^
I found a tab here. Elm is strict about using spaces, sorry.
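
To make the proposal concrete, here is a minimal sketch of the kind of pre-check that could produce such a message. It is written in Python purely for illustration and is not how elm-format (or the Elm compiler) works; the function names and the exact message layout are hypothetical. It flags the first tab anywhere in the source and does not try to tell apart tabs inside string literals or comments.

```python
from typing import Optional, Tuple


def find_first_tab(source: str) -> Optional[Tuple[int, int]]:
    """Return the 1-based (row, column) of the first tab character, or None."""
    for row, line in enumerate(source.splitlines(), start=1):
        col = line.find("\t")
        if col != -1:
            return row, col + 1
    return None


def tab_error(source: str) -> Optional[str]:
    """Build a message in the spirit of the proposal (wording is hypothetical)."""
    location = find_first_tab(source)
    if location is None:
        return None  # no tabs, nothing to report
    row, col = location
    line = source.splitlines()[row - 1]
    return (
        f"{row}| {line}\n"
        f"I found a tab here (line {row}, column {col}). "
        "Elm is strict about using spaces, sorry."
    )
```

Running the check before the real parser would let the tool report the tab directly instead of the generic "an expression / whitespace" hint.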

Feb 08 '20 22:02 objarni

I wouldn't mind helping out fixing this issue, if you give me some pointers on how to do things like adding tests, running them, and such :)

Feb 08 '20 22:02 objarni

What I'm currently expecting for the future is that elm-format will not try to report errors and instead folks should rely on the Elm compiler itself to report syntax errors.

However, I'd be fine accepting a change to this error message if making the change ends up being straightforward and doesn't need to significantly change the parser code. One complication is that elm-format is currently using code forked from Elm 0.16, and I do want to upgrade it to Elm 0.19's faster parser implementation, so potentially any changes to the errors might have to be re-done once the parser upgrade is completed.

FYI, there is also an older issue about potentially converting tabs to spaces: https://github.com/avh4/elm-format/issues/185

If you do want to try working on this, I think the place to put a test for this would be in the tests/test-files/bad folder (follow the pattern of the existing files there). The bottom of the README shows how to run the tests.

Feb 09 '20 04:02 avh4

Oh, I see that dilemma... So elm-format is waiting for a general fix from elm compiler, parse that error as "tabs used" and then convert tabs to spaces? Or how would this be designed?

It almost sounds like the elm compiler, and surrounding tooling like elm-format, would get value from an 'elm parser lib' that can be used by both compiler and elm-format and others? Is there already such a lib? How is elm-format parsing .elm files today?

Many questions - thanks for a great tool! elm-format (on Save) is one of those great things about the whole Elm "experience" that makes it so nice to develop in :)

Feb 09 '20 09:02 objarni

> So elm-format is waiting for a general fix from elm compiler, parse that error as "tabs used" and then convert tabs to spaces?

I believe Elm itself is pretty firmly decided on "no tabs" -- though I think the error message might be a bit better in Elm 0.19. But I think auto-fixing the tabs would be nice for elm-format to do if it's possible to do in a way that won't mess things up in common cases.

> Or how would this be designed?

IMO it would start with collecting info about folks who run into this issue with tabs and seeing whether there's an interpretation of tabs that would just make things work for most of them (in which case we could try implementing that). If adding a fix would make it better for some and worse for others, then there might not be a change worth implementing.

Separately, indent levels only matter for case and let expressions. So if elm-format were made more lenient in correcting for mismatched indentation, I expect that would make it easier to define a tabs interpretation that would work for most scenarios. (See https://github.com/avh4/elm-format/issues/173, https://github.com/avh4/elm-format/issues/663)
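
As an illustration of what one "interpretation of tabs" could look like, the sketch below rewrites only the leading whitespace of each line, counting every tab as a fixed number of spaces. It is written in Python for brevity and is not elm-format code; the function name and the default width of 4 are assumptions. Whether this interpretation is the right one is exactly the open question above. Note that it would also rewrite leading tabs inside multi-line string literals.

```python
def expand_leading_tabs(source: str, tab_width: int = 4) -> str:
    """Replace tabs in each line's leading whitespace with spaces.

    Assumption: every leading tab counts as `tab_width` spaces
    (no alignment to tab stops). Tabs after the first non-blank
    character are left untouched.
    """
    out = []
    for line in source.splitlines(keepends=True):
        stripped = line.lstrip(" \t")
        indent = line[: len(line) - len(stripped)]
        out.append(indent.replace("\t", " " * tab_width) + stripped)
    return "".join(out)
```

For the snippet from the original report, this would turn the tab before "Element.layout" into four spaces, which matches the indentation elm-format itself produces for a function body.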

> would get value from an 'elm parser lib' that can be used by both compiler and elm-format and others?

I think elm-compiler is not interested in a shared library because its goal is to be as fast as possible. elm-format requires the parser to provide information about comments and line breaks, which adds a bunch of stuff to the parser that slows it down. For things outside of elm-compiler, I think a shared lib could make sense, but there hasn't been a need for that yet. (Notably, the IntelliJ plugin has its own parser because it needs to be written in (I think) Java, and the elm-language-server is using the Elm tree-sitter parser because of its fast incremental reparsing.)

Feb 09 '20 17:02 avh4

Thank you for your thorough answer!

Feb 09 '20 21:02 objarni

> (Notably, the IntelliJ plugin has its own parser because it needs to be written in (I think) Java, and the elm-language-server is using the Elm tree-sitter parser because of its fast incremental reparsing.)

I believe the parser for IntelliJ is in Kotlin (which, like Java, runs on the JVM). And while tree-sitter can do incremental reparsing, we're not really using that in the server (yet).

Feb 09 '20 22:02 razzeee