elm-format
elm-format copied to clipboard
Nesting |> andThen consumes too much indent
Today I first tried elm-format and shocked with this result. It consumes 12 spaces per one chain.
- model.floor
- |> Maybe.map EditingFloor.present
- |> Maybe.andThen (\floor -> List.head (selectedObjects model)
- |> Maybe.andThen (\primarySelected ->
- ObjectsOperation.nearest direction primarySelected (Floor.objects floor)
- |> Maybe.map (\object ->
- { model |
- selectedObjects =
- List.map Object.idOf [object]
- }
- )))
- |> Maybe.withDefault model
+ model.floor
+ |> Maybe.map EditingFloor.present
+ |> Maybe.andThen
+ (\floor ->
+ List.head (selectedObjects model)
+ |> Maybe.andThen
+ (\primarySelected ->
+ ObjectsOperation.nearest direction primarySelected (Floor.objects floor)
+ |> Maybe.map
+ (\object ->
+ { model
+ | selectedObjects =
+ List.map Object.idOf [ object ]
+ }
+ )
+ )
+ )
+ |> Maybe.withDefault model
In Haskell, this may be flattened by do
notation. So I don't think the former style is too strange.
Another alternative would be the following style.
+ model.floor
+ |> Maybe.map EditingFloor.present
+ |> Maybe.andThen (\floor ->
+ List.head (selectedObjects model)
+ |> Maybe.andThen (\primarySelected ->
+ ObjectsOperation.nearest direction primarySelected (Floor.objects floor)
+ |> Maybe.map (\object ->
+ { model
+ | selectedObjects =
+ List.map Object.idOf [ object ]
+ }
+ )
+ )
+ )
+ |> Maybe.withDefault model
This is 8 spaces per one chain.
And this is 4 spaces.
+ model.floor
+ |> Maybe.map EditingFloor.present
+ |> Maybe.andThen (\floor -> List.head (selectedObjects model)
+ |> Maybe.andThen (\primarySelected -> ObjectsOperation.nearest direction primarySelected (Floor.objects floor)
+ |> Maybe.map (\object ->
+ { model
+ | selectedObjects =
+ List.map Object.idOf [ object ]
+ }
+ )
+ )
+ )
+ |> Maybe.withDefault model
I found this was once discussed before. No progress on this issue since then? I don't have clear idea but wanted to bring it up again before public release. Is it possibly true that elm-lang/core will be formatted in the future? It has similar code too.
I'm facing a similar problem.
It's hard to spot errors where Json.Decode.mapX
has its arguments in the wrong order, so we are using the next best thing that Elm has to a do
notation, which would look something like
decodeEngineResult : Decoder Rules.EngineResult
decodeEngineResult =
Decode.field "consideredCount" Decode.int &> \consideredCount ->
Decode.field "evaluationCount" Decode.int &> \evaluationCount ->
Decode.field "matchedCount" Decode.int &> \matchedCount ->
Decode.field "time" Decode.string &> \time ->
Decode.succeed
{ consideredCount = consideredCount
, evaluationCount = evaluationCount
, matchedCount = matchedCount
, time = time
}
However, elm-format formats the above to:
decodeEngineResult : Decoder Rules.EngineResult
decodeEngineResult =
Decode.field "consideredCount" Decode.int
&> \consideredCount ->
Decode.field "evaluationCount" Decode.int
&> \evaluationCount ->
Decode.field "matchedCount" Decode.int
&> \matchedCount ->
Decode.field "time" Decode.string
&> \time ->
Decode.succeed
{ consideredCount = consideredCount
, evaluationCount = evaluationCount
, matchedCount = matchedCount
, time = time
}
Which is a lot more difficult to read.
BTW, thank you for elm-format. Despite its little shortcomings it's a fantastic tool, I just don't write Elm without it. Please let me know if I can help.
I agree that I think we should find a better way to format this.
However, in @xarvh-at-stax's example, it can be cleaned up using NoRedInk/elm-decode-pipeline:
import Json.Decode.Pipeline exposing (decode, required)
decodeEngineResult : Decoder EngineResult
decodeEngineResult =
decode EngineResult
|> required "consideredCount" Decode.int
|> required "evaluationCount" Decode.int
|> required "matchedCount" Decode.int
|> required "time" Decode.string
Is there an advantage to preferring the lambdas in that case?
@avh4 the advantage is that it's harder to swap arguments by mistake.
The code below is perfectly valid, looks good, but it's wrong.
import Json.Decode.Pipeline exposing (decode, required)
decodeEngineResult : Decoder EngineResult
decodeEngineResult =
decode EngineResult
|> required "evaluationCount" Decode.int
|> required "consideredCount" Decode.int
|> required "matchedCount" Decode.int
|> required "time" Decode.string
The first two fields are swapped, but it's really hard to tell.
Worse, the same problem will happen if we reorder the attributes in the definition of EngineResult
.
(for reference) Here's a version of the original example that actually compiles:
type alias Model =
{ floor : Maybe Floor
, selectedObjects : List Id
}
type Floor
= Floor
type Object
= Object
type Id
= Id
type Direction
= Direction
editingFloor_present : Floor -> Floor
editingFloor_present a =
a
selectedObjects : Model -> List Object
selectedObjects _ =
[]
floor_objects : Floor -> List Object
floor_objects _ =
[]
object_idOf : Object -> Id
object_idOf _ =
Id
objectsOperation_nearest : Direction -> Object -> List Object -> Maybe Object
objectsOperation_nearest _ _ _ =
Nothing
x : Direction -> Model -> Model
x direction model =
model.floor
|> Maybe.map editingFloor_present
|> Maybe.andThen
(\floor ->
List.head (selectedObjects model)
|> Maybe.andThen
(\primarySelected ->
objectsOperation_nearest direction primarySelected (floor_objects floor)
|> Maybe.map
(\object ->
{ model
| selectedObjects =
List.map object_idOf [ object ]
}
)
)
)
|> Maybe.withDefault model
Here's a possible alternative to the originally-posted code:
x : Direction -> Model -> Model
x direction model =
let
primarySelected =
selectedObjects model
|> List.head
floorObjects =
model.floor
|> Maybe.map editingFloor_present
|> Maybe.map floor_objects
|> Maybe.withDefault []
nearestFloorObject primarySelected =
objectsOperation_nearest direction primarySelected floorObjects
in
{ model
| selectedObjects =
primarySelected
|> Maybe.andThen nearestFloorObject
|> Maybe.map object_idOf
|> Maybe.map List.singleton
|> Maybe.withDefault model.selectedObjects
}
This refactoring makes sense a lot. Thank you :)
One point I care about is that the benefit of Html.Lazy.lazy
decreases because re-assigning record property (selectedObjects
) will lose the original record's reference.
Wouldn't this be better?
a
|> \x -> x
|> \y -> y
|> \z -> z
I have a pipe with a single small lamba in the middle of it, and it increases the indentation for no reason, what makes the pipe harder to read.
I just encountered that the experimental version (0.7.0) formats the following code
checkPattern Unpack preTy >>= \preEnv ->
checkPattern Unpack postTy >>= \postEnv ->
checkFlow (Env.merge preEnv env) body >>= \bodyEnv ->
checkSubset postEnv.values bodyEnv.values
like this
checkPattern Unpack preTy
>>= (\preEnv ->
checkPattern Unpack postTy
>>= (\postEnv ->
checkFlow (Env.merge preEnv env) body
>>= (\bodyEnv ->
checkSubset postEnv.values bodyEnv.values
)
)
)
(Yes, I use bind sometimes :wink:)
I can live with the indentation, because I think it makes scoping explicit, but I'm not sure about two things:
- Why are lines after a bind indented two levels?
- Do we need to add parens for the lambda?
Btw: when using andThen
the parens are needed and the formatting uses one indent level at a time.
After fixing the double indentation (1):
checkPattern Unpack preTy
>>= (\preEnv ->
checkPattern Unpack postTy
>>= (\postEnv ->
checkFlow (Env.merge preEnv env) body
>>= (\bodyEnv ->
checkSubset postEnv.values bodyEnv.values
)
)
)
and removing parens (2):
checkPattern Unpack preTy
>>= \preEnv ->
checkPattern Unpack postTy
>>= \postEnv ->
checkFlow (Env.merge preEnv env) body
>>= \bodyEnv ->
checkSubset postEnv.values bodyEnv.values
What do you think? I prefer the parens-less version, it looks less cluttered and they are actually redundant. But removing the double indentation already improves a lot!
Edit: the code looks even better when we remove the indentation before the bind operator. You can perfectly see where each lambda variable is bound! This formatting obviously has consequences for a lot of other operators, like (|>)
though...
checkPattern Unpack preTy
>>= \preEnv ->
checkPattern Unpack postTy
>>= \postEnv ->
checkFlow (Env.merge preEnv env) body
>>= \bodyEnv ->
checkSubset postEnv.values bodyEnv.values
Has there been any more thought on this?
I think it's a major bug with elm-format.
This is ugly, but at least is readable and maintainable:
mapDecoder : Decoder Map
mapDecoder =
(field "name" <| Decode.string) |> Decode.andThen (\name ->
(field "author" <| Decode.string) |> Decode.andThen (\author ->
(field "mainBases" <| listOfTilesDecoder) |> Decode.andThen (\mainBases ->
(field "smallBases" <| listOfTilesDecoder) |> Decode.andThen (\smallBases ->
(field "wallTiles" <| listOfTilesDecoder) |> Decode.andThen (\wallTiles ->
Decode.succeed
{ name = name
, author = author
, mainBases = mainBases
, smallBases = smallBases
, wallTiles = wallTiles
}
)))))
But the elm-format output is:
(field "name" <| Decode.string)
|> Decode.andThen
(\name ->
(field "author" <| Decode.string)
|> Decode.andThen
(\author ->
(field "mainBases" <| listOfTilesDecoder)
|> Decode.andThen
(\mainBases ->
(field "smallBases" <| listOfTilesDecoder)
|> Decode.andThen
(\smallBases ->
(field "wallTiles" <| listOfTilesDecoder)
|> Decode.andThen
(\wallTiles ->
Decode.succeed
{ name = name
, author = author
, mainBases = mainBases
, smallBases = smallBases
, wallTiles = wallTiles
}
)
)
)
)
)
Which is not maintainable by a human.
From what I'm currently seeing in the community, I think the current recommended practice for the last example would be:
mapDecoder : Decoder Map
mapDecoder =
Decode.map5 Map
(field "name" Decode.string)
(field "author" Decode.string)
(field "mainBases" listOfTilesDecoder)
(field "smallBases" listOfTilesDecoder)
(field "wallTiles" listOfTilesDecoder)
Though that still has the issue ghost mentioned https://github.com/avh4/elm-format/issues/352#issuecomment-298824192
Yup. I'd really like to avoid argument swaps, especially since it can happen by just reordering the attributes in the record.
For completeness, there are multiple ways to help avoid mixing up the attributes:
- add nesting to the code, as shown previously in this thread
- define types that prevent mixing up:
type MainBases = MainBases (List Tile)
,(field "mainBases" <| Decode.map MainBases listofTilesDecoder)
- divide the code so there's less room for error:
(field "tiles" tilesDecoder)
(so map.tiles contains mainBases, smallBases, wallTiles)
In what cases should nesting be preferred over other alternatives? I think defining more types is the traditional functional-programming solution to this problem--is it common for people to do that in Elm?
Here are some points to add to the discussion:
- Since the library came out, I have heard about the concern that people can swap arguments in decode pipeline and cause bugs.
- Nearly everyone I talk to who does any JSON decoding uses that library to do it.
- Even with so many people using the library, in practice I do not hear people complaining that they have the pain point where they swapped the arguments and it caused bugs for them.
I understand that this can happen, but I don't think it's fair to say that it's a "major bug with elm-format" that elm-format
doesn't accommodate a technique that would let you avoid using one of the most popular libraries in the Elm ecosystem, out of concern for a problem that isn't reported to be a significant problem in practice.
Fair enough. =)
I do not hear people complaining that they have the pain point concern for a problem that isn't reported to be a significant problem in practice
There are people expressing that pain point in this very thread. Please help keep the conversation here productive by not invalidating other people's direct experiences.
Argh, I'm sorry! I definitely do not want to invalidate anyone's experience.
My response was based my understanding that this is a hypothetical concern rather than one based on direct experience. I read "the advantage is that it's harder to swap arguments by mistake" as distinct from "I have tried it that way and personally experienced the pain of swapping arguments by mistake."
In retrospect, a better way to convey my point might have been to ask for clarification, e.g. "have you tried using decode pipeline to see if this concern materializes for you in practice?"
I should have stated this more clearly myself: it happened a couple of times in production.
Wow, mea maxima culpa! Sorry for assuming incorrectly.
I think I also missed an opportunity to ask a pertinent question to the thread:
Given that Elm's design encourages "one way to do it" (and one of elm-format
's goals is to prevent team arguments over personal preferences), is there a case to be made that there is "one best way to do it" for JSON decoding, or is it better to provide good support for multiple alternative approaches?
Possibilities that occur to me:
-
mapN
and decode pipeline should not be used. It's better to use a nestedandThen
style because it is more resilient to ordering mistakes. If this is true, I see a clear case thatelm-format
should change for the sake of theandThen
style. -
mapN
and decode pipeline should be used, and they are better than using a nestedandThen
style because they are significantly more concise but insignificantly more error-prone. If this is true, I don't see a clear case thatelm-format
should change for the sake of theandThen
style. - All of these styles are fine. If this is true, is one style significantly better than the other for certain projects? (If so, which projects are better suited to which style?) If not, does which style to use come down to personal preference? (If it's mostly personal preference, what happens when some team members prefer one style and others on the team prefer the other style? Preventing team arguments over mutually exclusive personal preferences has been a longstanding design goal of
elm-format
.)
I think it's beyond the scope of elm-format to decide what the "one way to do it" should be for the elm community. elm-format's goal is to be the best tool given the community's current practices. Debating which way people should do it should be done elsewhere (discourse or slack); For the purposes of this issue, this is the current summary as I understand it:
- It's clear that using nested lambdas with
|>
deeper than 2 levels is a mess with the current formatting - How commonly do Elm users need nested lambdas with
|>
deeper than 2 levels?- original example of using maybes with complicated logic
- OP liked the proposed refactoring better
- JSON decoders
- trivial use of andThen can be replaced with mapN or Json.Decode.Pipeline
- more complicated uses still have this problem
- other examples? might happen with other types having
mapN
andandThen
functions (Result, List, Random, UrlParser)
- original example of using maybes with complicated logic
Open questions for me:
- over the long-term, how commonly does nested lambdas with
|>
deeper than 2 levels stick around? it is common enough to warrant explicit handling in elm-format? - is there agreement in the elm-community about the best way to handle nested lambdas with
|>
deeper than 2 levels?- if yes, elm-format should support that
- if no, elm-format should avoid pushing an arbitrary opinion
- is there even a way to format nested lambdas with
|>
deeper than 2 levels in a nice way that is also consistent with the rest of the elm-format structural primitives?- if not, is this problem big enough that it warrants introducing a new structural primitive for this one case?
I would like to make a small point: if a particular construct becomes easier to use, people will use it. If nested lambda are a pain because of how elm-format indents them, people will get used to the alternative instead.
Elm has already a solution that I think is very good:
(&>) = flip Decode.andThen
mapDecoder : Decoder Map
mapDecoder =
field "name" Decode.string &> \name ->
field "author" Decode.string &> \author ->
field "halfWidth" Decode.int &> \halfWidth ->
field "halfHeight" Decode.int &> \halfHeight ->
field "mainBases" setOfTilesDecoder &> \mainBases ->
field "smallBases" setOfTilesDecoder &> \smallBases ->
field "wallTiles" setOfTilesDecoder &> \wallTiles ->
Decode.succeed
{ name = name
, author = author
, halfWidth = halfWidth
, halfHeight = halfHeight
, mainBases = mainBases
, smallBases = smallBases
, wallTiles = wallTiles
}
It does not require any new syntax and IMHO is the best solution available short of automated decoders.
The two problems are that 1) this solution is currently not compatible with elm-format
and I have no clue how difficult it would be to implement and that 2) in the future we'll need (&>)
to be exposed by Json.Decode
.
A detailed write-up of the impact of the current formatting was added here: https://github.com/avh4/elm-format/issues/734#issuecomment-798966212
This tool we developed at Webbhuset removes extra newlines and indents:
https://gist.github.com/xarvh/1b65cb00e7240f1ccfa0bdbf30f97c62
ie from
eval (App e1 e2) =
State.do (eval e1) <|
\f1 ->
State.do (eval e2) <|
\f2 ->
State.pure (App f1 f2)
to
eval (App e1 e2) =
State.do (eval e1) <| \f1 ->
State.do (eval e2) <| \f2 ->
State.pure (App f1 f2)
It's battle tested and fairly reliable.
This was important for us because we have a very large code base that requires a lot of do-notation. While this is a hack and the clean solution would be to modify/fork elm-format, this does what we need and we can finally run elm-format on all our codebase rather than "only if the module does not use do-notation, and only if I remember, and only if it won't turn all diff history upside down"...