dhall-haskell
dhall-haskell copied to clipboard
`dhall format` adds parentheses on toMap annotation that contains `?`
*** Failed! Falsifiable (after 3866 tests and 3 shrinks):
Unicode
Header ""
Annot (ToMap TextShow Nothing) (ImportAlt TextShow (BoolLit False))
"toMap Text/show : Text/show ? False\n" /= "toMap Text/show : (Text/show ? False)\n"
Use --quickcheck-replay=288124 to reproduce.
$ echo 'toMap {=} : a ? b' | dhall format
toMap {=} : (a ? b)
Hmm, I'm no longer so sure about this one:
The initial Expr is an Annot where the annotation may be any expression
annotated-expression = operator-expression [ whsp ":" whsp1 expression ]
It's correct to format this without parentheses:
toMap Text/show : Text/show ? False
But if we parse it, we get an annotated ToMap, where the type annotation is an application-expression that doesn't include un-parenthesized operator-expressions.
> denote <$> exprFromText "" "toMap Text/show : Text/show ? False\n"
Right
( ToMap TextShow
( Just
( ImportAlt TextShow ( BoolLit False ) )
)
)
expression =
...
/ toMap whsp1 import-expression whsp ":" whsp1 application-expression
So, I guess the parser is at fault for accepting an unparenthesized ?-expression as a toMap-annotation?
It seems that toMap {=} : a ? b should actually be parsed as (toMap {=} : a) ? b. Is that right?
@sjakobi: The bug here is that toMap a : b ? c should unambiguously parse as Annot (ToMap a Nothing) (ImportAlt b c), but it's being mistakenly parsed as ToMap a (Just (ImportAlt b c)) due to this logic:
https://github.com/dhall-lang/dhall-haskell/blob/8f2c28a5dcfbf5f0233d4c56e498faf831ceecd7/dhall/src/Dhall/Parser/Expression.hs#L254-L255
However, I don't think we need that hack any more. That's an optimization introduced in https://github.com/dhall-lang/dhall-haskell/pull/606/files from back when we had to fix a major inefficiency bug in the parser. The issue was that when we parsed the beginning of a list literal we weren't sure if it was a List or an Optional literal, so the parser would tentatively commit to a List literal and then fix it later if it saw an Optional type signature. That hack allowed us to left-factor the parser to commit earlier, which greatly improved parsing performance. Now that list literals are unambiguously Lists it's not necessary.
Also, (toMap a : b) ? c would not be a valid parse, because the left argument to ? must be an or-expression which cannot parse toMap a : b without parentheses. Any expression with a type annotation is necessarily an expression according to the grammar.
Since I'm working on related code in https://github.com/dhall-lang/dhall-haskell/pull/1954, I've also had another look at this issue.
Since the core of the problem seems to be that we parse an expression for the annotation instead of just an annotationExpression, I thought I could fix the issue somewhat like this:
diff --git a/dhall/src/Dhall/Parser/Expression.hs b/dhall/src/Dhall/Parser/Expression.hs
index dbf7536f..f843b93d 100644
--- a/dhall/src/Dhall/Parser/Expression.hs
+++ b/dhall/src/Dhall/Parser/Expression.hs
@@ -281,15 +281,23 @@ parsers embedded = Parsers {..}
let alternative5B1 = do
_colon
nonemptyWhitespace
- b <- expression
case (shallowDenote a, a0Info) of
- (ListLit Nothing [], _) ->
+ (ListLit Nothing [], _) -> do
+ b <- expression
+
return (ListLit (Just b) [])
- (Merge c d Nothing, NakedMergeOrSomeOrToMap) ->
+ (Merge c d Nothing, NakedMergeOrSomeOrToMap) -> do
+ b <- applicationExpression
+
return (Merge c d (Just b))
- (ToMap c Nothing, NakedMergeOrSomeOrToMap) ->
+ (ToMap c Nothing, NakedMergeOrSomeOrToMap) -> do
+ b <- applicationExpression
+
return (ToMap c (Just b))
- _ -> return (Annot a b)
+ _ -> do
+ b <- expression
+
+ return (Annot a b)
alternative5B0 <|> alternative5B1 <|> pure a
This passes the testsuite, but completely breaks parsing of the expression from this issue:
$ echo 'toMap x : y ? z' | cabal run -- dhall:dhall haskell-syntax-tree
(input):1:13:
|
1 | toMap x : y ? z
| ^
unexpected '?'
expecting end of input or whitespace
I tried to sprinkle some trys in there but that didn't change the result…
@sjakobi: A small hack that might fix some cases is to do a lookahead parse for the opening [ of a list literal. If that lookahead parse fails then you can confidently parse the next expression as an applicationExpression. This won't fix all cases, but it might help some of them.