dhall-haskell icon indicating copy to clipboard operation
dhall-haskell copied to clipboard

`dhall format` adds parentheses on toMap annotation that contains `?`

Open sjakobi opened this issue 6 years ago • 5 comments
trafficstars

      *** Failed! Falsifiable (after 3866 tests and 3 shrinks):
      Unicode
      Header ""
      Annot (ToMap TextShow Nothing) (ImportAlt TextShow (BoolLit False))
      "toMap Text/show : Text/show ? False\n" /= "toMap Text/show : (Text/show ? False)\n"
      Use --quickcheck-replay=288124 to reproduce.
$ echo 'toMap {=} : a ? b' | dhall format
toMap {=} : (a ? b)

sjakobi avatar Oct 23 '19 04:10 sjakobi

Hmm, I'm no longer so sure about this one:

The initial Expr is an Annot where the annotation may be any expression

annotated-expression = operator-expression [ whsp ":" whsp1 expression ]

It's correct to format this without parentheses:

toMap Text/show : Text/show ? False

But if we parse it, we get an annotated ToMap, where the type annotation is an application-expression that doesn't include un-parenthesized operator-expressions.

> denote <$> exprFromText "" "toMap Text/show : Text/show ? False\n"
Right 
    ( ToMap TextShow 
        ( Just 
            ( ImportAlt TextShow ( BoolLit False ) )
        )
    )
expression =
  ...
  / toMap whsp1 import-expression whsp ":" whsp1 application-expression

So, I guess the parser is at fault for accepting an unparenthesized ?-expression as a toMap-annotation?

sjakobi avatar Oct 23 '19 15:10 sjakobi

It seems that toMap {=} : a ? b should actually be parsed as (toMap {=} : a) ? b. Is that right?

sjakobi avatar Oct 23 '19 20:10 sjakobi

@sjakobi: The bug here is that toMap a : b ? c should unambiguously parse as Annot (ToMap a Nothing) (ImportAlt b c), but it's being mistakenly parsed as ToMap a (Just (ImportAlt b c)) due to this logic:

https://github.com/dhall-lang/dhall-haskell/blob/8f2c28a5dcfbf5f0233d4c56e498faf831ceecd7/dhall/src/Dhall/Parser/Expression.hs#L254-L255

However, I don't think we need that hack any more. That's an optimization introduced in https://github.com/dhall-lang/dhall-haskell/pull/606/files from back when we had to fix a major inefficiency bug in the parser. The issue was that when we parsed the beginning of a list literal we weren't sure if it was a List or an Optional literal, so the parser would tentatively commit to a List literal and then fix it later if it saw an Optional type signature. That hack allowed us to left-factor the parser to commit earlier, which greatly improved parsing performance. Now that list literals are unambiguously Lists it's not necessary.

Also, (toMap a : b) ? c would not be a valid parse, because the left argument to ? must be an or-expression which cannot parse toMap a : b without parentheses. Any expression with a type annotation is necessarily an expression according to the grammar.

Gabriella439 avatar Oct 24 '19 04:10 Gabriella439

Since I'm working on related code in https://github.com/dhall-lang/dhall-haskell/pull/1954, I've also had another look at this issue.

Since the core of the problem seems to be that we parse an expression for the annotation instead of just an annotationExpression, I thought I could fix the issue somewhat like this:

diff --git a/dhall/src/Dhall/Parser/Expression.hs b/dhall/src/Dhall/Parser/Expression.hs
index dbf7536f..f843b93d 100644
--- a/dhall/src/Dhall/Parser/Expression.hs
+++ b/dhall/src/Dhall/Parser/Expression.hs
@@ -281,15 +281,23 @@ parsers embedded = Parsers {..}
                     let alternative5B1 = do
                             _colon
                             nonemptyWhitespace
-                            b <- expression
                             case (shallowDenote a, a0Info) of
-                                (ListLit Nothing [], _) ->
+                                (ListLit Nothing [], _) -> do
+                                    b <- expression
+
                                     return (ListLit (Just b) [])
-                                (Merge c d Nothing, NakedMergeOrSomeOrToMap) ->
+                                (Merge c d Nothing, NakedMergeOrSomeOrToMap) -> do
+                                    b <- applicationExpression
+
                                     return (Merge c d (Just b))
-                                (ToMap c Nothing, NakedMergeOrSomeOrToMap) ->
+                                (ToMap c Nothing, NakedMergeOrSomeOrToMap) -> do
+                                    b <- applicationExpression
+
                                     return (ToMap c (Just b))
-                                _ -> return (Annot a b)
+                                _ -> do
+                                    b <- expression
+
+                                    return (Annot a b)
 
                     alternative5B0 <|> alternative5B1 <|> pure a
 

This passes the testsuite, but completely breaks parsing of the expression from this issue:

$ echo 'toMap x : y ? z' | cabal run -- dhall:dhall haskell-syntax-tree
(input):1:13:
  |
1 | toMap x : y ? z
  |             ^
unexpected '?'
expecting end of input or whitespace

I tried to sprinkle some trys in there but that didn't change the result…

sjakobi avatar Jul 30 '20 01:07 sjakobi

@sjakobi: A small hack that might fix some cases is to do a lookahead parse for the opening [ of a list literal. If that lookahead parse fails then you can confidently parse the next expression as an applicationExpression. This won't fix all cases, but it might help some of them.

Gabriella439 avatar Jul 30 '20 01:07 Gabriella439