plutus icon indicating copy to clipboard operation
plutus copied to clipboard

Conform uplc parsing to spec

Open carloskiki opened this issue 2 months ago • 7 comments

From the specification:

Note. At the time of writing the syntax accepted by IOG’s parser for textual Plutus Core differs slightly from that above in that subobjects of Constr, Map and List objects must not be parenthesised: for example one must write (con data (Constr 1 [I 2, B #,Map []]). This discrepancy will be resolved in the near future.

Is it possible to address this shortcoming?

carloskiki avatar Oct 12 '25 19:10 carloskiki

At least for cases where we have a primitive list of data, for example: https://github.com/IntersectMBO/plutus/blob/b12c894833cae725bab11577c845701aeb05dc97/plutus-conformance/test-cases/uplc/evaluation/builtin/semantics/listData/listData.uplc#L2

I would expect each data instance in the list to be wrapped in parens. I can understand if there are no parens in the case for Data's List or Map variants (because we are already parsing a Data constant), but for a primitive list having data not wrapped in parens is inconsistent.

carloskiki avatar Oct 12 '25 19:10 carloskiki

We use Haskell-like syntax for constants and I don't see any reason to be wrapping those subjects into parens.

Would you expect a list of integers to require parens like this: (con (list integer) [(1), (2), (3)]?

I don't mind supporting parens there though, why not. Except that somehow seems to make GHC loop lol:

diff --git plutus-core/plutus-core/src/PlutusCore/Parser/Builtin.hs plutus-core/plutus-core/src/PlutusCore/Parser/Builtin.hs
index a35178aaa..b68f94e0f 100644
--- plutus-core/plutus-core/src/PlutusCore/Parser/Builtin.hs
+++ plutus-core/plutus-core/src/PlutusCore/Parser/Builtin.hs
@@ -24,7 +24,7 @@ import Data.Text qualified as T
 import Data.Text.Internal.Read (hexDigitToInt)
 import Data.Vector.Strict (Vector)
 import Data.Vector.Strict qualified as Vector
-import Text.Megaparsec (customFailure, getSourcePos, takeWhileP)
+import Text.Megaparsec (customFailure, getSourcePos, takeWhileP, try)
 import Text.Megaparsec.Char (char, hexDigitChar, string)
 import Text.Megaparsec.Char.Lexer qualified as Lex
 
@@ -144,7 +144,7 @@ conBLS12_381_G2_Element = do
 
 -- | Parser for constants of the given type.
 constantOf :: ExpectParens -> DefaultUni (Esc a) -> Parser a
-constantOf expectParens uni =
+constantOf expectParens uni = try (constantOf ExpectParensYes uni) <|>
   case uni of
     DefaultUniInteger                                                 -> conInteger
     DefaultUniByteString                                              -> conBS

I'll ask the team.

effectfully avatar Oct 17 '25 04:10 effectfully

Eek, now that I looked at that diff again, I realize it's not sufficient in the general case, although it's probably sufficient for what you're asking for. I'll try something else.

effectfully avatar Oct 17 '25 04:10 effectfully

I am strictly looking at data inside of constants.

My claim is that since (con data (I 0)) requires parentheses around the value (I 0), it should do the same inside of lists and pairs. That would not be for all types, only for data, since only data requires parentheses around it.

I would expect either row to be the syntax:

data list data pair data data
No Parens (con data I 0) (con (list data) [I 0, B #1234]) (con (pair data data) (I 0, B #1234))
Parens (con data (I 0)) (con (list data) [(I 0), (B #1234)]) (con (pair data data) ((I 0), (B #1234)))

Currently, Parens are required for standalone data constants, but they are forbidden for data inside of lists and arrays. I understand that this is not wanted (feel free to close the issue), but I was inquiring since the spec states that this should be fixed eventually.

carloskiki avatar Oct 17 '25 14:10 carloskiki

We worried about this when we originally added concrete syntax for data objects (almost two years ago now!) but never got round to fixing it.

kwxm avatar Oct 18 '25 14:10 kwxm

Thanks, I will close this issue for now then.

carloskiki avatar Oct 18 '25 19:10 carloskiki

I mean, I don't mind adding support for that to the parser, here's a PR: #7391

effectfully avatar Oct 19 '25 12:10 effectfully