stacks-core icon indicating copy to clipboard operation
stacks-core copied to clipboard

Formal specification of Clarity for Stacks 2.1

Open obycode opened this issue 3 years ago • 9 comments

This issue is moved from https://github.com/hirosystems/clarinet/issues/465.

With the switch to 2.1, we have the opportunity to make some breaking changes to the language. It would make sense to put together a formal specification of the language, including some breaking changes (e.g. limitations on symbol names like this<is>con/fus!ing*). It would be nice to have a BNF specification (or something else).

#2696 is related - it discusses reserved keywords.

Comment from @fariedt on original issue:

I don't have a BNF, but I came up with this using instaparse a few months ago, and it works for pretty much everything I've thrown at it. I had to strip comments, though.

obycode avatar Feb 17 '22 14:02 obycode

Do we need to change anything in the language to get a formal specification?

I think the question of whether or not we need to change symbol names (which I'm skeptical of (*)) is separate from the question of what, if anything, needs to be changed to make a formal specification possible (or having some as-yet-unspecified properties that currently cannot be had).

(*) I think most arguments relating to "this<is>con/fus!ing*" apply to every new programming language. Language idioms are by definition novel and unfamiliar to new users, and learning them is a cost that every new developer is going to pay one way or another. But changing existing symbol names effectively creates two versions of the language, which I think is definitely a worse outcome than sticking to what we have now. Specifically, this would force every experienced developer to re-learn the language for zero gain, and this would reduce the usefulness all prior documentation and code examples (even on the blockchain) as learning tools. Plus, we have to support both versions forever, since all the current chain state uses "version 1" of the language. For a real-world example of how badly this can go, I submit the Python 2/3 fiasco.

jcnelson avatar Feb 21 '22 02:02 jcnelson

When keywords you think should be reserved sometimes are reserved and sometimes aren't making proper formal specification might be tricky.

To give an example. Is define-public a top level function or not?

Documentation says:

Like other kinds of definition statements, define-public may only be used at the top level of a smart contract definition (i.e., you cannot put a define statement in the middle of a function body).

Most developers will also say that it is a top level function.

And I say that it is, and it isn't and to some extend it can be redefined by developer. All at the same time. https://explorer.stacks.co/txid/0xed2e5c51cdd97c04e2eb9172d99b4f8ab5987cae3bd78f201b1135c226993081?chain=testnet https://explorer.stacks.co/txid/0x14865bb4a4f3a0c7cf7671755c656d95b5f89c834876f3ce84b9faefaf907320?chain=testnet https://explorer.stacks.co/txid/0x65b42053e8691383f8ddb6d57a3945c8205a9a78c80a0f56450522ae11ea0a7d?chain=testnet https://explorer.stacks.co/txid/0x36aef2121d9bb01d6a3befb0d6135dc29f29863aadc24557f8eda2728eb3272d?chain=testnet

This is where language should be more restrictive than it is right now.

LNow avatar Feb 21 '22 18:02 LNow

Can we gather here (or in separate ticket) list of all rules that should apply to Clarity alongside with notes if they are already fully enforced by VM or not? This will help create draft of formal specification and allows us to understand how Clarity works, what doesn't work as it should, what needs to be fixed or redefined to make it future proof (as much as it is possible).

I'm talking about stuff like:

  • valid contracts are encoded in UTF-8 and must have unix line end sequence (LF)
  • name of user defined function can't be longer than 128 characters, must start with upper or lower case ascii letter ([A-Za-z]) and can consist upper or lower case ascii leters ([A-Za-z]) and following characters: -!?+<>=/*
  • function name and function arguments in function signature must be separated with at least one whitespace character
  • arguments in function signature must be separated with at least one whitespace character
  • short tuple syntax require at least one whitespace after colon sign
  • it is required to use at least one whitespace between function name and arguments in every single function call
  • trait defined in one contract can be used in the same contract without needing to use use-trait but it can't be aliased
  • function map accepts sequences of a different length, but it always creates list as an output that is no longer than the shortest sequence passed as input argument, and it process first N elements of each of input sequences where N is equal to the length of the shortest input
  • is-eq accepts 1 or more arguments (not 2 or more as documented)
  • functions is-some, is-none, is-ok, is-err can't be used directly as function in map or fold as they operate on complex types, therefore they must be wrapped with user defined function
  • some function names are reserved, while others aren't
  • some functions can be redefined while others can't, but when we redefine built in function VM becomes unpredictable

CC: @obycode @lgalabru @jcnelson @kantai @shea256 @muneeb-ali

LNow avatar Jun 24 '22 16:06 LNow

The new parser will improve on some of these items already. It should solve all of the whitespace issues for example.

obycode avatar Jun 24 '22 16:06 obycode

The thing is that we don't know what should be improved, especially when it comes to things that are not defined. So if you change something in the parser, how can we tell if you are doing it correctly or not?

I strongly believe that language implementation should be done based on language definition/specification, not the other way around.

LNow avatar Jun 24 '22 16:06 LNow

Yes, definitely agree on that point.

obycode avatar Jun 24 '22 17:06 obycode

I strongly believe that language implementation should be done based on language definition/specification, not the other way around.

Indeed. It will also help to document design decisions and their rationale, and open for a more participatory process that benefits from the experience of the broader community instead of just developers involved in the implementation. Design decisions and rationale should be documented, not buried in pull requests.

njordhov avatar Jun 24 '22 18:06 njordhov

Can we gather here (or in separate ticket) list of all rules that should apply to Clarity alongside with notes

It would be beneficial to not allow whitespace after open parens. It will make formatting of Clarity more regular and conventional.

For example, this rule will eliminate misusing whitespace in a let form to make a binding look like a function call (see issue #3176) as in:

(let (
  (try! (try-me fail))

Edit: see issue https://github.com/stacks-network/stacks-blockchain/issues/3198

njordhov avatar Jun 25 '22 15:06 njordhov

I'm definitely supportive of developing a specification for Clarity. I think there's ultimately a bunch of different components of this -- the syntax, the general rules of evaluation, rules for each function and special functions, and maybe more. Some of these are easier to ensure a thorough specification that others. For example, there's well established practices for specifying a complete language syntax, but there's no well established way to specify the "rules of evaluation" (people do make formal VMs, but they are still essentially procedural rather than declarative).

I think collecting a specification over all these things into a document makes sense, and in the places where the specification differs from the Stacks 2.0 implementation (or the Stacks 2.1 implementation), we could highlight that, and ultimately package the document into a SIP (if it contains changes to 2.0 or 2.1, it'll need an activation section).

kantai avatar Jun 27 '22 15:06 kantai

Temporarily assigning to @obycode so someone owns this. Please feel free to re-assign.

jcnelson avatar Feb 22 '23 03:02 jcnelson

I think this may make sense to convert to a discussion -- but if people like this remaining an issue, feel free to push back!

kantai avatar Feb 22 '23 14:02 kantai