alloy icon indicating copy to clipboard operation
alloy copied to clipboard

Proposal: Alloy syntax conditionals

Open rfratto opened this issue 1 year ago • 10 comments

Background

Conditionals for the Alloy syntax were initially proposed by @tpaschalis on Dec 20, 2022 (#157).

In Alloy's current state, conditionals would mainly reduce the need for a templating engine when deploying Alloy configs statically. While this is an interesting use case, it's difficult to justify over other priorities, as a templating engine can still be used.

However, when considered alongside other proposals like #156, conditionals become a requirement for dynamic transformation and data routing, as acting on streams of data is not something that can be achieved with templating engines. Imagine a hypothetical prometheus.route component that can inspect incoming metrics and route to different components:

prometheus.scrape "default" {
  targets    = ...
  forward_to = [prometheus.route.prod.receiver]
}

// prometheus.route inspects incoming metrics and uses their 
// labels to determine which component to forward them to. 
prometheus.route "prod" {
  forward_func = func(metric) => (
    if metric.namespace == "prod" then [prometheus.remote_write.prod.receiver]
    else if metric.namespace == "qa" then [prometheus.remote_write.qa.receiver]
    else [prometheus.remote_write.dev.receiver] 
    end 
  )
}

prometheus.remote_write "prod" { ... }
prometheus.remote_write "qa" { ... }
prometheus.remote_write "dev" { ... }

Even though conditionals are difficult to prioritize based on current functionality, adding support for conditionals now will allow us to build a framework for more advanced pipelines and set a precedence for how we can explore syntactical experiments using the --stability.level flag.

This proposal supersedes #157.

Please use reactions to vote on this proposal, with :+1: to signify support and :-1: to signify disapproval. When disapproving, a comment would be appreciated so we can discuss and refine the proposal as needed.

Proposal

This proposal has the following goals:

  • Enable conditionals within an Alloy expression. Conditional statements (such as a conditional block) is out of scope.
  • Balance readability and terseness for long chains of conditionals.

Syntax

Following the exploration in #157, the (ENBF) grammar for conditionals is as follows:

CondExpr       = "if" Expression "then" Expression 
                 { CondElseIfExpr } [ CondElseExpr ] "end"
CondElseIfExpr = "else" "if" Expression "then" Expression 
CondElseExpr   = "else" Expression

// CondExpr is a type of PrimaryExpr, giving it one of the lowest precedences.
PrimaryExpr = LiteralValue | ArrayExpr | ObjectExpr | CondExpr  

Here, the tokens if, then, else, and end are selected to align with natural language for how developers communicate conditionals. This makes conditionals slightly more approachable for non-developers compared to more obscure tokens like ? and : and can aid in making conditionals easier to grok for first-time readers.

While the then keyword is more verbose, it is easier to parse correctly as it strictly divides the "if" part of the expression with the value.

Additionally, terminators are not required within the grammar for conditionals; this allows writers to place newlines arbitrarily for readability.

CondExpr is parsed at the same levels as literals. This allows CondExpr to appear anywhere as part of an expression. In some cases this may hurt readability, such as adding the result of two conditionals together. The formatter will be responsible for producing a readable output; see the Style and Formatting section of this proposal for more information.

The proposed grammar would parse all of these expressions:

// Single if 
if CONDITION then VALUE end 

// If/else 
if CONDITION then VALUE else VALUE end 

// If/Else-if chain 
if CONDITION then VALUE 
else if CONDITION then VALUE  
else if CONDITION then VALUE 
end 

// If/Else-if/Else 
if CONDITION then VALUE 
else if CONDITION then VALUE  
else if CONDITION then VALUE 
else VALUE 
end 

// Alternative form for more readibility 
if CONDITION then 
  VALUE 
else if CONDITION then 
  VALUE 
else if CONDITION then 
  VALUE 
else 
  VALUE 
end 

Evaluation

The syntax evaluator will iterate through the conditionals within the expression and return the first value where the condition is true. If no conditions are true, the else value will be returned. If there is no else value, the expression evaluates to null.

If a conditional in the overall expression is not a boolean value, the entire expression will fail to evaluate:

if a == 5 then 
  a * 2
else if 6 then // INVALID: expected 6 to be a boolean, got number 
  a * 3 
end 

Style and formatting

The formatter will mostly retain the conditional as written by the user, including with newlines. However, if a conditional expression is used as part of a more complex expression (binary operation, etc.), the formatter will surround the conditional in parenthesis for readability (if it is not already).

// Before formatting 
if a > 5 then a * 2 else a end + 10 

// After formatting 
(if a > 5 then a * 2 else a end) + 10 

When the formatter surrounds a conditional in parenthesis, if that conditional crosses multiple lines, the surrounding parenthesis are placed on new lines and the inner conditional is indented:

// Before formatting 
if a > 10 then [0, 1, 2]
else if a > 5 then [3, 4, 5]
else [6, 7, 8]
end [0]  

// After formatting 
(
  if a > 10 then [0, 1, 2]
  else if a > 5 then [3, 4, 5]
  else [6, 7, 8]
  end 
)[0] 

These parenthesis will be injected by the formatter if both of the following are true:

  • The conditional expression is not directly inside a parenthetical expression.
  • The conditional expression is part of a larger expression (such as a binary operator, unary operator, object access, etc.).

Delivery

While it is unlikely support for conditionals will be removed, they should not be released immediately as Generally Available to give us time to iterate and explore the syntax, and potentially make breaking changes if needed. The initial delivery of conditionals will be tagged as public preview and will require passing --stability.level=public-preview or --stability.level=experimental for conditional support.

If the stability level is not set properly, errors will be returned in three places:

  • The scanner will return an error when encountering a token specific to conditionals (if, else, then, end).
  • Additionally, the parser will return an error if a conditional-specific token is scanned.
  • Additionally, the syntax evaluator will return an error if a conditional expression found in the AST.

The bottom two conditions are technically unnecessary, but a defensive implementation is useful towards ensuring that users do not get access to a feature which is not GA yet.

The alloy fmt command disables these checks to allow conditionals to be parsed and formatted regardless of whether the Alloy instance supports them.

Alternatives considered

Refer to #157 for alternative approaches that were considered.

Acknowledgements

Credit to @tpaschalis for the work on the original proposal which formed the basis for this one.

rfratto avatar May 10 '24 14:05 rfratto

Are user defined funcs a precursor to conditionals? Granted we can do them separately but I think they would have more limited use unless we defined func beforehand. Or is a better example:

prometheus.remote_write "test" {
  url = if env("ENVIRONMENT") == "QA" then "http://qa" else "http://dev" end,
}

mattdurham avatar May 10 '24 15:05 mattdurham

Are user defined funcs a precursor to conditionals? Granted we can do them separately but I think they would have more limited use unless we defined func beforehand.

I touch on this in the proposal:

Even though conditionals are difficult to prioritize based on current functionality, adding support for conditionals now will allow us to build a framework for more advanced pipelines and set a precedence for how we can explore syntactical experiments using the --stability.level flag.

rfratto avatar May 10 '24 15:05 rfratto

Is nesting conditional statements not allowed? if CONDITION then if CONDITION then VALUE else VALUE end else if VALUE end

wildum avatar May 10 '24 15:05 wildum

Can we change the example to only use functionality that exists today + what you are suggesting? Might give the wrong direction on what this proposal enables.

mattdurham avatar May 10 '24 15:05 mattdurham

I believe it's important to acknowledge how conditions will be fundamental in the future. In total isolation I don't think we can prioritize them. This is a bit of a chicken and egg problem; if we do the more powerful stuff first it's hampered because conditionals don't exist, but conditionals on their own aren't extremely useful without the more powerful capabilities.

rfratto avatar May 10 '24 15:05 rfratto

Is nesting conditional statements not allowed? if CONDITION then if CONDITION then VALUE else VALUE end else if VALUE end

It is valid, a CondExpr is a non-terminal for Expression. I'll make this more clear in the grammar.

rfratto avatar May 10 '24 15:05 rfratto

Do you think we should add an explicit return keyword? Thinking of if we implement variables, it might be helpful and clearer. In general approve of the idea and implementation.

mattdurham avatar May 10 '24 15:05 mattdurham

You can already use custom components as variables. With conditionals this opens the door to many twisted scenarios 😄

wildum avatar May 10 '24 15:05 wildum

@wildum I expanded the grammar above to show the precedence of conditionals, and added a section for formatting to make it clear how the formatter will enhance readability. I'm going to resolve the earlier comments about nesting if expressions now :)

rfratto avatar May 10 '24 15:05 rfratto

Thanks for picking this up again! I'm in favor of the proposal as is it presented here and excited to see it move ahead!

tpaschalis avatar May 14 '24 15:05 tpaschalis

Did you consider a syntax that will more closely match the existing curly-braces style that is already present in Alloy files?

// proposed by you

some.component "foo" {
  attr = if cond1 then
    val1
  else if cond2
    val2
  else
    val3
  end
}

// maybe consider this?

some.component "foo" {
  attr = if (cond1) {
    val1
  } else if (cond2) {
    val2
  } else {
    val3
  }
}

I feel that the former is like a mixture of two languages and has more keywords. Not sure if it's Python/Pascal or JS/C. The latter is more consistent.

Other than that, I think we need this to make Alloy cover some important use cases, so let's do it! Thanks Robert and Paschalis.

thampiotr avatar May 28 '24 15:05 thampiotr

I'm not dramatically opposed to curly brace syntax (though maybe whether the conditional needs to be wrapped in parenthesis is up for more debate).

While conditional blocks aren't in scope for this proposal, I believe they will quickly follow suit after conditional expressions are supported. Curly braces for conditional blocks may feel more natural with the curly brace syntax:

if env("ENVIRON") == "prod" {
  prometheus.remote_write "prod" {
    ...
  }
} 

as opposed to the current syntax:

if env("ENVIRON") == "prod" then
  prometheus.remote_write "prod" {
    ...
  }
end

If conditional blocks should use curly braces, that would influence whether conditional expressions should use curly braces as well.

Overall, I think these are the tradeoffs of the curly brace syntax for expressions:

  • Pros:
    • Shares tokens with block statements and object literals; the only new tokens introduced would be if and else, compared to if, else, then, and end.
    • Can be more familiar to programmers of declarative languages.
    • Leads in more naturally to conditional blocks, where curly braces may feel more natural within the context of an Alloy config.
  • Cons:
    • As mentioned in the original proposal, curly braces may be confused with normal blocks when scanning the file.
    • Similarly to above, curly braces may make config authors think multiple statements can be placed inside the curly braces, but in reality only a single expression is valid.
    • Can be less familiar to non-developers; if a == 5 then "hello" else "goodbye" can read similarly to English, whereas if a == 5 { "hello" } else { "goodbye" } might look more arcane to non-developers.

rfratto avatar May 28 '24 16:05 rfratto

I am very slightly for the verbose then end since I think it helps differentiate it from blocks but its only by a thin margin. I think @bentonam may have some good insight since he has likely written the most code that would benefit from this.

mattdurham avatar May 28 '24 17:05 mattdurham

My opinion is that the Alloy configuration itself is predicated on { and } to clearly define the beginning and end of components and blocks. I think that conditionals should implement this same pattern as well, as a conditional is just a sub-block of code with a clear beginning and end. I also think this would make determining the AST and syntax highlighting simpler, although that is purely a guess.

If the syntax if, else, then, and end are used as strings, would that extend to operators? For example, would and, or, eq, neq, gt, gte, lt, lte, etc., be used? I would venture to guess no, and that symbols would be used instead: &&, ||, ==, !=, >, >=, <, <=.

I also feel that if conditionals are going to be implemented, which I believe they should be, support for a ternary operator should be added at the same time. This would provide a shorthand for what the vast majority of conditionals would be used for anyway.

For example:

condition ? 'true_value' : 'false_value';

In a practical application:

// prometheus.route inspects incoming metrics and uses their 
// labels to determine which component to forward them to. 
prometheus.route "prod" {
  forward_to = [metric.labels.namespace == "prod" ? prometheus.remote_write.prod.receiver : prometheus.remote_write.default.receiver]
}

However, maybe a dynamic reference could be used instead:

// prometheus.route inspects incoming metrics and uses their 
// labels to determine which component to forward them to. 
prometheus.route "prod" {
  forward_to = [prometheus.remote_write[metric.labels.namespace].receiver]
}

Additionally, I could see a need for case statements as part of conditionals. This would allow for more complex routing and decision-making within the configuration.

bentonam avatar May 28 '24 18:05 bentonam

I mocked up the proposed grammar for conditionals using curly braces (without any parenthesis around the conditional statement but this message stands either way):

CondExpr       = "if" CondBody CondValue { CondElseIfExpr } [ CondElseExpr ]
CondBody       = Expression 
CondValue      = "{" Expression "}"
CondElseIfExpr = "else" "if" CondBody CondValue
CondElseExpr   = "else" CondValue

This led me to notice that CondValue is very similar to object literals. I created a (somewhat contrived) example comparing the two proposed syntaxes:

  1. With the if/then/else syntax:

    prometheus.remote_write "default" {
      // Parenthesis added for readability.
      external_labels = (
        if env("IS_PROD") == "1" then 
          {
            cluster = "prod", 
            region  = "us-east-1",   
          } 
        else 
          {
            cluster = "non-prod",
            region  = "us-west-1", 
          } 
        end 
      )
    }
    
  2. With curly braces denoting the expression value:

    prometheus.remote_write "default" {
      // Parenthesis added for readability.
      external_labels = (
        if env("IS_PROD") == "1" { 
          {
            cluster = "prod", 
            region  = "us-east-1",   
          } 
        } else { 
          {
            cluster = "non-prod",
            region  = "us-west-1", 
          } 
        }
      )
    }
    

I'm concerned that a conditional whose value is an object literally will be somewhat common, and that using curly braces for the conditional value is prone to mistyping:

prometheus.remote_write "default" {
  // Parenthesis added for readability.
  external_labels = (
    // This is wrong; the outer {} is missing. 
    if env("IS_PROD") == "1" { 
      cluster = "prod",        
      region  = "us-east-1", 
    } else { 
      cluster = "non-prod",
      region  = "us-west-1",
    }
  )
}

This could be mitigated in two ways:

  • The parser could be aware of this pitfall and create a better error message when an "=" is found inside the body of a conditional. However, the implementation complexity of doing so is slightly beyond trivial.
  • A return statement could be added (as suggested by @mattdurham) to more clearly denote the value. However, this is adding complexity not necessary with the if/then/else approach.

For this reason, I personally think the cons of curly braces now outweigh the pros, and I currently prefer the if/then/else syntax as originally proposed.


I also feel that if conditionals are going to be implemented, which I believe they should be, support for a ternary operator should be added at the same time

@bentonam I can understand the desire for the ternary operator as a familiar concept, but I would prefer to minimize the number of ways one can do conditionals to keep the learning curve as shallow as possible.

For the same reason, I'm not sure about switch/case statements; I'm not aware of a use case that isn't covered by this proposal (regardless of the syntax) and that can only be covered by ternary operators or switch/case statements.

rfratto avatar May 28 '24 19:05 rfratto

I've just found an ambiguity with current proposed grammar:

CondExpr       = "if" Expression "then" Expression 
                 { CondElseIfExpr } [ CondElseExpr ] "end"
CondElseIfExpr = "else" "if" Expression "then" Expression 
CondElseExpr   = "else" Expression

// CondExpr is a type of PrimaryExpr, giving it one of the lowest precedences.
PrimaryExpr = LiteralValue | ArrayExpr | ObjectExpr | CondExpr  

CondElseIfExpr and CondElseExpr are ambiguous, as both derive the token sequence else if. Given our grammar is supposed to only require a single lookahead token, this is an issue.

I can think of a few solutions for this:

  1. Introduce a new token for CondElseIfExpr such as elif, though that would start to defeat the purpose of using English keywords for conditionals:

    CondExpr       = "if" Expression "then" Expression 
                     { CondElseIfExpr } [ CondElseExpr ] "end"
    CondElseIfExpr = "elif" Expression "then" Expression 
    CondElseExpr   = "else" Expression
    
    // CondExpr is a type of PrimaryExpr, giving it one of the lowest precedences.
    PrimaryExpr = LiteralValue | ArrayExpr | ObjectExpr | CondExpr  
    
    if CONDITION1 then VALUE1 
    elif CONDITION2 then VALUE2  
    elif CONDITION3 then VALUE3
    else VALUE4 
    end
    
  2. Remove the "else if" concept from the grammar. This means that there would only be "if" and "else", but the else can chain into another conditional:

    // user code: 
    
    if CONDITION1 then VALUE1 
    else if CONDITION2 then VALUE2  
    else if CONDITION3 then VALUE3
    else VALUE4 
    
    // parses as: 
    
    if CONDITION1 then VALUE1 
    else (
      if CONDITION2 then VALUE2
      else (
        if CONDITION3 then VALUE3 
        else VALUE4 
      )
    ) 
    

    This also requires dropping the end keyword from the grammar, otherwise you would need one end per else-if:

    CondExpr       = "if" Expression "then" Expression [ CondElseExpr ]
    CondElseExpr   = "else" Expression
    
    // CondExpr is a type of PrimaryExpr, giving it one of the lowest precedences.
    PrimaryExpr = LiteralValue | ArrayExpr | ObjectExpr | CondExpr  
    

    I do not think the end keyword is necessary in terms of resolving ambiguity, so it should be safe to remove.

    With this approach, the grammar simplification would not necessarily change the AST, as the parser could refactor the if-else-if chains into a single AST element.

  3. The parser can try to work away around this ambiguity, at the cost of more complex code and not being able to use a single lookahead token.

This ambiguity exists for the suggested curly brace syntax too, and the solutions above can be applied for the curly brace syntax as well.

Currently I'm leaning towards the second solution since it avoids introducing even more tokens and in the case of if/then, removing the "end" keyword is a nice bonus.

rfratto avatar May 28 '24 20:05 rfratto

Chiming in with my 2c:

In regard to the braces discussion; I see the arguments from both sides, but the topic of readability and the fact that a single statement is only ever allowed makes me lean towards the no-braces option. I think what we choose here may impact syntax in https://github.com/grafana/alloy/issues/156 as well so we'd be potentially be introducing even more braces scattered around.

~In regards to the elif/no else if discussion, I find that unfortunate. The root issue here that we only require a single lookahead token but else if would be two separate tokens? From the two options presented here, I think the elif option makes more sense; the deep nesting required for even relative simple operations would be easy to break and harder to read IMO.~

Sorry, I've re-read the user-facing part of your comment and how the user code would look and I think I agree.

tpaschalis avatar May 31 '24 08:05 tpaschalis

I've made the following changes to the proposal:

  1. I resolved the grammatical ambiguity first noticed in this hidden comment by dropping the "else if" production from the grammar and dropping the "end" keyword. There are no differences to how users would write conditionals with the exception of no longer writing the end keyword.

  2. Because there is no explicit end marker for conditionals, the section about style and formatting has become stale and was removed; the style and formatting section only applied to making explicit end markers more readable.

    If consensus moves towards the curly brace form, the style and formatting section would need to be reintroduced, as the curly brace form has an explicit end marker (the closing curly brace }).

For now, I still believe that we should not change to curly brace syntax given my reasoning above.

I also understand the desire for other conditional syntaxes like ternary operator and switch statements, but I believe those are separate proposals and aren't in scope for this discussion.

cc @mattdurham Does my reasoning above make sense to you? Are you still leaning more towards if/then/else?

cc others involved in the discussion: @tpaschalis @bentonam @wildum

rfratto avatar May 31 '24 13:05 rfratto

I see 2 paths forward I would have preference toward. We should take into account what conditional blocks may require and we should be consistent for both syntaxes of conditionals unless there is a very good reason not to. I've provided an example for my 2 preferred options when it comes to curly or not that I think will work for both expressions and blocks. The examples below are for blocks for consideration.

Curly Braces (CB)

if constants.os == "windows" {
  prometheus.exporter.windows {
    ...
  }

  // additional components
} else if constants.os == "linux" {
  promtheus.exporter.unix {
    ...
  }

  // additional components
}

If Else Elif End (IEEE)

Dropping the end so that we can keep else if limits us to a single expression and falls down for block conditionals that want to include multiple blocks which I expect to be a common use case. In my searching, comparable languages that don't do something comparable to curly braces all make use of elif or an equivalent.

if constants.os == "windows" then
  prometheus.exporter.windows {
    ...
  }

  // additional components
elif constants.os == "linux" then
  promtheus.exporter.unix {
    ...
  }

  // additional components
end

Opinion Time

It's a bit hard to be objective from here and I recognize the following statements are not:

  • CB feels more similar to block and module declares
  • IEEE is closer to english but requires someone to know which of the commonly used else if usages are valid such as elif vs elsif vs elseif
  • Either option is legible to me and reasonable for someone creating conditional config to read/write

If you twisted my arm today, I would probably vote for curly braces. I don't find the closer to english argument for IEEE alone super compelling in this case since we are talking about if conditionals that are an entry level programming concept and commonly done with CB in other languages. That said, I'm 99% happy to move forward with IEEE as long as we don't drop the end.

erikbaranowski avatar May 31 '24 18:05 erikbaranowski

After a long helpful chat with @erikbaranowski and @hedss, I realized there's multiple factors impacting how we design conditional values:

  • The concept of conditionally defining blocks ("conditional blocks")
  • The concept of "component scheduling" where you can disable components on non-owning nodes within a cluster
  • Optionally supporting Alloy configuration via YAML syntax

These other goals are relevant to conditional value because we aim for consistency and minimalism with Alloy's syntax, but none of these goals have been formally proposed yet.

Our discussion about curly braces has largely depended on conjecture for how conditional blocks might be designed. I'm concerned designing on conjecture can be particularly bad for syntax, as syntax is hard to change and conjecture may lead to weaker choices based on scenarios that never happen.

As my proposal is now, I think it's under-baked and doesn't give enough consideration to the other goals we have that impact the design of conditionals. For now, I'm going to close this proposal and give time for some of the other goals to be solved first. I'll return later with a refreshed proposal once it becomes more feasible to evaluate the conditionals design, or until I have time to write a more complete proposal that can solve multiple problems at once.

rfratto avatar May 31 '24 18:05 rfratto