Feature proposal: Functions
Parameterised blocks
Blocks enclose a set of mappings:
{
/abc/ -> $t1;
/xyz/ -> $t2;
}
Defining a parameterised block of code:
f(x, y) = {
/abc/ -> $t1;
x /xyz/ | y -> $t2;
}
Then you could invoke f(),
which would behave just as if you had typed its body out by hand:
f(/xxx/, /yyy/);
or use it as a zone body:
'b' .. 'e' f(/xxx/, /yyy/);
which is exactly equivalent to writing:
'b' .. 'e' {
f(/xxx/, /yyy/);
}
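The "typed out by hand" behaviour above can be read as substitution of arguments for parameters. The following Python sketch models that reading only; it is not how lx is implemented. The Param class, the expand() helper and the list-of-parts representation of a pattern are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Param:
    """Placeholder for a formal parameter inside a pattern (hypothetical)."""
    name: str


def expand(params, body, args):
    """Return the mappings of body with each Param replaced by its argument."""
    if len(params) != len(args):
        raise TypeError(f"expected {len(params)} arguments, got {len(args)}")
    env = dict(zip(params, args))
    out = []
    for pattern, token in body:
        parts = [env[p.name] if isinstance(p, Param) else p for p in pattern]
        out.append((" ".join(parts), token))
    return out


# f(x, y) = { /abc/ -> $t1; x /xyz/ | y -> $t2; }
F_PARAMS = ["x", "y"]
F_BODY = [
    (["/abc/"], "$t1"),
    ([Param("x"), "/xyz/", "|", Param("y")], "$t2"),
]

# f(/xxx/, /yyy/);
expanded = expand(F_PARAMS, F_BODY, ["/xxx/", "/yyy/"])
for pattern, token in expanded:
    print(f"{pattern} -> {token};")
```

Substituting text like this is only safe for simple arguments. An argument such as 'a' | 'b' would change meaning once spliced next to a concatenation, so a real implementation would presumably substitute parsed expressions rather than source text.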
f() would be usable in expressions, giving its unioned DFA:
'abc' - f('xxx', 'yyy') -> $t;
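As a rough model of "its unioned DFA", the Python sketch below represents each language as a finite set of strings. It assumes that each pattern matches a whole token, that concatenation binds tighter than |, and that - is subtraction. The concat() helper and the set representation are illustrative assumptions, not part of the proposal.

```python
def concat(a, b):
    """Concatenate two finite languages given as sets of strings."""
    return {s + t for s in a for t in b}


def f(x, y):
    # f(x, y) = { /abc/ -> $t1; x /xyz/ | y -> $t2; }
    t1 = {"abc"}
    t2 = concat(x, {"xyz"}) | y  # assumes concatenation binds tighter than '|'
    # In an expression the token names are dropped and the mappings are unioned.
    return t1 | t2


block_lang = f({"xxx"}, {"yyy"})
result = {"abc"} - block_lang  # 'abc' - f('xxx', 'yyy')

print(sorted(block_lang))  # ['abc', 'xxxxyz', 'yyy']
print(result)              # set()
```

Under these assumptions the example expression is empty, because the first mapping in f() already matches 'abc'. The example still shows the syntax, but a mapping from it could never produce $t.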
f() cannot be used directly as the body of a one-way zone,
because 'x' f('xxx', 'yyy'); would be syntactically ambiguous with the use of f() in an expression (here, concatenation with 'x').
However, you could write:
'x' {
f('xxx', 'yyy');
}
to the same effect.
Parameterised expressions
g(x, y) = /abc/ x | ~y;
invoked in an expression just the same:
'abc' - g('x', 'y') -> $t;
but (as with writing /abc/; with no token mapping),
this would cause input to be consumed and no token emitted:
g('x', 'y');
Removing parameterisation
Parameters may be omitted, and the syntax degrades to:
f = {
/abc/ -> $t1;
/xyz/ -> $t2;
}
g = /abc/ | 'x'; # note this is the existing variable binding syntax
which may be used in the same ways:
f;
'b' .. 'e' f;
'abc' - f -> $t;
'abc' - g -> $t;
g;
Hence f and g are distinguished by type.
Neither f nor g may be used on the right-hand side of a mapping.
Token names as parameters
So far the examples have shown FSMs as parameters.
Currently, using a token's value gives its FSM:
'if' | 'else' | 'for' -> $kw;
/[a-z]+/ - $kw -> $ident;
Parameter lists also permit token names as values. When the parameter is used inside an expression, this is indistinguishable from passing an FSM that happens to come from an existing token, as above:
f(x, y, t1) = {
/abc/ - x - y - t1 -> $t;
}
g(t) = /abc/ - t;
and passing a token by name for mappings:
f(x, y, t2) = {
/abc/ - x - y -> t2;
}
where f('xxx', 'yyy', $t); is known to pass the name of $t,
rather than its FSM value, because t2 appears on the right-hand side of a mapping in f().
Hence parameters are distinguished by type.
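One way to picture this distinction is a pass over the function body that classifies each parameter by where it is used. The Python sketch below is a minimal illustration under that assumption. The infer_param_kinds() helper and the (pattern names, target) representation of a mapping are hypothetical. The proposal does not say what happens when a parameter appears both in a pattern and as a mapping target; this sketch simply treats it as a token name.

```python
def infer_param_kinds(params, body):
    """Classify each parameter as "fsm" or "token name".

    body is a list of (names used in the pattern, mapping target) pairs.
    A parameter that appears as a mapping target is taken to be a token
    name; every other parameter is taken to be an FSM.
    """
    kinds = {p: "fsm" for p in params}
    for _pattern_names, target in body:
        if target in kinds:
            kinds[target] = "token name"
    return kinds


# f(x, y, t1) = { /abc/ - x - y - t1 -> $t; }
first = infer_param_kinds(["x", "y", "t1"], [(["x", "y", "t1"], "$t")])

# f(x, y, t2) = { /abc/ - x - y -> t2; }
second = infer_param_kinds(["x", "y", "t2"], [(["x", "y"], "t2")])

print(first)
print(second)
```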