[bug or feature?] default action `$$ = $1` is also executed when the rule has an explicit action block; this is not cf. bison documentation
From the bison docs @ http://dinosaur.compilertools.net/bison/bison_6.html#IDX85 (emphasis mine):
If you don't specify an action for a rule, Bison supplies a default:
$$ = $1. Thus, the value of the first symbol in the rule becomes the value of the whole rule. Of course, the default rule is valid only if the two data types match. There is no meaningful default action for an empty rule; every empty rule must have an explicit action unless the rule's value does not matter.
Technically, jison doesn't do this: it always executes the default action $$ = $1 inside the parser kernel and then lets the user-defined action code overwrite the $$ by executing the "default action" before executing the rule's action code inside the generated performAction() function.
When the userland action code doesn't modify the $$ value, it will remain set to equal the $1 value, rather than making it undefined instead.
Example grammar snippet:
slist : slist stmt ';' { console.log("** not touching $$!\n"); }
| stmt ';' {
console.log("** neither am I!\n");
}
;
Suggested solution v1.0
Modify jison to inspect the grammar rules' action code blocks (which jison handles anyway while it places them inside performAction() and:
- replace the current kernel 'default action' code with
$$ = undefined. WARNING! SEE SECTION FURTHER BELOW! THIS IS NOT A GOOD IDEA! - add the default action code
$$ = $1to any non-empty rule which does not have any action at all - add the default action code
$$ = undefinedto any empty rule, so that the behaviour is always predictable and$$is set to a known value after eachreduceaction - prefix any user-defined action code block which does not set
$$with the action code$$ = undefined. WARNING! SEE SECTION FURTHER BELOW! THIS IS NOT A GOOD IDEA!
Why the initial suggestion is not a smart idea to do
Quoting from the same bison section a bit more:
$nwithnzero or negative is allowed for reference to tokens and groupings on the stack before those that match the current rule. This is a very risky practice, and to use it reliably you must be certain of the context in which the rule is applied. Here is a case in which you can use this reliably:foo: expr bar '+' expr { ... } | expr bar '-' expr { ... } ; bar: /* empty */ { previous_expr = $0; } ;As long as
baris used only in the fashion shown here,$0always refers to theexprwhich precedesbarin the definition offoo.
As I want to support this 'obscure' feature too, the previous changeset would obliterate $0 because $$ === $0 due to the nature of the parse stack.
Still we need to change jison such that the default action $$ = $1 only happens when there is no user-defined action for the given rule at all; the suggested default action for all other situations, in order to produce a deterministic $$ behaviour on reduce, is to set $$ = undefined.
Besides, the changeset above would require intricate user code (action code chunk) analysis, which is no sine cure either, so we should come up with a changeset which does not need to analyze userland action code in depth.
Note
GerHobbelt/jison does perform some rough code analysis, but any mistakes made there are harmless as those impact kernel performance optimizations only: an incorrect code analysis only makes jison keep the slower (and safer) code.
Suggested solution v2.0
Which results in this set of behaviours on reduce:
-
analyze which rules have user-defined action code chunks (we already know that as this knowledge is required for the generation of
performAction()by jison) -
inside
performActioncache$0locally. (WARNING: we now will assume a environment where referencing$0is not necessarily equivalent to$$! See next steps.) -
set
$$ = undefinedfor all rules. User-defined action, iff it does any$$assignment at all, will overwrite this value and we'll be fine, producing an always-deterministic$$result this way, including in those occasions where the userland action code is subtly flawed, e.g.:
slist : A B
{
if ($A) { $$ = $B }
// whoops! no assignment of `$$` when not($A):
// coder must have forgotten to add the 'else' clause!
}
;
- use
$$ = $1as fill-in action code for those rules which don't have any action code and are non-empty, i.e. have a$1to speak of.
Because we've set $$ = undefined for all rules at the start, epsilon rules are already covered.
Notes
-
referencing
$0is not equivalent to referencing$$following this scenario:$$would produceundefineduntil the action code assigns another value to it, while$0will carry the old value for any use as suggested by the bison documentation quoted above. -
A simple analyzer can check if any user action code references
$0and if none do, we can forego the caching of$0locally, thus producing faster code. -
Smart Alecs who assign a value to
$0instead of$$will lose that value as it will be discarded by the time the given action code chunk has finished executing, since$0is now a local variable (with a correspondingly shorter lifetime).