parslet
repeat.as outputs [] for empty string input
Normally, when you use as on repeat, the matched string will be in the result hash:
str('a').repeat.as(:b).parse('aaaaa')
# => {:b=>"aaaaa"@0}
However, when the input string is empty, which repeat accepts by default, the result will have an empty array instead:
str('a').repeat.as(:b).parse('')
# => {:b=>[]}
This inconsistency makes it hard to write transform rules, because "simple" only matches if the input string is non-empty, and "sequence" only matches if the input is empty:
transform = Transform.new {rule(:b => simple(:b)) {b}}
transform.apply str('a').repeat.as(:b).parse('aaaaa')
# => "aaaaa"@0
transform.apply str('a').repeat.as(:b).parse('')
# => {:b=>[]}
transform = Transform.new {rule(:b => sequence(:b)) {b.join}}
transform.apply str('a').repeat.as(:b).parse('aaaaa')
# => {:b=>"aaaaa"@0}
transform.apply str('a').repeat.as(:b).parse('')
# => ""
Of course, if the subtree is as simple as {:b => '...'} or {:b => []}, another transform rule can normalize them. But if there are multiple keys in the subtree, it would be tedious to write that rule. Is there a reason why the parser shouldn't just output an empty string for repeat.as when the input is empty?
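As a stopgap, the normalization could also be done on the parse tree before it reaches the transform. The following is only a minimal sketch in plain Ruby: normalize_empty is a hypothetical helper, not part of parslet, and the plain hashes below merely stand in for real parser output.

```ruby
# Hypothetical helper (not part of parslet): walks a parse tree and replaces
# an empty array stored under any of the given keys with an empty string.
# All other values are left untouched, apart from being walked recursively.
def normalize_empty(tree, keys)
  case tree
  when Hash
    tree.each_with_object({}) do |(key, value), out|
      out[key] =
        if keys.include?(key) && value == []
          ''                          # empty repeat result becomes ""
        else
          normalize_empty(value, keys)
        end
    end
  when Array
    tree.map { |item| normalize_empty(item, keys) }
  else
    tree                              # strings, slices, nil, ...
  end
end

# Plain hashes standing in for parser output:
flat   = normalize_empty({ b: [], c: 'x' }, [:b])
nested = normalize_empty({ b: 'aaaaa', c: { b: [] } }, [:b])
```

With this in place, a single simple(:b) rule would see a string-like value in both cases. The cost is that the keys have to be listed by hand, which is the same duplication this issue is asking to avoid.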
Yes there is. It becomes apparent when you do something like this:
str('a').as(:a).repeat.as(:b).parse('aaaaa')
However, I would consider a second (third/last) argument to repeat to specify whether an empty match should result in a nil or in a []; parslet can't really know without explicit indication. How would you like that?
# Fantasy code ahead:
str('a').repeat(no_match: nil).as(:b).parse('aaaaa')
Just want to say that I also have some rules that I would like to clean up to remove the duplication. +1, as it were, and the suggestion sounds good.
str('a') is a Parslet::Atoms::Str, while str('a').as(:a) is a Parslet::Atoms::Named. Could repeat automatically determine its as output for empty input based on this difference?
If you add multiple layers of Entity, Sequence, ... on top, you won't be able to tell.
I've thought about this and see the opportunity for improvement now. I'll execute your last idea as soon as I get to it.
Is there any plan to implement this? This issue is old but looks like there has been some recent activity on the repo. I agree with everything @hagabaka said above.