parser
parser copied to clipboard
Heredoc delimiter considered part of the string
A heredoc whose delimiter has a sequence of characters that look like interpolation, seems to get confused and embed part of the delimiter into the string body.
Given this code
# cat f.rb
p <<"A#{b}C"
str
A#{b}C
When I run it in Ruby:
# ruby f.rb
"str\n"
So I would expect this ast:
# ruby-parse f.rb
s(:send, nil, :p,
s(:str, "str\n"))
But, I get this ast:
# ruby-parse f.rb
(send nil :p
(dstr
(str "str\n")
(str "A")
(begin
(send nil :b))))
sigh
By the way, I need a co-maintainer for Parser--not really for any semantic changes, for now at least, but just for daily stuff like bumping the syntax warning (but checking that all the commits from parse.y are reflected in parser first). Would you volunteer?
Back on topic: this is actually really tricky to fix without nontrivial performance degradation...
@JoshCheek This is going to be an extremely nasty rewrite of the heredoc lexing and I've no time for it right now. This bug is only ever hit when using #{ inside an interpolated heredoc delimiter, so I guess it's not going to hurt much meanwhile.
The reason why this is extremely nasty is this:
p <<"A#{b}C"
#{
<<"A#{b}C"
A#{b}C
}
str
A#{b}C
which is valid Ruby and returns "\nstr\n". Fuck this language, for real.
@alexdowad Maybe you have an idea of how to make it work decently. The only option I see is comparing the entire line with the heredoc delimiter before proceeding to lex it like usual.
I'll try to have a look at it today.
By the way, I need a co-maintainer for Parser--not really for any semantic changes, for now at least, but just for daily stuff like bumping the syntax warning (but checking that all the commits from parse.y are reflected in parser first). Would you volunteer?
If you need temporal consistency, I'm probably not the right person. Otherwise I'd be happy to contribute. You may need to spell some things out the first time, eg I'm not confident about what "the syntax warning" means (maybe this?)
If you need temporal consistency, I'm probably not the right person. Otherwise I'd be happy to contribute.
Ack.
You may need to spell some things out the first time, eg I'm not confident about what "the syntax warning" means (maybe this?)
Indeed. It's something we have to do because of how MRI is maintained...
As a user of this library I'd be fine to not see this edge case supported. And instead get an dedicated exception telling the user of the library: "Ruby is exceptionally ugly here, we do not want to support this for sanity".
This would avoid pointlessly (IMO) spending limited developer time on this issue.
Parser already does subset ruby semantics in some areas, as long it does not superset them (which is currently the case for this instance) I'm fine.
Ruby is exceptionally ugly here, we do not want to support this for sanity
Nah, this is just lazy and poor engineering.