parser icon indicating copy to clipboard operation
parser copied to clipboard

Heredoc delimiter considered part of the string

Open JoshCheek opened this issue 8 years ago • 11 comments

A heredoc whose delimiter has a sequence of characters that look like interpolation, seems to get confused and embed part of the delimiter into the string body.


Given this code

# cat f.rb
p <<"A#{b}C"
str
A#{b}C

When I run it in Ruby:

# ruby f.rb
"str\n"

So I would expect this ast:

# ruby-parse f.rb
s(:send, nil, :p,
  s(:str, "str\n"))

But, I get this ast:

# ruby-parse f.rb
(send nil :p
  (dstr
    (str "str\n")
    (str "A")
    (begin
      (send nil :b))))

JoshCheek avatar Dec 01 '16 20:12 JoshCheek

sigh

whitequark avatar Dec 02 '16 02:12 whitequark

By the way, I need a co-maintainer for Parser--not really for any semantic changes, for now at least, but just for daily stuff like bumping the syntax warning (but checking that all the commits from parse.y are reflected in parser first). Would you volunteer?

whitequark avatar Dec 02 '16 02:12 whitequark

Back on topic: this is actually really tricky to fix without nontrivial performance degradation...

whitequark avatar Dec 02 '16 02:12 whitequark

@JoshCheek This is going to be an extremely nasty rewrite of the heredoc lexing and I've no time for it right now. This bug is only ever hit when using #{ inside an interpolated heredoc delimiter, so I guess it's not going to hurt much meanwhile.

whitequark avatar Dec 02 '16 02:12 whitequark

The reason why this is extremely nasty is this:

p <<"A#{b}C"
#{
  <<"A#{b}C"
A#{b}C
}
str
A#{b}C

which is valid Ruby and returns "\nstr\n". Fuck this language, for real.

whitequark avatar Dec 02 '16 02:12 whitequark

@alexdowad Maybe you have an idea of how to make it work decently. The only option I see is comparing the entire line with the heredoc delimiter before proceeding to lex it like usual.

whitequark avatar Dec 02 '16 02:12 whitequark

I'll try to have a look at it today.

alexdowad avatar Dec 02 '16 03:12 alexdowad

By the way, I need a co-maintainer for Parser--not really for any semantic changes, for now at least, but just for daily stuff like bumping the syntax warning (but checking that all the commits from parse.y are reflected in parser first). Would you volunteer?

If you need temporal consistency, I'm probably not the right person. Otherwise I'd be happy to contribute. You may need to spell some things out the first time, eg I'm not confident about what "the syntax warning" means (maybe this?)

JoshCheek avatar Dec 02 '16 16:12 JoshCheek

If you need temporal consistency, I'm probably not the right person. Otherwise I'd be happy to contribute.

Ack.

You may need to spell some things out the first time, eg I'm not confident about what "the syntax warning" means (maybe this?)

Indeed. It's something we have to do because of how MRI is maintained...

whitequark avatar Dec 03 '16 08:12 whitequark

As a user of this library I'd be fine to not see this edge case supported. And instead get an dedicated exception telling the user of the library: "Ruby is exceptionally ugly here, we do not want to support this for sanity".

This would avoid pointlessly (IMO) spending limited developer time on this issue.

Parser already does subset ruby semantics in some areas, as long it does not superset them (which is currently the case for this instance) I'm fine.

mbj avatar Feb 23 '18 11:02 mbj

Ruby is exceptionally ugly here, we do not want to support this for sanity

Nah, this is just lazy and poor engineering.

whitequark avatar Feb 23 '18 16:02 whitequark