irb icon indicating copy to clipboard operation
irb copied to clipboard

Improve command input recognition

Open st0012 opened this issue 1 year ago • 10 comments

Currently, we simply split the input on whitespace and use the first word as the command name to check if it's a command. But this means that assigning a local variable which's name is the same as a command will also be recognized as a command.

For example, in the following case, info is recognized as a command:

irb(main):001> info = 123
`debug` command is only available when IRB is started with binding.irb
=> nil

This commit improves the command input recognition by using more sophisticated regular expressions.

Closes #803

st0012 avatar Dec 07 '23 15:12 st0012

show_source =~ -s is a valid option for show_source command, and also a valid ruby code. info = -1 will be still interpreted as command because it matches COMMAND_WITH_ARGS_AND_FLAGS_REGEXP.

I think info = 1 interpreted as command is consistent, not a bug but one of the possible design. I know it's surprising, so we need some workaround, but should be carefully done.

One idea is to exclude only these assignments

info = a # but accept =~ == ===
info += a # and other operators
info &&= a
info ||= a

Maybe we can check local variable existence just like ExtendCommandBundle.install_extend_commands checks for method existence

show_source == -s # command
show_source =~ -s # command
show_source = /regexp/ # ruby
show_source =~ -s # ruby (because show_source is local variable)
show_source == -s # ruby
info + 1 # command
info ||= 1 # ruby
info + 1 # ruby (because info is local variable)
info 1 # (ruby? 

tompng avatar Dec 07 '23 16:12 tompng

but should be carefully done.

Yeah I agree. But IMO the solution doesn't need to perfectly cover all cases. It only has to be slightly better than the current behaviour.

Also, sooner or later we need to do some kind of regexp match on input in order to make command options easier to support, instead of writing them in transform_args separately. With that in mind, if we can have a pure regexp solution on the problem, it'll greatly reduce the complexity.

To be clear, I'm not saying this PR is THE solution. I'd like to see how many cases it can support as well, so more cases are welcome 🙏

Maybe we can check local variable existence

I worry that this may make the behaviour harder to understand as the same input, like info, could behave differently depending on the context, which may not always obvious to users (like info's assignment happened 20 lines above the breakpoint).

st0012 avatar Dec 07 '23 18:12 st0012

Capturing flags will regexp will enable refactoring like this: https://github.com/ruby/irb/pull/809

st0012 avatar Dec 09 '23 15:12 st0012

I like #809 idea :+1:

Although I have some concern about regexp including flag and args. It restricts not only command's implementation but also user's input. Input like show_source class method set_some_number 12 34 will be rejected. show_source Klass method --s does not show Couldn't locate a definition for ... but raises undefined local variable or method `s'. The message wrongly indicates that command does not exist. I think command should be recognized in another way, and using optparser/regexp or not can be controlled in each command implementation.

tompng avatar Dec 09 '23 17:12 tompng

I agree local_variable defined or not will make implementation complicated and harder to understand the behavior. I think we can use a simple regexp to handle it.

I made a list of command-like ruby expressions. (maybe there are some missing pattern, but it should be a rare case)

Possible command-like style local variable assignment:

info = expr
info += expr # `*=` `&&=` `||=` and other
info ,other = expr

Possible command-like style local variable usage:

info
info # comment
info => pattern
info in pattern
info + expr # (+ - * ** / % ^ & | < <= <=> >= > << >> == =~ != !~ === && || and or)
info . method
info &. method
info ... range_end
info :: method
info :: Const
info [key]
info [key] = value
info [key] += value
info if expr
info ? expr : expr
info ; expr
info \ # (continue to next line)

Most of them are weird code, not realistic. I think rejecting only these patterns as command is enough.

info = expr
info += expr # *= &&= ||= etc.
info + expr # - * ** / % etc.

show_source =~ -s will be rejected with this rule but supporting show_source self.=~ might be enough.

tompng avatar Dec 09 '23 17:12 tompng

I think command should be recognized in another way

I personally expect us to inevitably come back to regexp for this, unless we're writing a custom parser for IRB commands, which is an overkill IMO.

and using optparser/regexp or not can be controlled in each command implementation.

I think to achieve good abstraction, command flags and arg(s) should be supplied to the command separately. And for consistency & ease of maintenance, I'd like to make command implementation only use optparser for flags in the beginning.

I made a list of command-like ruby expressions.

Thanks for listing them. I've updated the tests are it looks like these are newly supported by this PR:

info = expr
info += expr # `*=` `&&=` `||=` and other
info ,other = expr
info = expr
info += expr # *= &&= ||= etc.
info + expr # - * ** / % etc.
info if expr
info ? expr : expr
info ; expr
info # comment

st0012 avatar Dec 10 '23 12:12 st0012

How about something like this? It can be implemented with regexp.

if input.match?(/\A#{command_name_regexp}( |\z)/)
  # Maybe command
  if input.match?(/\A#{command_name_regexp} #{not_a_command_operators_regexp} /)
    # Not a command. `info = `, `info + `, `info *= ` will match. ruby code
  elsif input.match?(COMMAND_REGEXP)
    # Valid command
  else
    # Command with invalid option. not a ruby code.
    # I think warning message for this is important.
  end
end

tompng avatar Dec 10 '23 14:12 tompng

If you don't mind, I want to take a pause here and first make sure we're trying to find a solution to the same problem.

The currently implementation goes through these examination:

  1. Check the input matches either of these forms:
    • Simple command
    • Command with args
    • Command with flags
    • Command with args and flags
    • If doesn't match any of them, it's treated as Ruby code
    • If matches one of them, we check if the captured command name (or alias) actually is a registered command
    • If the command doesn't exist, then it's treated as Ruby code
    • If the command exists, then it's treated as command

If I understand your concern correctly, it's about we only treat input as either command or Ruby code, which will miss the opportunity to warn users about incorrect forms of commands? Or it's about one of the steps I listed above?

st0012 avatar Dec 10 '23 20:12 st0012

it's about we only treat input as either command or Ruby code, which will miss the opportunity to warn users about incorrect forms of commands?

Yes, that's what I'm concerned about.

For example, when Foo#bar method exists,

irb(main):001> show_source Foo bar
Error: Couldn't locate a definition for Foo bar

This case, user can know that command exists and the usage is wrong.

irb(main):001> show_source Foo bar
(irb):1:in `<main>': undefined local variable or method `bar' for main:Object (NameError)

But in this case, user does not know what is wrong. maybe usage, maybe command name, or something else. I think this is a kind of degradation. We already (maybe unintentionally) have a sort of warning for wrong command form, but this pull request drops it.

tompng avatar Dec 11 '23 14:12 tompng

I've made a change so that when show_source Foo bar is evaluated and caused an error, a warning would be printed:

# errors
        ... 2 levels...
The input `show_source Foo bar` was recognised as a Ruby expression, but it matched the name of the `show_source` command.
If you intended to run it as a command, please check if the syntax is correct.
irb(main):002> 

The same warning is not printed when there's no exception raised to avoid noise.

st0012 avatar Dec 12 '23 17:12 st0012