Add missing StringScanner methods

Open postmodern opened this issue 4 years ago • 9 comments

Crystal's StringScanner does not support every method that Ruby's StringScanner does. The following are missing:

  • #beginning_of_line?
  • #get_byte
  • #getch
  • #match
  • #match?
  • #matched?
  • #matched_size
  • #exist?
  • #pos
  • #rest?
  • #rest_size
  • #pre_match
  • #post_match
  • #<<
  • #concat
  • #unscan

postmodern • Sep 29 '21 22:09

I particularly need #getch to step through the string if #scan does not match the current position.

postmodern • Sep 29 '21 22:09
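
For reference, the stepping pattern described in the comment above can be sketched in Ruby, the language being ported from. This is a minimal sketch, not the code from the issue: `bracket_digits` and its bracket markers are hypothetical stand-ins for the real rules. It tries #scan at the current position and falls back to #getch to advance by one character when nothing matches.

```ruby
require "strscan"

# Hypothetical helper, not from the issue: brackets each run of digits
# and copies every other character through unchanged.
def bracket_digits(input)
  scanner = StringScanner.new(input)
  output = +""

  until scanner.eos?
    if (digits = scanner.scan(/\d+/))
      # scan matched at the current position and advanced past the match.
      output << "[" << digits << "]"
    else
      # No match here: getch consumes exactly one character so the loop moves on.
      output << scanner.getch
    end
  end

  output
end

puts bracket_digits("ab12c3")
```

Without #getch (or an equivalent), the loop would have no way to step past a character that no rule matches.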

I don't know what you are doing, but << would be impossible to implement because strings are immutable.

asterite • Sep 29 '21 23:09

Also, could you describe what you are doing?

asterite • Sep 29 '21 23:09

I am porting this code to Crystal. It walks a string, attempting to match a series of regexes at each index, and if one of them matches, it wraps the matched string in ANSI styling: https://github.com/postmodern/hexdump.rb/blob/ead8dc6b45b281423ad7d4ea86d88f533ea0fbb4/lib/hexdump/theme/rule.rb#L129-L146

I even tried rewriting the code to only use String#match(regex,pos), but match(regex,pos) behaves differently from StringScanner#scan: #scan will only match the regex at the current position, whereas String#match will incrementally walk the string until it finds a match.

Yeah, mutating methods like #<<, #concat, or even #string= are questionable, but I need #getch.

postmodern • Sep 29 '21 23:09
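
The difference described above can be seen in a few lines of Ruby. This is only an illustrative sketch: `compare_match_and_scan` is a hypothetical helper, and it shows Ruby's behavior rather than Crystal's.

```ruby
require "strscan"

# Hypothetical helper, not from the issue: runs the same regex through
# String#match and StringScanner#scan and reports what each one found.
def compare_match_and_scan(text, regex)
  # String#match(regex, pos) starts at pos but may move forward to find a match.
  searched = text.match(regex, 0)

  # StringScanner#scan succeeds only if the match begins at the scan pointer.
  scanner = StringScanner.new(text)
  scanned = scanner.scan(regex)

  {
    match_text: searched && searched[0],
    match_offset: searched && searched.begin(0),
    scan_text: scanned,
    scan_pos: scanner.pos
  }
end

p compare_match_and_scan("ab12", /\d+/)
```

For "ab12", String#match finds "12" at offset 2 even though the search started at 0, while #scan returns nil and leaves the scan pointer at 0.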

Have you tried Regex#match with Regex::Options::ANCHORED?

getch should be equivalent to scan(/./), though a dedicated Crystal implementation would certainly be faster than that.

HertzDevil • Sep 30 '21 00:09
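
On the Ruby side, a comparable anchored match can be sketched with the `\G` anchor, which pins the match to the position passed to String#match. The helper `match_at` below is hypothetical and this is Ruby, not Crystal's Regex::Options::ANCHORED. The second half shows one caveat when substituting scan(/./) for getch in Ruby: `.` does not match a newline unless the /m flag is set. Whether the same caveat applies in Crystal is not verified here.

```ruby
require "strscan"

# Hypothetical helper, not from the issue: returns a MatchData only when
# regex matches starting exactly at character index pos, otherwise nil.
def match_at(text, regex, pos)
  anchored = Regexp.new("\\G(?:#{regex.source})", regex.options)
  text.match(anchored, pos)
end

p match_at("ab12", /\d+/, 0) # no digits at index 0
p match_at("ab12", /\d+/, 2) # digits start at index 2

# getch consumes any single character, but . skips newlines unless /m is set.
p StringScanner.new("\n").getch
p StringScanner.new("\n").scan(/./)
p StringScanner.new("\n").scan(/./m)
```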

@HertzDevil ah, clever, and now the specs pass. Thank you!

I still think StringScanner should support more of Ruby's StringScanner API to make porting to Crystal easier. getch, get_byte, and the match* methods seem like good candidates.

postmodern • Sep 30 '21 00:09

I agree, though I'll probably name them read_byte, read_char, etc., to match IO.

asterite • Sep 30 '21 11:09

@asterite Hi! I also need those methods when porting Ruby code to Crystal, so I tried implementing the #read_byte and #read_char methods. Please take a look at #11785.

Kanezoh • Jan 30 '22 13:01

With #16455, #getch could be replaced with .scan(1). #read_byte is omitted because it could lead to invalid character indices, but #current_byte is available to return the byte at the head.

jneen • Dec 01 '25 00:12