rspec-expectations icon indicating copy to clipboard operation
rspec-expectations copied to clipboard

New matcher idea: `match_pattern` for Ruby's pattern-matching

Open r7kamura opened this issue 2 years ago • 4 comments

Summary

Pattern-matching has been available since Ruby 3.0. How about providing this as a RSpec matcher?

Examples

Basic example

# Pass
expect([1, 2, 3]).to match_pattern([Integer, Integer, Integer])

# Fail
expect([1, 2, 3]).to match_pattern([Integer, Integer, String])

More realistic example

node = Nokogiri::HTML5.parse(response.body).at('.foo')
expect(node).to match_pattern(name: 'a', href: 'http://example.com')

This is done using nokogiri's pattern-matching feature:

  • https://github.com/sparklemotion/nokogiri/discussions/2360
  • https://github.com/sparklemotion/nokogiri/pull/2523

Background

As a background, I am usually involved in Rails app development, and since recent Rails makes it easier to do pattern matching using response.parsed_body, so I thought it would be nice if RSpec had the ability to support this.

  • https://github.com/rails/rails/pull/49003

Implementation

I have come up with two implementation ideas and I think either is fine, but the latter seems closer to what RSpec is aiming for, so I decided to propose this one for the time being, even if it is a bit magical.

The former has its advantages here because it is simple to implement and easy to understand. Please let me know what you think.

Using block

First in thinking about this, I referred to assert_pattern in minitest:

  • https://github.com/minitest/minitest/pull/936

Minitest's pattern-matching feature provides an interface that encloses the entire expression in a block, as shown below:

assert_pattern { [1, 2, 3] => [Integer, Integer, Integer] }

A similar implementation could be done in RSpec. In short, we can have a match_pattern that does the following:

expect { [1, 2, 3] => [Integer, Integer, Integer] }.not_to raise_error(NoMatchingPatternError)
expect { [1, 2, 3] => [Integer, Integer, Integer] }.to match_pattern

Using instance_eval

In RSpec, it is common to write expect(actual).to ... , so it would be nice if we could provide pattern-matching in a way that follows this convention.

I thought of a way to achieve this by using inspect and instance_eval to carry around the given expected pattern as a value.

expect([1, 2, 3]).to match_pattern([Integer, Integer, Integer])

In this implementation, the following code will be evaluated in the above code:

[1, 2, 3] in [Integer, Integer, Integer]

r7kamura avatar Nov 11 '23 08:11 r7kamura

@zverok what do you think?

pirj avatar Nov 11 '23 14:11 pirj

So... Just thinking out loud!

My design space analysis is as follows:

  1. I personally would like any new matcher to be fully on the right side of to (i.e., allowing to always write expect(subject).to matcher or expect { subject }.to matcher, without any change to the left side — also allowing implicit subjects); which, for me, excludes the first option.
  2. I personally believe I would be confused if the new feature would be pretending to accept a Ruby's pattern-matching, but actually wouldn't, i.e., accepted some subset of Ruby that "looks like" some patterns but wouldn't be handled with the same syntactical rules actually. I believe it is a great source of possible confusion. There are many valid options that people might want, and which are valid patterns yet invalid in other contexts (and will be therefore invalid in this "imitation" syntax), like find patterns [*, {id: 3}, *], variable pinning: {id:, ... {linked_id: ^id}}, or even just (if "we have a full pattern matching here! it is more expressive!") trying to use {x:, y:} instead of hash_including(:x, :y). Which, basically, makes the second option undesirable.
  3. An aside note: Ideally, and for realistic large data structures (like JSON responses), it would be cool to provide a detailed report of "which part exactly didn't match"

So, what options do I see viable for this matcher?

1. A block on the right side that re-accepts the subject and just allows to write the check:

expect([1, 2, 3]).to match_pattern { _1 in [Integer, String, Integer] }
# expected to match pattern [Integer, String, Integer], but didn't match

This is currently possible in almost the same form this way:

expect([1, 2, 3]).to satisfy { _1 in [Integer, String, Integer] }
# expected [1, 2, 3] to satisfy expression `_1 in [Integer, String, Integer]`

(I believe the expression is extracted by just reading the file source, and the same can be done by match_pattern, just cleaning up _1 in part)

2. The same block, yet also (or only) expecting you to write => which then analyzes the error:

expect([1, 2, 3]).to match_pattern { _1 => [Integer, String, Integer] }
# expected to match pattern [Integer, String, Integer], but didn't match:
#  String === 2 does not return true

Here, we are doing the same as above yet also catching the NoMatchingPattern error and extract the relevant part of text from it.

(That's how it will fail in this case... I don't think it is ideal, and probably there is some room for improvement in Ruby core with good proposals for better structuring "what haven't matched" info, and probably even providing it in a structured form in the exception object!)

Obviously, both forms would be less appealing for those who HATE numbered block params (and those people are plenty!), so they'll need to write

expect([1, 2, 3]).to match_pattern { |val| val => [Integer, String, Integer] }

With a bit of trickery (providing proper evaluation context), we can provide some nice name available inside a block instead (but this "trickery" might lead to undesirable effects either, and deciding on the name is not easy):

expect([1, 2, 3]).to match_pattern { value => [Integer, String, Integer] }

3. The third idea is much more unholy and also currently impossible, but looks more appealing as a matcher syntax:

expect([1, 2, 3]).to match_pattern => [Integer, String, Integer]

In my imagination, it would work by having match_pattern just provide an object to match (the same as subject or some wrapper), and then adding some "custom handler" to RSpec core, which will catch NoMatchingPattern error, and treat it as "that matcher that added the handler, have failed". This is kinda tricky, but wouldn't work in the current Ruby nevertheless: => has a "void context" (same as, say, return), and can't be put in the expression in this place.

4. The fourth one is a compromise on the above:

expect([1, 2, 3]).to { match_pattern => [Integer, String, Integer] }

This is valid syntax, and while also requires adjusting RSpec core, the adjustment seems more understandable than "adding a global handler" (in this case, the exception needs to be handled in the limited scope of the "block passed to to", not in the scope of the "entire example"). Maybe fiddling a bit around "how it can be implemented" might lead to some insights about adjustments in RSpec core (expect/to) that would be useful... Or maybe not!

Anyway, atm, honestly, no option looks super-persuasive for myself either. Would I need a lot of PM-in-spec (unfortunately, on my dayjob I still stuck with Ruby 2.7 where pattern matching was experimental and we prefer to avoid that unfinished-yet syntax), I would start with just satisfy { _1 in pattern } and after several weeks/months would try to rethink it and see whether it requires some more tinkering.

zverok avatar Nov 11 '23 15:11 zverok

Very thoughtful input, thank you!

Indeed, the instance_eval idea is not a good one where a lot of pattern-matching syntax doesn't actually work properly.

The expect([1, 2, 3]).to match_pattern { |val| val => [Integer, String, Integer] } idea seems the most reasonable of the current proposals for me, although I think the drawback is that it looks a bit clunky. Even from my personal observation, the use of numbered parameters does not seem to be widespread yet, nor does it seem to be that popular, so many people will probably use the normal block argument format.

Regarding the "what haven't matched" info, I was thinking that since pattern matching has variable capturing feature, it would be nice to be able to partially verify it using aggregated failures as follows:

expect(hash).to match_pattern { _1 => { user: user } }
expect(user).to match_pattern { _1 => { name: 'foo' } }

However, blocks create local variable scopes, so we have to do something tricky to make this work well.

Thinking about it this way, I wonder if it is no longer necessary to use any matcher and just do the assignment as usual...🤔

it 'returns something' do
  subject => { user: user }
  user => { name: 'foo' }
end

r7kamura avatar Nov 11 '23 16:11 r7kamura

By the way, I heard that it might be available in Ruby 3.4 instead of _1.

  • https://bugs.ruby-lang.org/issues/18980

If that happens, we may be able to write the following:

expect([1, 2, 3]).to match_pattern { it => [Integer, String, Integer] }

But interestingly (unfortunately), the naming conflicts with RSpec's #it, which is very, very confusing...

r7kamura avatar Dec 08 '23 08:12 r7kamura

Migrated to the monorepo rspec/rspec#149

JonRowe avatar Nov 30 '24 12:11 JonRowe