Propagating `@compare_by_identity` and `@block` in `Hash`'s methods
In Ruby, instance methods of Hash that create a new Hash may or may not copy the #default, #default_proc, and #compare_by_identity? properties to that new Hash. Likewise, the same methods in Crystal may copy @compare_by_identity and @block. The two types are not the same:
| Method | Ruby | Crystal |
|---|---|---|
#dup |
✅ Yes | ✅ Yes |
#clone |
✅ Yes | ✅ Yes |
#select |
✅ Yes | ❌ No |
#slice |
✅ Yes | ❌ No (blockless #select) |
#reject |
✅ Yes | ❌ No |
#except |
✅ Yes | ✅ Yes (blockless #reject) |
#compact |
✅ Yes in 3.3 | ❌ No |
#merge |
✅ Yes | ❌ No |
#transform_keys |
❌ No | ❌ No |
#transform_values |
✅ Yes | ❌ No |
#invert |
❌ No | ❌ No |
The rule of thumb in Ruby seems to be that if the method retains some subset of the receiver's keys, then these properties are also copied. Thus #transform_keys and #invert are the odd ones out as in general they do not retain the receiver's keys at all. I think Crystal should follow suit because doing so avoids correctness and performance issues in these methods (see https://github.com/crystal-lang/crystal/issues/8329#issuecomment-564767738):
x = Hash(String, Int32).new.compare_by_identity
x["foo"] = 1
x["fo" + "o"] = 2
x # => {"foo" => 1, "foo" => 2}
y = x.select { true } # => {"foo" => 2}
y.compare_by_identity? # => false
What is even the correct behavior with transform_values? It seems that ruby doesn't actually copy over the default value for transform_values (edit: or for the other ones either, though I'd argue they should stay if the types don't change, like in select and such)
irb(main):001:0> h1 = Hash.new(10)
=> {}
irb(main):002:0> h1[0]
=> 10
irb(main):003:0> h2 = h1.transform_values {|v| v + 1}
=> {}
irb(main):004:0> h2[0]
=> nil
irb(main):005:0> RUBY_VERSION
=> "3.2.2"
IMO that seems to make sense to me, since it seems like it's easily ambiguous if transform_values should apply to the default value.
h = Hash(Int32, Int32).new(10)
h.transform_values {|v| v + 1}
h[0] #=> 10 || 11?
But transform_values can also change the value type, which could be convertible, though might be nonsensical.
h = Hash(Int32, Int32).new(10)
h.transform_values {|v| v.to_s}
h[0] #=> "10"
I've also been running into issues with typing the default value if merging a hash with a different value type, but I think that's fixable, since the new value type is a superset of the old default value, so it's just a matter of convincing the compiler that it all works out.
I just ran what ruby does regarding default values for these and this is what I got
| Method | Ruby | Crystal |
|---|---|---|
| #dup | ✅ Yes | ✅ Yes |
| #clone | ✅ Yes | ✅ Yes |
| #select | ❌ No | ❌ No |
| #slice | ❌ No | ❌ No (blockless #select) |
| #reject | ❌ No | ❌ No |
| #except | ❌ No | ❌ No (blockless #reject) |
| #compact | ✅ Yes in 3.3 | ❌ No |
| #merge | ✅ Yes | ❌ No |
| #transform_keys | ❌ No | ❌ No |
| #transform_values | ❌ No | ❌ No |
| #invert | ❌ No | ❌ No |
This actually affects Set too, e.g. Set#| should more or less behave like Hash#merge in terms of @compare_by_identity. (Set doesn't have a default proc.)
Could you please clarify what is the conclusion of the discussion and expected behaviour in Crystal? Are the methods mentioned above supposed to behave similarly to the Ruby counterparts?