crystal icon indicating copy to clipboard operation
crystal copied to clipboard

Propagating `@compare_by_identity` and `@block` in `Hash`'s methods

Open HertzDevil opened this issue 2 years ago • 4 comments

In Ruby, instance methods of Hash that create a new Hash may or may not copy the #default, #default_proc, and #compare_by_identity? properties to that new Hash. Likewise, the same methods in Crystal may copy @compare_by_identity and @block. The two types are not the same:

Method Ruby Crystal
#dup ✅ Yes ✅ Yes
#clone ✅ Yes ✅ Yes
#select ✅ Yes ❌ No
#slice ✅ Yes ❌ No (blockless #select)
#reject ✅ Yes ❌ No
#except ✅ Yes ✅ Yes (blockless #reject)
#compact ✅ Yes in 3.3 ❌ No
#merge ✅ Yes ❌ No
#transform_keys ❌ No ❌ No
#transform_values ✅ Yes ❌ No
#invert ❌ No ❌ No

The rule of thumb in Ruby seems to be that if the method retains some subset of the receiver's keys, then these properties are also copied. Thus #transform_keys and #invert are the odd ones out as in general they do not retain the receiver's keys at all. I think Crystal should follow suit because doing so avoids correctness and performance issues in these methods (see https://github.com/crystal-lang/crystal/issues/8329#issuecomment-564767738):

x = Hash(String, Int32).new.compare_by_identity
x["foo"] = 1
x["fo" + "o"] = 2
x # => {"foo" => 1, "foo" => 2}

y = x.select { true }  # => {"foo" => 2}
y.compare_by_identity? # => false

HertzDevil avatar Jul 01 '23 15:07 HertzDevil

What is even the correct behavior with transform_values? It seems that ruby doesn't actually copy over the default value for transform_values (edit: or for the other ones either, though I'd argue they should stay if the types don't change, like in select and such)

irb(main):001:0> h1 = Hash.new(10)
=> {}
irb(main):002:0> h1[0]
=> 10
irb(main):003:0> h2 = h1.transform_values {|v| v + 1}
=> {}
irb(main):004:0> h2[0]
=> nil
irb(main):005:0> RUBY_VERSION
=> "3.2.2"

IMO that seems to make sense to me, since it seems like it's easily ambiguous if transform_values should apply to the default value.

h = Hash(Int32, Int32).new(10)
h.transform_values {|v| v + 1}
h[0] #=> 10 || 11?

But transform_values can also change the value type, which could be convertible, though might be nonsensical.

h = Hash(Int32, Int32).new(10)
h.transform_values {|v| v.to_s}
h[0] #=> "10"

I've also been running into issues with typing the default value if merging a hash with a different value type, but I think that's fixable, since the new value type is a superset of the old default value, so it's just a matter of convincing the compiler that it all works out.

baseballlover723 avatar Jul 02 '23 00:07 baseballlover723

I just ran what ruby does regarding default values for these and this is what I got

Method Ruby Crystal
#dup ✅ Yes ✅ Yes
#clone ✅ Yes ✅ Yes
#select ❌ No ❌ No
#slice ❌ No ❌ No (blockless #select)
#reject ❌ No ❌ No
#except ❌ No ❌ No (blockless #reject)
#compact ✅ Yes in 3.3 ❌ No
#merge ✅ Yes ❌ No
#transform_keys ❌ No ❌ No
#transform_values ❌ No ❌ No
#invert ❌ No ❌ No

baseballlover723 avatar Jul 02 '23 00:07 baseballlover723

This actually affects Set too, e.g. Set#| should more or less behave like Hash#merge in terms of @compare_by_identity. (Set doesn't have a default proc.)

HertzDevil avatar Jul 04 '23 08:07 HertzDevil

Could you please clarify what is the conclusion of the discussion and expected behaviour in Crystal? Are the methods mentioned above supposed to behave similarly to the Ruby counterparts?

andrykonchin avatar Oct 27 '25 15:10 andrykonchin