`StringPool` as a collection

Open HertzDevil opened this issue 2 years ago • 3 comments

StringPool is often used to reduce string memory consumption when one has a byte buffer to begin with, but conceptually it is quite similar to a Set(String). That means operations on collections like #count and #join should also make sense on a StringPool. So I wonder if it would be a good idea for StringPool to include Enumerable:

class StringPool
  include Enumerable(String)

  def each(& : String ->) : Nil
    @capacity.times do |i|
      unless @hashes[i] == 0
        yield @values[i]
      end
    end
  end

  # array-like literal support
  def <<(str : String | Bytes | IO::Memory) : self
    get(str)
    self
  end
end

StringPool{"foo", "bar", "baz"}.count(&.starts_with?("ba")) # => 2
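For context, the analogy with Set(String) can be sketched using only what the standard library already offers. This is a minimal illustration, not part of the proposal: the variable names are made up for the example, and it shows only that a pool deduplicates byte buffers while a Set already supports the collection methods mentioned above.

```crystal
require "string_pool"

# Deduplication: two equal byte buffers yield the very same String object.
pool = StringPool.new
a = pool.get("foo".to_slice)
b = pool.get("foo".to_slice)
a.same?(b) # => true
pool.size  # => 1

# The collection operations in question, as they already work on a Set.
set = Set{"foo", "bar", "baz"}
set.count(&.starts_with?("ba")) # => 2
set.join(", ")                  # => "foo, bar, baz"
```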

HertzDevil avatar Apr 18 '24 14:04 HertzDevil

What would be a use case for this though? I don't think it's really necessary for the purpose of deduplication.

straight-shoota avatar Apr 18 '24 14:04 straight-shoota

With #get?, it would be useful to create a StringPool from a set of known keys, and being able to do so in idiomatic Crystal sounds good.

known_keys = StringPool{"foo", "bar"}
known_keys << "baz"

ysbaddaden avatar Apr 19 '24 08:04 ysbaddaden

I don't think that depends on #get?; it works directly with #get (otherwise nothing gets inserted), and it doesn't even require StringPool to be an Enumerable, as long as a compatible #<< exists.
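A minimal sketch of that point, assuming the standard library class is reopened only to add a String-accepting #<< (that method is the proposal from this thread, not an existing StringPool API). The array-like literal needs nothing but an argless constructor and #<<, so no Enumerable is involved:

```crystal
require "string_pool"

# Assumption: this #<< is the addition proposed in this issue,
# reduced to the String overload for brevity.
class StringPool
  def <<(str : String) : self
    get(str) # inserts the string if it is not pooled yet
    self
  end
end

# The literal expands to StringPool.new followed by one #<< per element.
known_keys = StringPool{"foo", "bar"}
known_keys << "baz"

known_keys.size        # => 3
known_keys.get?("foo") # => "foo"
known_keys.get?("qux") # => nil, since #get? never inserts
```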

HertzDevil avatar Apr 20 '24 15:04 HertzDevil