crystal
crystal copied to clipboard
`Enumerable` method to find the first truthy block result and return that result
Feature Request
Is your feature request related to a problem? Please describe clearly and concisely what is it.
I have a frequent need for something that is similar to Enumerable#find, but rather than returning the element for which the block is truthy, it returns that truthy block result itself.
This comes up when I have an array of objects that may or may not have a property set. The most recent need for this came up when implementing tool use for LLM APIs. The API response has an array of choices, each with a message, and those messages may or may not have a tool_calls array.
record Response, choices : Array(Choice)
record Choice, message : Message
record Message, tool_calls : Array(ToolCall)?
record ToolCall
response = Response.new([Choice.new(Message.new([ToolCall.new]))])
Describe the feature you would like, optionally illustrated by examples, and how it will solve the above problem.
What I've been using in my code to make this more expressive is a monkeypatch on Enumerable called find_first:
if tool_calls = response.choices.find_first &.message.tool_calls
# find the tools for these calls, execute them, and send the results back to the API
end
Describe considered alternative solutions, and the reasons why you have not proposed them as a solution here.
Currently, as far as I can tell, at least, the only ways to do this involve either executing that same criteria again:
if choice_with_tool_call = response.choices.find &.message.tool_calls
# the method chain for choice.message.tool_calls has to be repeated
tool_calls = choice_with_tool_call
.message
.tool_calls
end
if tool_calls
# find the tools for these calls, execute them, and send the results back to the API
end
Or this format with the block expanded
# or this
response.choices.each do |choice|
if tool_calls = choice.message.tool_calls
# find the tools for these calls, execute them, and send the results back to the API
break
end
end
These are both a lot more verbose, especially when there are multiple layers of these arrays. Having a method that returns the truthy value lets us use the shorthand block syntax and have a single expression that does all of these things smoothly.
Does it break backward compatibility, if yes then what's the migration path?
Nope. Purely additive.
To be clear, I'm not attached to the find_first name. It's the name I've been using only because I haven't come up with anything better.
response = Response.new([Choice.new(Message.new(nil)), Choice.new(Message.new([ToolCall.new, ToolCall.new]))])
if tool_calls = response.choices.each { |c| tc = c.message.tool_calls; break tc if tc }
pp tool_calls # => [ToolCall(), ToolCall()]
end
This is best I could come up with without introducing something new. As reference only.
It is still slightly clunky, but at least it doesn't involve break:
[1, 2, 3].each.map { |z| evaluate(z) }.find &.itself
If it is fine to find the first not-nil-value, then
[1, 2, 3].each.compact_map { |z| evaluate(z) }.first
is even better
Ruby doesn't have it, but in Elixir it is called Enum.find_value/3
Sounds good to me. I'd prefer the name find_value over find_first.
In this comment, @Sija brought up an interesting point I hadn't considered. I normally use this for nil checks (and even then only use shorthand blocks), but are there any use cases for false triggering an early return?
Part of me wants to push back against it because I don't know of any other Enumerable methods where that is the case, but also I don't have any stake at all in how false is treated, so I thought I'd bring it here for discussion.
Elixir's find_value skips over both falsy values: nil and false, just FTR.
I would argue this method should have the same semantics as find wrt nil vs falsey.
I came here to share this observation, but @yxhuvud already did:
Yes, this is the same as:
[1, 2, 3].each.map { |x| evaluate(x) }.find &.itself
~~But the issue is that map materializes the whole array! If it didn't, then we could push back against adding this method.~~
And it is made even more clunky by the fact that .each. is necessary there. If you forget to add it then map would materialize the whole array.
Also fun fact:
These methods map and to_a are exact aliases of each other
- https://github.com/crystal-lang/crystal/blob/4aac6f2ee3494a593d79eeea389c77037e025adc/src/enumerable.cr#L2007
- https://github.com/crystal-lang/crystal/blob/4aac6f2ee3494a593d79eeea389c77037e025adc/src/enumerable.cr#L1016
(which makes me wish that map was lazy even more)
Not only is the method chain clunkier, but since it also involves a heap allocation (Iterators have state), it's also slower:
With tool calls
find_value 2.09M (477.64ns) (± 1.45%) 0.0B/op fastest
each.compact_map(&).first? 134.17k ( 7.45µs) (± 0.85%) 31.2kB/op 15.60× slower
each.map(&).find(&.itself) 128.57k ( 7.78µs) (± 1.45%) 31.2kB/op 16.28× slower
Without tool calls
find_value 1.57M (638.94ns) (± 5.47%) 0.0B/op fastest
each.compact_map(&).first? 132.08k ( 7.57µs) (± 1.62%) 31.2kB/op 11.85× slower
each.map(&).find(&.itself) 111.00k ( 9.01µs) (± 0.92%) 31.2kB/op 14.10× slower
Benchmark code
require "benchmark"
response_with_tool_calls = Response.new([Choice.new(Message.new([ToolCall.new]))])
response_without_tool_calls = Response.new([Choice.new(Message.new(nil))])
# Need to run the block multiple times because each iteration for find_value
# takes less time than the monotonic clock can measure
iterations = 1_000
[response_with_tool_calls, response_without_tool_calls].each do |response|
tool_calls = response.choices.find_value &.message.tool_calls
puts "#{tool_calls ? "With" : "Without"} tool calls"
Benchmark.ips do |x|
x.report "find_value" { iterations.times { tool_calls = response.choices.find_value &.message.tool_calls } }
x.report "each.compact_map(&).first?" { iterations.times { tool_calls = response.choices.each.compact_map(&.message.tool_calls).first? } }
x.report "each.map(&).find(&.itself)" { iterations.times { tool_calls = response.choices.each.map(&.message.tool_calls).find &.itself } }
end
puts
# Side effect so LLVM won't optimize out the benchmark
pp tool_calls unless tool_calls
end
record Response, choices : Array(Choice)
record Choice, message : Message
record Message, tool_calls : Array(ToolCall)?
record ToolCall
module Enumerable(T)
# Same implementation as https://github.com/crystal-lang/crystal/pull/14893
def find_value(if_none = nil, & : T ->)
each do |i|
if result = yield i
return result
end
end
if_none
end
end