crystal icon indicating copy to clipboard operation
crystal copied to clipboard

Allow array destructuring to produce nil values

Open watzon opened this issue 6 years ago • 16 comments

Didn't know how best to phrase this. Currently if you want to destructure an array you can do so like this:

arr = [1, 2, 3]
one, two, three = arr

and it works fine. However, a problem is posed when you want to destructure an array that might be empty, or might have less values than you're trying to assign. This happens a lot with String.split. For instance:

time = "1/2/30"
days, hours, minutes = time.split('/') # This will work

time = "1/2"
days, hours, minutes = time.split('/') # This produces an Overflow Error

Maybe there's something I'm missing?

watzon avatar Nov 04 '19 22:11 watzon

Duplicate of https://github.com/crystal-lang/crystal/issues/2617

asterite avatar Nov 04 '19 23:11 asterite

@watzon a minor correction, it produces an Index out of bounds (IndexError) and not an Overflow Error

bcardiff avatar Nov 05 '19 00:11 bcardiff

We can reopen/discuss in the original issue if neccesary.

RX14 avatar Nov 13 '19 10:11 RX14

I propose this gets re-opened. The original issue got a little off topic with workaround suggestions.

The core issue brought up here is that split cannot be reliably used with multi-assign. I just ran into this myself and trying to determine the cause of the error was difficult.

As it stands, x, y = str.split(“ “) may or may not cause a runtime error.

mgomes avatar Feb 03 '21 00:02 mgomes

I agree. This is something I deal with all the time and it would be great to have a solution.

watzon avatar Feb 03 '21 04:02 watzon

ary = value.split x, y = ary[0], ary[1]?

So this is already possible, just not a one liner. Like most things :-)

asterite avatar Feb 03 '21 10:02 asterite

The one-liner for a split between two components is x, _, y = str.partition(" "). It doesn't even need to allocate an array.

That doesn't work for more than two components, though. Maybe that could be improved. But I think the huge problem with that is that all values become nilable. So you may get a convenient destructuring line, but then you have to deal with nilable types afterwards. And that should be avoided. Especially considering there are means to ensure the arity matches.

straight-shoota avatar Feb 03 '21 11:02 straight-shoota

I think the huge problem with that is that all values become nilable. So you may get a convenient destructuring line, but then you have to deal with nilable types afterwards

That's the whole point of this issue. That's the feature, not the problem.

asterite avatar Feb 03 '21 11:02 asterite

Yes, it's a feature when you can't be sure of the number of elements. But when you've validated that, it's a problem.

It would be really annoying if the destructured variables would be nilable in this example:

time = "1/2/30"
time_components = time.split('/')
raise "Invalid format" unless time_components.size == 3
days, hours, minutes = time_components

straight-shoota avatar Feb 03 '21 11:02 straight-shoota

Maybe I didn't make myself clear, but the idea is not to change how days, hours, minutes = time_components work but, as I proposed in #2617 to be able to translate some to []?.

Your code snippet could be:

time = "1/2/30"
days, hours?, minutes? = time.split('/', 3)
raise "Invalid format" unless hours && minutes

asterite avatar Feb 03 '21 12:02 asterite

I like that proposal @asterite. It still allows for the expressive one-liner but adds some caution for the developer.

I understand the overall desire to avoid introducing nils everywhere, but this kind of usecase is more along the lines of pattern matching. Looking through it from that lens I think makes it ok to introduce nils and forcing the developer to check. This already can return nil for example:

"foo".match(/bar/) # => nil

If this change is something people don't want, though, I would propose perhaps dis-allowing multi-assignment for something like split.

I'm implementing something like the Redis protocol and so I have a lot of strings like "CMD arg1 arg2". It was working great for commands I tested with arguments, but as soon as I introduced the first command without args I got Index out of bounds (IndexError).

mgomes avatar Feb 03 '21 16:02 mgomes

To be honest, I don't think we'll do anything about this. You get an index error whole changing code and that's totally fine.

asterite avatar Feb 03 '21 16:02 asterite

I'm implementing something like the Redis protocol and so I have a lot of strings like "CMD arg1 arg2".

String#split etc. are convenient but not very great for performance-sensitive use cases. For that an efficient custom parser would do much better.

straight-shoota avatar Feb 03 '21 16:02 straight-shoota

With #10410 and #11145, the situation will be as follows:

days, hours, minutes = "1/2/30".split('/')  # okay
days, hours, minutes = "1/2".split('/')     # Unhandled exception: Multiple assignment count mismatch
days, hours, minutes = "1/2/3/4".split('/') # Unhandled exception: Multiple assignment count mismatch

days, *hours_minutes = "1/2/30".split('/')  # okay, hours_minutes == %w(2 30)
days, *hours_minutes = "1/2".split('/')     # okay, hours_minutes == %w(2)
days, *hours_minutes = "1".split('/')       # okay, hours_minutes == [] of String
days, *hours_minutes = "1/2/3/4".split('/') # okay, hours_minutes == %w(2 3 4)

# extra trailing elements are ignored, similar to current behaviour
# (these assignments wouldn't call `#[](3..)` on the intermediate result)
days, hours, minutes, *_ = "1/2/30".split('/')  # okay
days, hours, minutes, *_ = "1/2".split('/')     # Unhandled exception: Multiple assignment count mismatch
days, hours, minutes, *_ = "1/2/3/4".split('/') # okay

HertzDevil avatar Aug 26 '21 23:08 HertzDevil

I faced this problem:

Writing this:

array = [] of String

*first_elements, last_element = array

p first_elements
p last_element

Will raise an exception:

Unhandled exception: Index out of bounds (IndexError)
  from /usr/lib/crystal/indexable.cr:73:20 in '[]'
  from eval:1:5 in '__crystal_main'
  from /usr/lib/crystal/crystal/main.cr:115:5 in 'main_user_code'
  from /usr/lib/crystal/crystal/main.cr:101:7 in 'main'
  from /usr/lib/crystal/crystal/main.cr:127:3 in 'main'
  from /usr/lib/libc.so.6 in '??'
  from /usr/lib/libc.so.6 in '__libc_start_main'
  from /home/crystal/.cache/crystal/crystal-run-eval.tmp in '_start'
  from ???

The expected result is:

[]
nil

The workaround to solve is:

array = [] of String

first_elements = array[0..-2]? # => []
last_element = array[-1]? # => nil

Maybe the splat assignment should handle this case for the user.

stephannv avatar Apr 16 '23 20:04 stephannv

This issue has been mentioned on Crystal Forum. There might be relevant details there:

https://forum.crystal-lang.org/t/multiple-assignment/6943/2

crysbot avatar Jun 18 '24 21:06 crysbot