mojo
mojo copied to clipboard
[stdlib] Refactor slice to use Optionals and fix negative step slices
- Use
Optional
forSlice
start
andend
- Fix negative step slice indexing
- Fix incorrect String unit tests
- Fix
_StridedRange
length with mismatched start and end indices
Fixes #2046 Fixes #1944 Fixes #2142
Could these tests using out of bounds values be added in test_string.mojo
? There should probably be similar tests in test_list.mojo
. The new Slice.adjust()
itself has no tests yet.
assert_equal("", str[-50::-1])
assert_equal("Hello Mojo!!", str[-50::]) # fails
assert_equal("!!ojoM olleH", str[:-50:-1]) # fails
assert_equal("Hello Mojo!!", str[:50:]) # fails
assert_equal("H", str[::50])
assert_equal("!", str[::-50])
assert_equal("!", str[50::-50]) # fails
assert_equal("H", str[-50::50]) # fails
Could these tests using out of bounds values be added in
test_string.mojo
? There should probably be similar tests intest_list.mojo
. The newSlice.adjust()
itself has no tests yet.assert_equal("", str[-50::-1]) assert_equal("Hello Mojo!!", str[-50::]) # fails assert_equal("!!ojoM olleH", str[:-50:-1]) # fails assert_equal("Hello Mojo!!", str[:50:]) # fails assert_equal("H", str[::50]) assert_equal("!", str[::-50]) assert_equal("!", str[50::-50]) # fails assert_equal("H", str[-50::50]) # fails
@mikowals I actually wasn't aware Python had this behaviour, thank you kindly for the review! I've made the necessary changes so the previously failing cases now pass.
Thanks for taking this on @bgreni! It's awesome to see us using types like Optional
to make APIs like this more precise and ergonomic 😄
@bgreni A lot has changed over the past two weeks, could you please rebase this and ensure it passes CI?
@bgreni A lot has changed over the past two weeks, could you please rebase this and ensure it passes CI?
Sorry I figured the imported-internally
tag meant it was out of my hands. I just had to fix fn __getitem__(self, slice: Slice)
for Span
since it was recently introduced but otherwise all good!
!sync
!sync
I have been wrestling with CI about this, and discovered that internally we depend on incorrect behavior in multiple places. This is unfortunate, but on the way I also learned that Python's slice
has an indices
method, which is similar to your _adjust
method; at the same time, Python does not have a __len__
for slices, and I think this makes sense given the ambiguity. I will remove the __len__
method in favor of a new unsafe_indices
. Could you please open a separate PR just for a new Slice.indices
method? Once these both land, we can circle back to the original patch.
@bgreni I removed Slice.__len__
so you can rebase and get those changes after the next nightly. Let me know if you can implement Slice.indices
.
@bgreni I removed
Slice.__len__
so you can rebase and get those changes after the next nightly. Let me know if you can implementSlice.indices
.
Yeah I'll get on that today thanks!
@laszlokindrat Just to clarify is the intent to consolidate the current broken behaviour into a Slice.indices
method that returns a Range
or something with the adjusted start, end, and step values? Then afterwards we return here and integrate those two changes with the corrected behaviour present in this patch?
@laszlokindrat Just to clarify is the intent to consolidate the current broken behaviour into a
Slice.indices
method that returns aRange
or something with the adjusted start, end, and step values? Then afterwards we return here and integrate those two changes with the corrected behaviour present in this patch?
You raise a good point. Since the current slice implementation cannot provide a correct Slice.indices
method, you can just do it as part of this PR. Please make sure you mention it in the changelog, though! BTW the python API returns a tuple of three ints IIRC. Thanks again for working on this, this will be a huge improvement!
@laszlokindrat In working on that I realized slicing a Span
is sort of ill-defined at the moment. Since it can't copy the data, negative and > 1 step values don't work as expected. I think it makes sense for Span
to be sliceable though and I have an idea to give Span
an internal Slice
member that could be used in the indexing behaviour to fix this.
Are you ok with leaving the current behaviour for now and I can fix that after this lands in a following patch?
@laszlokindrat I've updated everything to use the indices()
semantics and I'll rebase on your changes when nightly time hits.
!sync
@laszlokindrat Do we have some way to re-invoke the tests when they fail due to current flakey ones?
@laszlokindrat Do we have some way to re-invoke the tests when they fail due to current flakey ones?
Not sure, maybe you can add an empty commit? We squash and merge internally, so the commit history on your branch is not important.
I attempted to marshal this through again, but it seems our internal uses of slice deeply depend on the current (admittedly incorrect) behavior. My suggestion for a plan forward is the following:
- Introduce a new, corrected slice type with a new name, e.g.
SliceNew
and functions likeslice_new
. - I will internally migrate uses of
Slice
that I can to this new type. - Rename the existing
Slice
toSliceOld
and the newSliceNew
toSlice
(and similarly forslice_old
/slice_new
). - Mark
SliceOld
andslice_old
deprecated.
WDYT?
@laszlokindrat Could be a little confusing for nightly users, but if resolving the internal issues is highly burdensome then that would probably be the best way forward?
@laszlokindrat Could be a little confusing for nightly users, but if resolving the internal issues is highly burdensome then that would probably be the best way forward?
+1 to @laszlokindrat's approach. It's the best approach forward for us internally, as we have a good amount of misuse of it across a variety of teams in the codebase. It'd be good for us to do the incremental approach. In general, it becomes increasingly more complex for our small standard library team to fix the whole company's misuse of certain types and features. Instead, we need to build safe alternatives and incrementally migrate them at times — such as the case here.
Right thanks Joe I understand now. Do want to add an overload __getitem__
for the types in this pr that accepts SliceNew
so nightly users can get the correct behaviour even if they can't use the shorthand syntax for it yet?
Right thanks Joe I understand now. Do want to add an overload
__getitem__
for the types in this pr that acceptsSliceNew
so nightly users can get the correct behaviour even if they can't use the shorthand syntax for it yet?
Maybe, my main ask is that we do this incrementally, since it's clearly gonna be a non-trivial process. So let's just start with a new type + unit tests, then maybe some helpers and phase it in to builtins, and so on. It might also make sense to add implicit conversion between the new and old to help ease the transition.
@laszlokindrat I've opened a new PR with a subset of these changes using the name SliceNew
#2894. I'll close this one but we can always refer back to it when the time comes for additional changes to be introduced.
✅🟣 This contribution has been merged 🟣✅
Your pull request has been merged to the internal upstream Mojo sources. It will be reflected here in the Mojo repository on the nightly branch during the next Mojo nightly release, typically within the next 24-48 hours.
We use Copybara to merge external contributions, click here to learn more.
@bgreni Some heroics were pulled internally, and we were able to land this patch more or less as is. I will clean up SliceNew
and we won't have to phase this in after all. Thanks for being patient with this!
@laszlokindrat Oh wow how exciting! Thank you for all your work on this!
Landed in 3cedf4d4fb5d96df1d61ef7907acf3239059c972! Thank you for your contribution 🎉