bug icon indicating copy to clipboard operation
bug copied to clipboard

Seq(1).indexOfSlice(Nil, -2)==-2

Open noresttherein opened this issue 2 years ago • 17 comments

Reproduction steps

Scala version: 2.13.10

Expected: 0, same as in Vector.

What's worse, Vector(1).indexOfSlice(List(1), -2) == -2, too (substitute any Seq for Vector).

noresttherein avatar Apr 30 '23 01:04 noresttherein

@scala/collections

SethTisue avatar Apr 30 '23 02:04 SethTisue

That is actually consistent with the precise or precious Scaladoc.

the first index >= from such that the elements of this sequence starting at this index match the elements of sequence that, or -1 if no such subsequence exists.

som-snytt avatar Apr 30 '23 03:04 som-snytt

The PR trivially adjusts the malicious from. It would be nice if we had some sort of tool to verify that the collections satisfy a set of "laws" of some kind. In lieu of that, I augmented the unit test to exercise both paths (sizes known and unknown).

som-snytt avatar Apr 30 '23 05:04 som-snytt

The PR trivially adjusts the malicious from. It would be nice if we had some sort of tool to verify that the collections satisfy a set of "laws" of some kind. In lieu of that, I augmented the unit test to exercise both paths (sizes known and unknown).

Will it work for Seq(1).indexOfSlice(Nil, 1), which currently returns -1 (but, correctly, 1 for Vector)?

noresttherein avatar Apr 30 '23 06:04 noresttherein

@noresttherein see the PR comment, I went the other way, but I'm not religious about it.

som-snytt avatar Apr 30 '23 07:04 som-snytt

It would be nice if we had some sort of tool to verify that the collections satisfy a set of "laws" of some kind

we do, scala-collection-laws: https://github.com/scala/scala-collection-laws

though the fact that they aren't in-repo leads to neglect

SethTisue avatar Apr 30 '23 14:04 SethTisue

Thanks, Seth, for not calling out my sarcasm. I have a fork and will try again to make headway with it. The sarcasm was really my choice between sarcasm and self-deprecation, much like choosing whether to allow indexOfSlice past the last index, there is a kind of arbitrary equivalence that merely shifts the burden of awareness.

som-snytt avatar Apr 30 '23 16:04 som-snytt

I think it is still wrong: Seq(1).indexOfSlice(Nil, 1) should probably return 1. It does for List.

     if (from0 > l) -1 //non strict
     else if (tl < 1) from0

noresttherein avatar May 01 '23 03:05 noresttherein

List is the default Seq. Did you mean Vector currently returns 1? My comment was that I changed that to work the same as List, but I'm open to whatever the community consensus is. I spent a few minutes with collections-laws, which I now refer to as claws; whatever is decided should be encoded/enshrined there.

scala> Seq(1).indexOfSlice(Nil, 1)
val res0: Int = -1

scala> List(1).indexOfSlice(Nil, 1)
val res1: Int = -1

Edit: an argument in favor of returning the last index plus one is that it is consistent with patch, which is permissive. Also lastIndexOfSlice and startsWith.

I thought indexOfSlice should return legal indices (like indexOf), something I can xs(i), but that's only true for existing elements. So probably I'll flip it to your way of thinking and see how it flies.

Welcome to Scala 2.13.10 (OpenJDK 64-Bit Server VM, Java 19).
Type in expressions for evaluation. Or try :help.

scala> val xs = (1 to 10).toList
val xs: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

scala> xs.indexOfSlice(Nil, 10)
val res0: Int = -1

scala> xs.patch(10, List(42), 100)
val res1: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 42)

scala> xs.indices
val res2: scala.collection.immutable.Range = Range 0 until 10

scala> xs.lastIndexOfSlice(Nil, 10)
val res3: Int = 10

scala> xs.startsWith(Nil, 10)
val res4: Boolean = true

som-snytt avatar May 01 '23 06:05 som-snytt

I expect xs.indexOfSlice(Nil, i) to return i for 0 <= i <= xs.size. In particular, definitely i and not -1 when i == xs.size.

More generally, I guess a good "law" would be that xs.indexOfSlice(ys, from) returns some idx >= 0 iff idx is the smallest value such that idx >= from and xs.slice(idx, idx + ys.size) == ys (and -1 if no such idx exists).

sjrd avatar May 01 '23 08:05 sjrd

I find it confusing for an indexOfSomething method to return an invalid positive index, but there can be arguments either way. Ultimately it's a matter of agreeing on a specification. Related to https://github.com/scala/bug/issues/12795#issuecomment-1567878466.

lrytz avatar Jun 02 '23 09:06 lrytz

But for slice, xs.size is not an invalid index. xs.slice(xs.size, xs.size) is perfectly valid. (and making it invalid for the sake of it is actively harmful in practice).

Another way to look at it is that indices never point to elements. Only ranges of indices point to elements. When we say apply(i), we are truly asking for the one element in the range [i, i+1). And that is invalid because i+1 is invalid when i == xs.size. But the range [i, i) is a valid empty range as long as 0 <= i <= xs.size.

sjrd avatar Jun 02 '23 12:06 sjrd

One could argue it's one thing for an operation to accept indices outside the defined range, if it can produce a valid result, like xs.slice(xs.size, xs.size) (or xs.slice(xs.size + N, xs.size +N) for that matter - is there anything special about i == xs.size?), but it's another thing for an operation to return an index that's outside the defined range.

But I see your point, Nil.indexOfSlice(Nil) is an example that can probably convince me, we'd have to make that reuturn -1.

lrytz avatar Jun 02 '23 13:06 lrytz

It's common practice in other languages to accept ranges that stop right at the length of the collection, but not go further. Java and C++ have very consistent APIs like that, for example.

sjrd avatar Jun 02 '23 14:06 sjrd

I am persuaded by sjrd's argument about open ranges.

Because I am weak-minded, I am also persuaded by lrytz's question about what makes size + 1 different from size.

My answer to that is:

scala> val xs = 1.to(10).toList
val xs: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

scala> xs.indexOfSlice(Nil, 10)
val res0: Int = 10

scala> xs.indexOfSlice(Nil, 11)
val res1: Int = -1

scala> xs.lastIndexOfSlice(Nil)
val res2: Int = 10

scala> xs.slice(11, 11)
val res3: List[Int] = List()

The permissive inputs at res3 do not imply that res1 should be 11, because res1 must be consistent with res2.

Another permissive input, xs.patch(11, List(42), 100), is likewise not relevant.

As a footnote, there is not a method for xs.slice(5, 6).headOption but there is xs.lift(5).

som-snytt avatar Jun 02 '23 16:06 som-snytt

I think that it is one thing is to be lax about the arguments you accept, and quite another to be lax about the values you return. At least it seemed to me like something universally accepted as the best practice (doesn't mean it's 'right', just that it seems most people would agree). On the other hand, size as indexOfSlice(Nil) feels ok, because it still 'borders the element': in Scala, and all C-derived languages, the 'end' or 'until' index is always exclusive: no matter how restrictive we would want to be, we have to allow it as an end of a range. And so, [size,size) is a valid subrange of [0, size); [size + 1, size +1) isn't. Anyway, indices, pointers and iterators don't really 'point' to elements, they point between them - at least that way of thinking makes sense when you do pointer arithmetic (including iterator.drop).

The problem is, Scala is very inconsistent with it. Especially mutable vs immutable collections: all mutable methods (at least as I recollect) are not permissive, for example Buffer.remove (and you could expect that the same logic applies to it as with slice/drop). Array.copyOf(a, -1) throws a NegativeArraySizeException. copyToArray cannot even decide one way or the other. On the other hand, if patch is permissive, why not updated? Index falls out of range, so you ignore it; a valid result can be produced. It is not, because probably it felt like a bad idea (and I agree). Now, as I stated before, I am not super confident that permissive indices are a good idea in the first place (most languages I had contact with were strict, especially statically typed), or that the way they are implemented cough patch cough is the best, but it's not a surprising decision. However, I do not report these issues simply because I disagree, but because I was truly surprised, despite knowing Scala quite well.

And I haven't even brought up yet how disturbing to me is the complete disregard towards consistency in naming of the arguments ;)

noresttherein avatar Jun 13 '23 16:06 noresttherein

@noresttherein I think the PR follows what you say.

I can't find my comment, but somewhere I distinguished the mutating API as requiring an "existing element", i.e., a good index. That explains why a mutator treats indices differently (from sane immutable API, of course).

I also conjecture that patch should be called patched and patchInPlace should be patch, but the other question is whether they may behave differently (with respect to inputs) based on that distinction.

"I changed patch to patchInPlace and now it throws."

som-snytt avatar Jun 13 '23 17:06 som-snytt