mypy icon indicating copy to clipboard operation
mypy copied to clipboard

Unexpected "error: Slice index must be an integer or None"

Open sametmax opened this issue 8 years ago • 33 comments

Despite having:

    class g():
    ...
        def __getitem__(self, index):
            # type: (Union[int, slice, Callable]) -> Union[IterableWrapper, Any]

Running mypy triggers "error: Slice index must be an integer or None" for g()[lambda x: x > 4:].

sametmax avatar Nov 07 '16 10:11 sametmax

You did not pass a callable to __getitem__. You pass a slice to __getitem__, but this slice is built with a lambda instead of an integer. You do not control the signature of slice.__init__.

elazarg avatar Nov 07 '16 13:11 elazarg

Oh, right, so we need to update the typeshed for slice ? Indeed, I can't imagine other people won't have this problem. Numpy comes to mind, since it uses slices containing tuples to mimic multi-dimensional slicing.

sametmax avatar Nov 07 '16 14:11 sametmax

I don't know. Passing a callable to slice instead of an integer is most likely an error:

elems[:x.count]  # oops, should be x.count()

elazarg avatar Nov 07 '16 14:11 elazarg

Maybe slice() needs to become generic?

--Guido (mobile)

gvanrossum avatar Nov 07 '16 17:11 gvanrossum

Sounds good to me. It might make existing annotation less safe in the short term, though (since slice will mean slice[Any]).

elazarg avatar Nov 07 '16 17:11 elazarg

I think that somebody proposed making slice generic some time a ago, but it was rejected because we didn't have use case.

JukkaL avatar Nov 07 '16 17:11 JukkaL

Found it: https://github.com/python/typing/issues/159

elazarg avatar Nov 07 '16 17:11 elazarg

Perhaps we should have Slice[T], and slice may be an alias for Slice[int].

elazarg avatar Nov 07 '16 17:11 elazarg

That sounds reasonable to me, as potentially 90+% of slice objects have integer indices, and it would also be backward compatible.

JukkaL avatar Nov 07 '16 17:11 JukkaL

@elazarg: I'd go for your solution. Indeed, my need to pass a callable to the slice is a niche, and numpy slicing is not very comon, so this allow the best of both worlds. The reason I allow callables into slices is that I am creating an object you can slice to do the equivalent of itertools.dropwhile and takewhiles as a shorcut.

sametmax avatar Nov 07 '16 18:11 sametmax

(Just as a side note, there's a discussion in python-ideas of using slice[] as a slice constructor: https://mail.python.org/pipermail/python-ideas/2016-November/043630.html)

elazarg avatar Nov 12 '16 19:11 elazarg

I work on several projects in the scientific computing space that use slice objects with non-int arguments.

I don't like implicitly assuming that slice means a slice of integers, because this would be surprising to those of us who really want a generic slice object.

In NumPy, it's natural to pass NumPy integers into a slice, which do not subclass int (e.g., because they have fixed width), but otherwise are duck types that work almost exactly the same. The proper slice type for NumPy arrays would have elements of type SupportsIndex (i.e., values that implement __index__).

In pandas and xarray, slice objects can be used with non-integer values, e.g., for indexing along an axis with string labels. But there are no valid string strides, so instead we accept integer strides with the usual meaning. The fully specified valid slice type for a StringIndex would be something like Slice[str, str, int], where each field is presumed optional. At the very least, we need separate types for start/stop and step.

shoyer avatar Nov 13 '16 01:11 shoyer

So we should probably just bite the bullet and make Slice generic. It doesn't look like it'll be too hard. Elazar, do you want to be point on this?

gvanrossum avatar Nov 13 '16 01:11 gvanrossum

I am not entirely sure what is the decision.

Should we write the type of slice everywhere as Slice[int, int, int]?

elazarg avatar Nov 14 '16 15:11 elazarg

Hopefully it can just be Slice[int]? There shouldn't be too many places where this is needed.

Or do you really want to support Slice[datetime, datetime, timedelta]? (The only example I can think of where the type of the difference isn't the same as the base value type.)

gvanrossum avatar Nov 14 '16 22:11 gvanrossum

@shoyer just gave another example - panda's Slice[str, str, int].

I think it would be nice to have Slice[T1, T2, T3], and Slice[T] as a shorthand for Slice[T, T, int]. I don't know if that's expressible.

elazarg avatar Nov 14 '16 23:11 elazarg

The current rules for generics don't allow that.

gvanrossum avatar Nov 14 '16 23:11 gvanrossum

Meanwhile, there are much more mundane ways to trigger this message:

$ cat slicetest.py
lo, hi = None, 2

print('abc'[lo:hi])

$ python slicetest.py 
ab
$ mypy slicetest.py
slicetest.py:3: error: Slice index must be an integer or None

The message says "... integer or None" but visit_slice_expr seems to allow int only.

dckc avatar Mar 05 '17 21:03 dckc

@dckc You're using --strict-optional right?

gvanrossum avatar Mar 05 '17 21:03 gvanrossum

Yes, I suppose I am.

dckc avatar Mar 05 '17 22:03 dckc

I wanted to add just another usecase of slice having non-int arguments. In pandas you quite usually have time index. So another valid case is having e.g. datetime.date in slice:

In [1]: import datetime

In [2]: import pandas as pd

In [3]: df = pd.DataFrame(
   ...:  data = list(range(10)),
   ...:  index = [datetime.date.today() + datetime.timedelta(days = i) for i in range(10)]
   ...: )

In [4]: df
Out[4]: 
            0
2018-03-08  0
2018-03-09  1
2018-03-10  2
2018-03-11  3
2018-03-12  4
2018-03-13  5
2018-03-14  6
2018-03-15  7
2018-03-16  8
2018-03-17  9

In [5]: df[datetime.date(2018, 3, 12):]
Out[5]: 
            0
2018-03-12  4
2018-03-13  5
2018-03-14  6
2018-03-15  7
2018-03-16  8
2018-03-17  9

Artimi avatar Mar 08 '18 08:03 Artimi

This can also manifest itself in errors message like No overload variant of "slice" matches argument types "str", "str" if you write a slice literals like slice('a', 'b').

shoyer avatar Jan 06 '19 08:01 shoyer

On the off chance that additional non-scientific use cases are required, I have a library that would also like to use slice syntax to represent a timeline:

https://github.com/eblume/hermes/blob/4b42872d1b87216394e30463b3c8946c2f2013c5/hermes/timespan.py#L51-L69

I am admittedly still a mypy newcomer and might be making some basic mistakes there. Notably, mypy typechecks this without error, but fails a typecheck of a call to this slice syntax in another package importing the above linked code.

eblume avatar May 14 '19 21:05 eblume

This actually triggers on something like

from typing import Any
  
x: Any
x['a':'b']

which is clearly a bug.

ilevkivskyi avatar Mar 02 '20 13:03 ilevkivskyi

As @Artimi mentioned, there're many use cases in pandas that a time index is involved. Would everyone be happy if we add the datetime type?

alvis avatar Mar 25 '21 09:03 alvis

datetime seems limited.

A slice is represent a boundary, but boundaries can be established with practically anything.

In fact, I quite like the idea of being able to pass a callable to __getitem__ mentioned by the OP: it lets you say "start to take elements when this condition arise, or stop taking elements when this one arises. You could even make a wrapper object like:


iterable = slicer(generator)
for x in iterable[lambda, lambda]:

It's much more elegant than going for itertools.dropwhile and takewhile.

I think slice() should allow library writers to do such experiments.

ksamuel avatar Mar 28 '21 21:03 ksamuel

So this issue is now starting to crop up on a semi-regular basis in the likes of numpy, scipy and pandas, due to use of numpy's fixed-size integer types (that do not inherit from builtins.int).

It would already be a massive improvement if the rather restrictive None | int requirement for slice could be changed into None | SupportsIndex. Since this type of slice-signature is already supported by the builtin sequence types, this should not lead to any loss in type safety.

In [1]: class Index:
   ...:     def __init__(self, value: int) -> None:
   ...:         self.value = value
   ...:
   ...:     def __index__(self) -> int:
   ...:         return self.value
   ...:

In [3]: a = range(10)
   ...: a[Index(0):Index(5)]
Out[3]: range(0, 5)

In [4]: b = '0123456789'
   ...: b[Index(0):Index(5)]
Out[4]: '01234'

In [5]: c = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
   ...: c[Index(0):Index(5)]
Out[5]: [0, 1, 2, 3, 4]

BvB93 avatar Jun 17 '21 09:06 BvB93

I can take a look at the implications of such a suggested change. I do think that allowing SupportsIndex is a natural thing to do.

emmatyping avatar Jun 17 '21 16:06 emmatyping

python/typing#159 is closely related.

JelleZijlstra avatar Jun 17 '21 17:06 JelleZijlstra

Just # type: ignore[misc]

eggplants avatar Jan 31 '23 04:01 eggplants