mypy does not recognize conversion between fixed length list and tuple

Open boompig opened this issue 4 years ago • 9 comments

  • Are you reporting a bug, or opening a feature request?

I believe this is a bug

  • Please insert below the code you are checking with mypy, or a mock-up repro if the source is private. We would appreciate if you try to simplify your case to a minimal repro.

Minimal example to reproduce:

from typing import List, Tuple


def get_point_as_list() -> List[int]:
    return [1, 2]


def point_to_tuple(pt: List[int]) -> Tuple[int, int]:
    assert len(pt) == 2
    return tuple(pt)


if __name__ == "__main__":
    l = get_point_as_list()
    p = point_to_tuple(l)
    print(repr(p))
  • What is the actual behavior/output?

mypy test.py

test.py:10: error: Incompatible return value type (got "Tuple[int, ...]", expected "Tuple[int, int]")
  • What is the behavior/output you expect?

mypy should evaluate this as correct code

  • What are the versions of mypy and Python you are using? Do you see the same issue after installing mypy from Git master?

$ mypy --version
mypy 0.720

$ python --version
Python 3.7.3

boompig avatar Sep 13 '19 19:09 boompig

This is a design limitation of mypy. Mypy currently doesn't even try to keep track of the lengths of list objects. However, issues similar to this come up every once in a while, and this is something mypy might support at some point in the future.

You can work around this limitation by using a cast or a # type: ignore comment.
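Applied to the example from the original report, the cast workaround might look like the following minimal sketch. The length check remains a runtime assertion only; mypy does not use it to narrow the type.

```python
from typing import List, Tuple, cast


def point_to_tuple(pt: List[int]) -> Tuple[int, int]:
    # mypy does not track list lengths, so this check only runs at runtime.
    assert len(pt) == 2
    # cast() has no runtime effect; it only tells mypy which type to assume.
    return cast(Tuple[int, int], tuple(pt))
```

Replacing the cast with a trailing # type: ignore comment on the return line silences the same error, at the cost of hiding any other error reported on that line.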

JukkaL avatar Sep 16 '19 12:09 JukkaL

Is there an issue for Sequences of fixed length that I can watch for developments? I am also having the same issues and can see a few use cases (e.g., points in Euclidean space, etc). I mostly just use Sequence[<type>, ...] for my own code.

jamesohortle avatar Nov 18 '19 01:11 jamesohortle

We can use this issue to track both fixed-length lists and sequences for now.

JukkaL avatar Nov 18 '19 11:11 JukkaL

I would also appreciate this feature. My use-case is even simpler, converting fixed-length tuples to fixed-length tuples. Specifically I would like this to work:

from typing import Tuple

t: Tuple[int, int] = (0, 0)
t = tuple(x for x in t)

simsa-st avatar Jun 05 '20 12:06 simsa-st

This comes up a lot when you're splitting strings:

    return dict(kv.split('=') for kv in ['a=b', 'cdf=ghi'])

It might be nice to have a special split method on string like split_exact(sep: str, n: int) -> SizedTuple[n, str] that somehow binds the n from the call to the SizedTuple. My proposal is impossible unless the issue of fixed length lists or tuples is solved.

NeilGirdhar avatar Jun 22 '20 22:06 NeilGirdhar

I have run into this as well and wanted to summarize my understanding of the three possible workarounds based on the comments from @JukkaL, @simsa-st, and @NeilGirdhar above.

Rewrite code to avoid lists and generators

A workaround that can be used even in the absence of SizedTuple is to manually write a fixed-size version of the size-naive function you were previously calling. For example, to handle @NeilGirdhar's string splitting use case:

from typing import Dict, Tuple

def split_2(s: str, on: str) -> Tuple[str, str]:
    pieces = s.split(on)
    return (pieces[0], pieces[1])

def h() -> Dict[str, str]:
    return dict(split_2(kv, '=') for kv in ['a=b', 'cdf=ghi'])

One caveat here is that the return value of str.split() is a plain list whose length mypy does not know, so this will not catch the IndexError raised when the string you are splitting does not contain any = at all. I don't think this kind of IndexError is possible to catch before runtime, though.
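If the runtime failure should at least be explicit, one option is to validate the length before indexing. This is only a sketch: split_2_checked is a hypothetical name, and unlike split_2 above it splits at most once, so any further separators stay in the second element.

```python
from typing import Tuple


def split_2_checked(s: str, on: str) -> Tuple[str, str]:
    # Hypothetical helper, not part of any library.
    pieces = s.split(on, 1)  # split at most once, so at most two pieces
    if len(pieces) != 2:
        # Raise a descriptive error instead of a bare IndexError.
        raise ValueError(f"expected separator {on!r} in {s!r}")
    return (pieces[0], pieces[1])
```

The return expression is still a two-element tuple literal, so mypy can infer the fixed-length type without a cast.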

Similarly when sorting or doubling every element in a tuple, you can do:

from typing import Tuple, TypeVar

T = TypeVar('T', str, int, float)

def sort_2(x: Tuple[T, T]) -> Tuple[T, T]:
    if x[0] <= x[1]:
        return x
    return (x[1], x[0])

def double_2(x: Tuple[T, T]) -> Tuple[T, T]:
    # return tuple(y * 2 for y in x)  # a generator loses the length in mypy
    return (x[0] * 2, x[1] * 2)

def g(x: Tuple[str, str]) -> Tuple[str, str]:
    return sort_2(double_2(x))

In this example sort_2() avoids creating a list, unlike the builtin sorted(), and double_2() avoids using a generator (which I think is where @simsa-st's example loses the size information).

In summary, neither lists nor generators have lengths in mypy, but it is often possible to rewrite code so that it avoids using lists or generators and uses only tuples (which can have lengths in mypy).
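The same idea can be applied to the example from the original report. The sketch below assumes a hypothetical generic helper named as_pair (the type variable name Item is also made up); it checks the length at runtime and builds the result as a tuple literal, which mypy can type as a fixed-length tuple.

```python
from typing import List, Sequence, Tuple, TypeVar

Item = TypeVar("Item")


def as_pair(seq: Sequence[Item]) -> Tuple[Item, Item]:
    # Hypothetical helper: the length is verified at runtime, not by mypy.
    if len(seq) != 2:
        raise ValueError(f"expected 2 items, got {len(seq)}")
    # A tuple literal with two elements is inferred as Tuple[Item, Item].
    return (seq[0], seq[1])


def point_to_tuple(pt: List[int]) -> Tuple[int, int]:
    return as_pair(pt)
```

One such helper is needed per length, which is the main cost of this approach.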

Ignore the mypy error via comment or cast

An alternative perspective is that sizing is "too much work" or "too unpythonic" to implement this way (we do love generator expressions after all), and so instead of the previous example, we can write (based on @JukkaL's suggestion):

def g_ignore(x: Tuple[str, str]) -> Tuple[str, str]:
    return tuple(sorted(y * 2 for y in x))  # type: ignore

or

from typing import Tuple, cast

def g_ignore(x: Tuple[str, str]) -> Tuple[str, str]:
    return cast(Tuple[str, str], tuple(sorted(y * 2 for y in x)))

These still gain the "benefits" of a sized return type, i.e.,

def f() -> str:
    return g_ignore(('t', 's'))[2]

still causes mypy to report error: Tuple index out of range, as expected.

My preliminary understanding is that the main drawback of ignore/cast is that each one you write creates the possibility of false negatives occurring in the future whenever the surrounding code is modified.

Accept the unsized tuple type

If you don't care about catching the index out of range error, you can write:

def g_nosize(x: Tuple[str, ...]) -> Tuple[str, ...]:
    return tuple(sorted(y * 2 for y in x))

which passes mypy but fails to catch the index out of range error in f().

This workaround does not apply to @NeilGirdhar's use case, because dict() needs every item of its input to be a key/value pair of exactly two elements, which an unsized type cannot guarantee.

thomasgilgenast avatar Jul 05 '20 22:07 thomasgilgenast

@thomasgilgenast That's a good summary of what we can do right now, but we don't want to work around this problem. We want mypy to keep track of sequence lengths. We want:

x: Sequence[int]
if len(x) == 2:
    y: Tuple[int, int] = tuple(x)

We want this to happen without any casting or ignoring or auxiliary functions.

NeilGirdhar avatar Jul 05 '20 23:07 NeilGirdhar

Quoting @NeilGirdhar: This comes up a lot when you're splitting strings:

    return dict(kv.split('=') for kv in ['a=b', 'cdf=ghi'])

Yes, I found this thread after having attempted to simplify

    return {k: v for k, v in (var.split('=', maxsplit=1) for var in env)}

into

    return dict(var.split('=', maxsplit=1) for var in env)

and then saw the Mypy complaints.

So yet another workaround that passes at least Mypy 0.720 checks is to go via a dictionary comprehension with tuple unpacking like the original form above. That will instead trigger a Pylint warning:

R1721: Unnecessary use of a comprehension (unnecessary-comprehension)

so Pylint users do not gain much either way.
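A further variant, sketched here under the assumption that a missing = may be treated as an empty value, is to use str.partition() instead of str.split(). partition() always returns a three-element tuple, and typeshed annotates it that way, so the unpacking has a fixed length as far as mypy is concerned. The name parse_env is hypothetical. Because the comprehension discards the separator it is no longer a plain copy of its input, which may also avoid the Pylint warning, though that is an assumption rather than something verified.

```python
from typing import Dict, Iterable


def parse_env(env: Iterable[str]) -> Dict[str, str]:
    # Hypothetical helper. str.partition() returns (head, separator, tail),
    # splitting on the first "=" only.
    # Note: an entry without "=" maps to an empty string instead of raising.
    return {
        key: value
        for key, _sep, value in (var.partition("=") for var in env)
    }
```

Unlike the split-based versions, this never raises on malformed entries, so it only fits when silently accepting them is acceptable.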

dandersson avatar Oct 12 '20 15:10 dandersson

Another similar nice thing to catch would be

def f(c: tuple[list[int], list[int], list[int]]):
    ...

f(tuple([1, 2] for _ in range(3)))

recognising that comprehensions into tuple over range(n) have length n.

TomFryers avatar Sep 08 '22 21:09 TomFryers