dask
                                
                                 dask copied to clipboard
                                
                                    dask copied to clipboard
                            
                            
                            
                        Type hints for read_bytes
- [x] Passes pre-commit run --all-files
CI failure unrelated
@ian-r-rose thanks for taking a look!
Sorry for the delay in getting back to this @douglasdavis!
Unfortunately, I think you still need to give the overloads defaults in order for type inference to work. For example, on this branch the following fails to type check:
from dask.bytes import read_bytes
reveal_type(read_bytes("path/to/stuff", include_path=True))
Dealing with literals in default keyword-arguments has been a persistently annoying thing in mypy (see the long discussion in https://github.com/python/mypy/issues/6580), but I do think it's doable using the special ellipsis symbol. It's not very well documented, but see here.
So I was able to get it working with this diff:
diff --git a/dask/bytes/core.py b/dask/bytes/core.py
index 0164d024d..8b783cba6 100644
--- a/dask/bytes/core.py
+++ b/dask/bytes/core.py
@@ -15,28 +15,28 @@ from dask.utils import is_integer, parse_bytes
 @overload
 def read_bytes(
     urlpath: str | list[str],
-    delimiter: bytes | None,
-    not_zero: bool,
-    blocksize: str | int | None,
-    sample: int | str | bool,
-    compression: str | None,
-    include_path: Literal[True],
+    delimiter: bytes | None = None,
+    not_zero: bool = False,
+    blocksize: str | int | None = None,
+    sample: int | str | bool = "10 kiB",
+    compression: str | None = None,
+    include_path: Literal[False] = False,
     **kwargs: Any,
-) -> tuple[bytes, list[list[Delayed]], list[str]]:
+) -> tuple[bytes, list[list[Delayed]]]:
     ...
 
 
 @overload
 def read_bytes(
     urlpath: str | list[str],
-    delimiter: bytes | None,
-    not_zero: bool,
-    blocksize: str | int | None,
-    sample: int | str | bool,
-    compression: str | None,
-    include_path: Literal[False],
+    delimiter: bytes | None = None,
+    not_zero: bool = False,
+    blocksize: str | int | None = None,
+    sample: int | str | bool = "10 kiB",
+    compression: str | None = None,
+    include_path: Literal[True] = ...,
     **kwargs: Any,
-) -> tuple[bytes, list[list[Delayed]]]:
+) -> tuple[bytes, list[list[Delayed]], list[str]]:
     ...
 
Thanks, @ian-r-rose! This does some seem like quite the tough case for mypy 😛 with your diff I'm seeing this error from mypy (version: mypy 0.971 (compiled: yes)) if I run mypy dask/bytes/core.py:
dask/bytes/core.py:16: error: Overloaded function signatures 1 and 2 overlap with incompatible return types  [misc]
but the example code snippet you wrote does indeed work. I'm wondering how to make the mypy check on the dask codebase work. At this point I'd be OK with having an Any return typehint if this is getting too in the weeds