zarr-python
zarr-python copied to clipboard
[v3] `LocalStore.get` clarification
in v3, the following function gets bytes for the LocalStore.get
method:
def _get(path: Path, byte_range: Optional[Tuple[int, Optional[int]]] = None) -> bytes:
if byte_range is not None:
start = byte_range[0]
end = (start + byte_range[1]) if byte_range[1] is not None else None
else:
return path.read_bytes()
with path.open("rb") as f:
size = f.seek(0, io.SEEK_END)
if start is not None:
if start >= 0:
f.seek(start)
else:
f.seek(max(0, size + start))
if end is not None:
if end < 0:
end = size + end
return f.read(end - f.tell())
return f.read()
I have a few questions about this function that would help me write tests for it. I think @jhamman @normanrz are the main people I need answers from, but I'd be curious to hear if anyone else has thoughts.
- the annotation of the function signature is
Optional[Tuple[int, Optional[int]]]
, but the body of the function can handletuple[None, int]
. Should we change annotation of the function signature totuple[int | None, int | None] | None
? - I was initially confused about the semantics of
byte_range
-- I thought it defined an interval as a start and a stop, but now I understand thatbyte_range
defines a start and a step size. This would be more transparent if we used keyword arguments likestart
andstep
instead of a tuple. Would switching to keyword arguments be a problem for this API? I think the win in clarity might be worth it.
actually, the second conditional block can handle byte_range[0]
being None
, but this line
end = (start + byte_range[1]) if byte_range[1] is not None else None
from the first conditional block errors if byte_range[0]
is None
, due to adding an int to None
.
- the annotation of the function signature is
Optional[Tuple[int, Optional[int]]]
, but the body of the function can handletuple[None, int]
. Should we change annotation of the function signature totuple[int | None, int | None] | None
?
The current API is:
class ByteRange(NamedTuple):
start: int # Can be negative for indexing from the end
end: Optional[int] = None
def _get(path: Path, byte_range: Optional[ByteRange] = None) -> bytes:
Not sure why we would allow start == None
. Would that be equivalent to start == 0
?
- I was initially confused about the semantics of
byte_range
-- I thought it defined an interval as a start and a stop, but now I understand thatbyte_range
defines a start and a step size. This would be more transparent if we used keyword arguments likestart
andstep
instead of a tuple. Would switching to keyword arguments be a problem for this API? I think the win in clarity might be worth it.
I don't think the byte range defines a step size. Where are you getting that from?
I don't think the byte range defines a step size. Where are you getting that from?
That was my impression from the implementation
end = (start + byte_range[1]) if byte_range[1] is not None else None
Ah, I would call that length
or count
. Anyways, seems you're right. I don't think I ever supplied the second tuple component in zarrita 🤷