PyArrow OSError: Unexpected end of stream

Open LucaMarconato opened this issue 1 month ago • 2 comments

Since the release of pyarrow 22.0.0, the CI has been failing with errors similar to the one below. Note that the error does not appear to be spatialdata-related. Full CI run here.

tests/test_xenium.py:213: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
src/spatialdata_io/_utils.py:59: in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
src/spatialdata_io/readers/xenium.py:192: in xenium
    return_values = _get_tables_and_circles(path, cells_as_circles, specs, gex_only)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
src/spatialdata_io/readers/xenium.py:570: in _get_tables_and_circles
    metadata = pd.read_parquet(path / XeniumKeys.CELL_METADATA_FILE)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/opt/hostedtoolcache/Python/3.13.7/x64/lib/python3.13/site-packages/pandas/io/parquet.py:669: in read_parquet
    return impl.read(
/opt/hostedtoolcache/Python/3.13.7/x64/lib/python3.13/site-packages/pandas/io/parquet.py:265: in read
    pa_table = self.api.parquet.read_table(
/opt/hostedtoolcache/Python/3.13.7/x64/lib/python3.13/site-packages/pyarrow/parquet/core.py:1899: in read_table
    return dataset.read(columns=columns, use_threads=use_threads,
/opt/hostedtoolcache/Python/3.13.7/x64/lib/python3.13/site-packages/pyarrow/parquet/core.py:1538: in read
    table = self._dataset.to_table(
pyarrow/_dataset.pyx:589: in pyarrow._dataset.Dataset.to_table
    ???
pyarrow/_dataset.pyx:3969: in pyarrow._dataset.Scanner.to_table
    ???
pyarrow/error.pxi:155: in pyarrow.lib.pyarrow_internal_check_status
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???
E   OSError: Unexpected end of stream

pyarrow/error.pxi:92: OSError
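
One way to narrow down an "Unexpected end of stream" error is to first rule out a truncated file on disk. The following is a minimal, standard-library-only sketch that checks the outer framing of a Parquet file: the `PAR1` magic bytes at both ends and the 4-byte little-endian footer length stored just before the trailing magic. The function name `check_parquet_framing` and its return strings are hypothetical and are not part of pyarrow, pandas, or spatialdata-io. The check says nothing about whether the data pages inside the file decode correctly.

```python
import struct
from pathlib import Path

# Unencrypted Parquet files start and end with these 4 magic bytes.
PARQUET_MAGIC = b"PAR1"


def check_parquet_framing(path):
    """Return a short diagnosis string for the file at `path`.

    Hypothetical helper for triage only: it inspects the leading magic,
    the trailing magic, and the footer length, nothing else.
    """
    size = Path(path).stat().st_size
    # Leading magic (4) + footer length (4) + trailing magic (4).
    if size < 12:
        return "too small to be a Parquet file"
    with open(path, "rb") as fh:
        head = fh.read(4)
        fh.seek(-8, 2)  # 8 bytes before the end of the file
        footer_len_bytes = fh.read(4)
        tail = fh.read(4)
    if head != PARQUET_MAGIC:
        return "missing leading magic bytes"
    if tail != PARQUET_MAGIC:
        return "missing trailing magic bytes (file may be truncated)"
    (footer_len,) = struct.unpack("<I", footer_len_bytes)
    if footer_len + 12 > size:
        return "footer length exceeds file size"
    return "framing looks intact"
```

If this reports intact framing for the file passed to `pd.read_parquet` in the traceback, a truncated download becomes less likely as the cause, and the failure more plausibly arises while the reader decodes the file contents, which would be consistent with the error not being spatialdata-related.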

The pyarrow release also broke some tests in spatialdata, but these have already been fixed in https://github.com/scverse/spatialdata/pull/1003, and a release will be made soon.

LucaMarconato, Oct 27 '25 16:10