Results: 181 issues by Tom Augspurger

The CI for raft-dask includes warnings about unregistered pytest markers: https://github.com/rapidsai/raft/actions/runs/13394517078/job/37411078661#step:10:2656 ``` ../../../pyenv/versions/3.12.9/lib/python3.12/importlib/__init__.py:90 /pyenv/versions/3.12.9/lib/python3.12/importlib/__init__.py:90: PytestUnknownMarkWarning: Unknown pytest.mark.mg - is this a typo? You can register custom marks to avoid this...

improvement
non-breaking
python
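
One common way to avoid a `PytestUnknownMarkWarning` is to register the marker in a `conftest.py` hook. The following is a minimal sketch, not the fix actually applied in raft-dask; the marker description (that `mg` means multi-GPU) is an assumption.

```python
# conftest.py (sketch)
def pytest_configure(config):
    # Register the custom `mg` marker so pytest no longer reports it as unknown.
    # The description text after the colon is an assumption, not taken from the repo.
    config.addinivalue_line("markers", "mg: multi-GPU tests (assumed meaning)")
```

The same registration can also be written under a `markers` entry in `pytest.ini` or `pyproject.toml`.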

As mentioned in https://github.com/zarr-developers/zarr-specs/pull/309, I ran across some challenges with how the Zarr v3 spec does extensions. I think that we might be able to learn some lessons from how...

xref https://github.com/zarr-developers/zarr-specs/issues/316#issuecomment-2407162818: clarifies that extra keys are allowed in the node metadata, they just must have this `must_understand` field set to `false`.
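
The rule described above can be sketched as a small check. This is an illustration under stated assumptions, not code from any Zarr implementation: the function name `extra_keys_allowed` and the `KNOWN` key set are hypothetical, and `KNOWN` is only an illustrative subset of the keys the spec defines.

```python
def extra_keys_allowed(metadata: dict, known_keys: set) -> bool:
    """Return True if every key outside `known_keys` is an object
    carrying `"must_understand": false`."""
    for key, value in metadata.items():
        if key in known_keys:
            continue
        # An extra key is tolerated only when it explicitly opts out.
        if not (isinstance(value, dict) and value.get("must_understand") is False):
            return False
    return True


# Hypothetical, illustrative subset of recognized keys.
KNOWN = {"zarr_format", "node_type", "attributes"}

ok = {"zarr_format": 3, "node_type": "group",
      "my_ext": {"must_understand": False}}
bad = {"zarr_format": 3, "node_type": "group",
       "my_ext": {"must_understand": True}}

result_ok = extra_keys_allowed(ok, KNOWN)
result_bad = extra_keys_allowed(bad, KNOWN)
```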

This adds a `type` field to tables being exported to ndjson if it isn't present. @kylebarron the type on `table` is `table: pa.Table | pa.RecordBatchReader | ArrowStreamExportable`. I wasn't sure...

If I'm reading https://github.com/zarr-developers/geozarr-spec/blob/main/geozarr-spec.md correctly, there are a few types defined by geozarr: - DataArray - Coordinate - Auxiliary Data (should this be "Auxiliary Variable" or just "Auxiliary"? The header...

Inspired by https://github.com/zarr-developers/geozarr-spec/issues/68, would folks be open to adding a [JSON Schema](https://json-schema.org) to the repo? It's very helpful to have a somewhat authoritative statement of whether some collection of...

**Describe the bug** The test `python/cudf_polars/tests/test_groupby.py::test_groupby[maintain_order-[(col("key1")) == (col("key2"))]-col("int32").sum()]` fails with a `KeyError` when using a small blocksize / multiple partitions. Here's a simplified example **Steps/Code to reproduce bug** ```python import...

bug
cudf.polars

This fixes the `group_by(...).mean()` with the experimental executor when there are missing values. We were using the length of the column, rather than the number of non-NA elements. Closes https://github.com/rapidsai/cudf/issues/19151

bug
Python
non-breaking
cudf.polars
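
The difference between the two denominators can be shown in plain Python. This is only an illustration of the arithmetic, not the cudf-polars code; `mean_skipping_nulls` is a hypothetical helper and `None` stands in for a missing value.

```python
def mean_skipping_nulls(values):
    """Mean over the non-missing elements; None if there are none."""
    valid = [v for v in values if v is not None]
    if not valid:
        return None
    return sum(valid) / len(valid)


values = [1.0, None, 3.0]

# Dividing by the full column length counts the missing value as an element.
naive = sum(v for v in values if v is not None) / len(values)

# Dividing by the number of non-missing elements gives the intended mean.
correct = mean_skipping_nulls(values)
```

Here `naive` is 4/3 while `correct` is 2.0, so the two only agree when the column has no missing values.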

**Describe the bug** The test `python/cudf_polars/tests/expressions/test_rolling.py::test_rolling_datetime` fails with a small blocksize. **Steps/Code to reproduce bug** ```python import polars as pl from cudf_polars.testing.asserts import assert_gpu_result_equal dates = [ "2020-01-01 13:45:48", "2020-01-01...

bug
cudf.polars

**Describe the bug** When performing a unary op (possibly one that consists of just literals?) we get duplicate outputs. **Steps/Code to reproduce bug** ```python import polars as pl ldf =...

bug
cudf.polars