Jay Chia

Results 70 issues of Jay Chia

This PR adds support for supplying `__init__` arguments to StatefulUDFs. Users can now define arguments for `__init__` in their StatefulUDFs and adjust those arguments at runtime by calling `MyUDF.with_init_args(...)`.
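
As a rough illustration of the pattern described above, here is a minimal pure-Python sketch of a wrapper that stores `__init__` arguments and lets them be overridden before the class is instantiated. Only the name `with_init_args` comes from the PR text. `InitArgsUDF`, `stateful`, `instantiate`, and `AddOffset` are hypothetical names invented for this sketch, and it does not reflect how the library actually implements the feature.

```python
from typing import Any, Dict, List, Optional, Tuple


class InitArgsUDF:
    """Hypothetical wrapper: holds a class plus the arguments for its __init__."""

    def __init__(self, cls: type, args: Tuple = (), kwargs: Optional[Dict[str, Any]] = None):
        self._cls = cls
        self._args = args
        self._kwargs = dict(kwargs or {})

    def with_init_args(self, *args: Any, **kwargs: Any) -> "InitArgsUDF":
        # Return a new wrapper rather than mutating, so the original stays reusable.
        return InitArgsUDF(self._cls, args, kwargs)

    def instantiate(self) -> Any:
        # Construction is deferred until an instance is actually needed.
        return self._cls(*self._args, **self._kwargs)


def stateful(cls: type) -> InitArgsUDF:
    # Hypothetical decorator standing in for a stateful-UDF decorator.
    return InitArgsUDF(cls)


@stateful
class AddOffset:
    def __init__(self, offset: int = 0):
        self.offset = offset

    def __call__(self, values: List[int]) -> List[int]:
        return [v + self.offset for v in values]


default_udf = AddOffset.instantiate()
tweaked_udf = AddOffset.with_init_args(offset=10).instantiate()
```

Returning a fresh wrapper from `with_init_args` keeps the decorated definition unchanged, so the same UDF can be reused with different arguments in different places.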

1. Adds deprecation warning for `.with_column(resource_request=...)` kwarg
2. Adds a new mechanism for specifying resource requests directly on UDFs (in `@udf` or by overriding with `my_udf.override_options(...)`)
3. At the physical...

documentation
enhancement

TODOs:

**General**

- [ ] Properly propagate the num_actors to the physical plan from some user-facing API (TBD)
- [ ] Fix the translation logic to correctly account for arbitrarily...

enhancement

**Describe the bug** Sometimes, when writing embedding or tensor types to Parquet and reading them back, we get weird misalignment issues.

bug

**Is your feature request related to a problem? Please describe.** For large-scale anti-joins, we can speed them up by not performing an expensive repartition on both sides. Since the LHS...

p0
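
One general way to avoid repartitioning both sides of an anti-join is to leave the left-hand partitions where they are and share only the set of right-hand keys with each of them. The sketch below shows that idea in plain Python over lists of dicts. It is an assumption about the kind of optimization meant, not necessarily the approach the issue proposes, and the function names are hypothetical.

```python
from typing import Any, Dict, Hashable, List, Set

Row = Dict[str, Any]
Partition = List[Row]


def anti_join_partition(left_rows: Partition, right_keys: Set[Hashable], key: str) -> Partition:
    # Keep only the left rows whose key does not appear on the right side.
    return [row for row in left_rows if row[key] not in right_keys]


def broadcast_anti_join(
    left_partitions: List[Partition],
    right_partitions: List[Partition],
    key: str,
) -> List[Partition]:
    # Collect the right-side keys once; the left partitions are never shuffled.
    right_keys = {row[key] for part in right_partitions for row in part}
    return [anti_join_partition(part, right_keys, key) for part in left_partitions]


left = [[{"id": 1}, {"id": 2}], [{"id": 3}, {"id": 4}]]
right = [[{"id": 2}], [{"id": 4}, {"id": 5}]]
result = broadcast_anti_join(left, right, "id")
```

This only pays off when the right-hand key set is small enough to hold in memory on every worker. The left side keeps its original partitioning in the output.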

**Is your feature request related to a problem? Please describe.**

```
df = df.from_pydict({"foo": [1, 2, 3, 3, 3], "bar": ["a", "a", "b", "b", "b"]})
# should return a new...
```

**Is your feature request related to a problem? Please describe.** Currently Stateful UDFs are initialized once per execution of a UDF, instead of once per worker initialization. This means that...

p0
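
To make the distinction between per-execution and per-worker initialization concrete, here is a minimal sketch of a per-process cache that constructs a stateful callable once and reuses it across executions. `ExpensiveModel`, `_WORKER_CACHE`, `get_or_init`, and `execute_udf` are hypothetical names for illustration and are not part of the library's API.

```python
from typing import Any, Dict, List


class ExpensiveModel:
    """Hypothetical stateful UDF whose construction is costly."""

    init_count = 0  # counts constructions, for demonstration only

    def __init__(self) -> None:
        ExpensiveModel.init_count += 1

    def __call__(self, batch: List[int]) -> List[int]:
        return [x * 2 for x in batch]


# One cache per worker process: class -> already-initialized instance.
_WORKER_CACHE: Dict[type, Any] = {}


def get_or_init(cls: type) -> Any:
    # Construct on first use in this process, then reuse the same instance.
    if cls not in _WORKER_CACHE:
        _WORKER_CACHE[cls] = cls()
    return _WORKER_CACHE[cls]


def execute_udf(cls: type, batch: List[int]) -> List[int]:
    # Each execution looks up the cached instance instead of calling cls() again.
    return get_or_init(cls)(batch)


outputs = [execute_udf(ExpensiveModel, batch) for batch in ([1, 2], [3], [4, 5])]
```

With the cache in place, three executions trigger a single construction. Calling `cls()` inside `execute_udf` instead would construct the model three times.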