Jay Chia issues

Results: 70 issues by Jay Chia

More informative user-facing errors for SQL parsing of joins

### Describe the bug I am trying to do this: ``` import daft df1 = daft.from_pydict({"a": [1, 2, 3], "b": ["foo", "bar", "baz"]}) df2 = daft.from_pydict({"a": [1, 2, 3], "c":...

bug

needs triage

[ActorPoolProject] Pipeline of multiple actor pool projects throttles later stages if earlier stages have low concurrency

**Describe the bug** When running the following code: ```python import daft import os import time @daft.udf(return_dtype=daft.DataType.string()) class MySlowUdf: def __init__(self): print(f"I am process ({os.getpid()}), initializing...") time.sleep(10) print(f"I am process ({os.getpid()}),...

bug

Add fp16 type support

**Is your feature request related to a problem? Please describe.** Add fp16 as a Daft type cc @conceptofmind

[ActorPoolProject] Correctly allocate GPUs for each running Actor in the PyRunner

**Describe the bug** When I run stateful UDFs with GPU requests, I need Daft to correctly assign GPUs to each stateful UDF. For example, if my process starts with `CUDA_VISIBLE_DEVICES=3,4,5,6`, and...

[DOCS] Changing docs for UDF

1. Improves UDF documentation by adding API pages for `StatefulUDF` and `StatelessUDF`, with lots of docstrings and examples 2. Moves our `daft.udf` module to `daft.udfs` instead, which avoids a naming...

documentation

[FEAT] Assign CUDA_VISIBLE_DEVICES to actor pools in PyRunner

This PR assigns each actor that is spun up by the Python ActorPoolProject access to only certain GPUs, based on its rank. TODO: We should handle chains of model inference...

enhancement
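
A minimal sketch of the rank-based GPU assignment described in the two entries above, assuming each actor receives a contiguous slice of the parent process's `CUDA_VISIBLE_DEVICES`. The helper name `gpus_for_actor` and the `gpus_per_actor` parameter are hypothetical and are not taken from Daft's code; the actual PR may partition devices differently.

```python
def gpus_for_actor(parent_visible: str, rank: int, gpus_per_actor: int = 1) -> str:
    """Return the CUDA_VISIBLE_DEVICES value for the actor with the given rank.

    Hypothetical helper: slices the parent's device list into contiguous,
    non-overlapping chunks of size `gpus_per_actor`, one chunk per rank.
    """
    if gpus_per_actor < 1:
        raise ValueError("gpus_per_actor must be at least 1")

    # Parse the parent's value, e.g. "3,4,5,6" -> ["3", "4", "5", "6"].
    devices = [d.strip() for d in parent_visible.split(",") if d.strip()]

    start = rank * gpus_per_actor
    end = start + gpus_per_actor
    if rank < 0 or end > len(devices):
        raise ValueError(
            f"rank {rank} needs devices [{start}:{end}] but only "
            f"{len(devices)} are visible"
        )

    # Keep the parent's device IDs rather than renumbering from zero.
    return ",".join(devices[start:end])


if __name__ == "__main__":
    parent = "3,4,5,6"
    for rank in range(4):
        print(rank, gpus_for_actor(parent, rank))
```

Each actor process would then set its own `CUDA_VISIBLE_DEVICES` to the returned string before initializing any GPU library, so that it only sees its assigned devices.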

[SQL] Full support for all scalar expressions

Cannot write fixed size list (and by extension, fixed size tensors and fixed size images) with null values to Parquet

**Describe the bug** To reproduce:

```
import daft
import numpy as np

df = daft.from_pydict({"x": [np.array([1, 2, 3]), None, np.array([1, 2, 3])]})
df = df.with_column("y", df["x"].cast(daft.DataType.fixed_size_list((3,), daft.DataType.int64())))
df.write_parquet("foo")
```

Related...

bug

`llm_generate` OpenAI provider does not work

### Describe the bug I am having a lot of trouble using the `llm_generate` functionality with the OpenAI provider. I think the actor process being created does not inherit the...

bug

help wanted

p2 (backlog)

Support extremely flexible list datatype declarations

### Is your feature request related to a problem? Writing datatypes in Daft today happens in a few places: 1. UDF return types (`return_dtype=...`) 2. .apply return types (`.apply(..., return_dtype=...)`)...

enhancement

types