Desmond Cheong issues

Results 19 issues of Desmond Cheong

[SPARK-47079][PYTHON][SQL][CONNECT] Add Variant type info to PySpark

### What changes were proposed in this pull request?
The Variant datatype was added in https://github.com/apache/spark/pull/43707 but the equivalent PySpark type was not added. In this PR we add Variant...

SQL

PYTHON

CONNECT

[PERF] Add a parallel local CSV reader

Adds a parallel CSV reader to speed up ingestion of CSV. The approach adapts some ideas laid out in [1], but the majority of performance gains came from the use...

performance
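
The preview above is cut off before it names the source of the speedup, so the following is only a generic sketch of chunked CSV parsing and not the PR's implementation. It assumes the input has a header row and no quoted fields containing newlines; the names `split_on_newlines`, `parse_chunk`, and `parallel_read_csv` are hypothetical.

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def split_on_newlines(data: str, n_chunks: int) -> List[str]:
    """Cut data into pieces of roughly equal size, each ending on a line boundary."""
    chunks = []
    start = 0
    target = max(1, len(data) // n_chunks)
    while start < len(data):
        end = min(len(data), start + target)
        # Extend the cut to just past the next newline so no row is split.
        newline = data.find("\n", end)
        end = len(data) if newline == -1 else newline + 1
        chunks.append(data[start:end])
        start = end
    return chunks


def parse_chunk(chunk: str) -> List[List[str]]:
    """Parse one newline-aligned chunk into rows of string fields."""
    return list(csv.reader(io.StringIO(chunk)))


def parallel_read_csv(data: str, n_chunks: int = 4) -> Tuple[List[str], List[List[str]]]:
    """Parse the header once, then parse the body chunks concurrently."""
    header_line, _, body = data.partition("\n")
    header = next(csv.reader([header_line]))
    # Threads keep the sketch self-contained; in CPython a process pool or
    # native code would be needed for a real CPU-bound speedup.
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        parts = list(pool.map(parse_chunk, split_on_newlines(body, n_chunks)))
    rows = [row for part in parts for row in part]
    return header, rows
```

`pool.map` returns results in input order, so the rows come back in file order even though the chunks are parsed concurrently.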

fix: Allow PNGs to be displayed in notebooks

## Changes Made
When we generate thumbnails to display in notebooks, we encoded them as JPEG by default. This does not work if images have an alpha channel. This PR...

fix
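
JPEG cannot store an alpha channel, which is the failure the preview above describes. The preview is truncated before it says how the PR resolves this, so the sketch below shows only one possible approach: choosing the thumbnail encoding from the image mode. It assumes Pillow-style mode strings, and `thumbnail_format` is a hypothetical helper, not a Daft function.

```python
# Assumption: Pillow-style mode strings; these modes carry an alpha channel.
ALPHA_MODES = {"RGBA", "LA", "PA", "RGBa", "La"}


def thumbnail_format(mode: str) -> str:
    """Pick a thumbnail encoding: PNG keeps alpha, JPEG cannot store it."""
    return "PNG" if mode in ALPHA_MODES else "JPEG"


def thumbnail_mime_type(mode: str) -> str:
    """MIME type matching the encoding chosen by thumbnail_format."""
    return "image/png" if thumbnail_format(mode) == "PNG" else "image/jpeg"
```

With this rule, opaque images still get the smaller JPEG encoding and only images with transparency fall back to PNG.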

feat: Add generic interface for custom write sinks

## Changes Made
1. Add a generic `WriteSink` interface. Users can use this to write custom write sinks that have optional `.start()`, `.write()`, `.finish()` methods.
2. Add `DataFrame.write_to_sink()` that takes...

feat
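
The start/write/finish lifecycle mentioned in the entry above can be sketched in plain Python. This is a standalone illustration, not Daft's actual `WriteSink` API: the class names, method signatures, return values, and the `run_sink` driver are all hypothetical.

```python
from typing import Any, Iterable, List


class SketchSink:
    """Hypothetical base class: start() and finish() are optional no-op hooks."""

    def start(self) -> None:
        pass

    def write(self, batch: List[Any]) -> int:
        """Write one batch and return the number of rows written."""
        raise NotImplementedError

    def finish(self) -> None:
        pass


class CollectingSink(SketchSink):
    """Example sink that keeps rows in memory and records lifecycle calls."""

    def __init__(self) -> None:
        self.rows: List[Any] = []
        self.events: List[str] = []

    def start(self) -> None:
        self.events.append("start")

    def write(self, batch: List[Any]) -> int:
        self.rows.extend(batch)
        self.events.append("write")
        return len(batch)

    def finish(self) -> None:
        self.events.append("finish")


def run_sink(batches: Iterable[List[Any]], sink: SketchSink) -> int:
    """Drive a sink: start once, write each batch, finish once."""
    sink.start()
    written = 0
    for batch in batches:
        written += sink.write(batch)
    sink.finish()
    return written
```

Calling `run_sink([[1, 2], [3]], CollectingSink())` invokes `start()` once, `write()` once per batch, and `finish()` once, and returns the total row count.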

perf: Split projections after optimizer passes

## Summary
There are currently three cases where we split projections:
- when extracting actor pool projects
- when extracting monotonically increasing ids
- when extracting window functions

In these...

perf

(draft) poc for better UC integration

## Changes Made
Was curious as to why our APIs weren't working too hot with UC and took a look. It seems that the `daft.unity_catalog.UnityCatalog` object we pass into `from_unity`...

* and '*' not handled correctly in SQL planner

### Describe the bug
```py
import daft

df = daft.from_pydict({
    "person": [
        {"name": "Alice", "age": 30},
        {"name": "Bob", "age": 25},
        {"name": "Charlie", "age": 35}
    ]
})
daft.sql("SELECT person.'*' FROM df").show()
...
```

bug

sql

Add projection and filter pushups to simplify join reordering algorithm

### Is your feature request related to a problem?
NA

### Describe the solution you'd like
The join reordering optimizer rule currently does a projection pushup and filter pushup. We...

enhancement

ci: Add a force merge option

## Changes Made
Sometimes critical CI fixes are blocked by... CI. Let's force merge our way through.

force-merge

feat(text_embed): Add vLLM as a provider

## Changes Made
Adds vLLM as a provider for text embedding.
```
import daft
from daft.ai.provider import load_provider
from daft.functions.ai import embed_text

provider = load_provider("vllm")
model = "Qwen/Qwen3-Embedding-0.6B"
(
    daft.read_huggingface("Open-Orca/OpenOrca")...
```

feat