Jay Chia
This limits the depth of the "pipeline" when Daft executes `df.iter_partitions` partitions. The main refactor made here is that our PhysicalPlan is no longer responsible for keeping track of progress...
**Is your feature request related to a problem? Please describe.** In Azure, sometimes users may use Azure managed identity instead of just pure credentials. Daft should support this. This may...
**Describe the bug** When table formats such as Iceberg and Delta Lake store the data for a partition column, they will strip the column from the actual Parquet data files...
- [ ] Ability to override only specific partitions in the table (instead of whole table)
- [ ] Ability to write to a partitioned Iceberg Table
**Is your feature request related to a problem? Please describe.** Currently Daft makes an assumption that all files being retrieved from a given Iceberg table have the same partitioning: 1....
**Is your feature request related to a problem? Please describe.** The current behavior is to log a warning, but we should perhaps just automatically use the Ray Runner if we...
**Describe the bug** Daft's local Parquet reader is slow when reading Parquet files with many small row groups. The Polars Parquet writer currently writes files like that (attached a sample file...
**Is your feature request related to a problem? Please describe.** We would like to add more expressions and kernels for functionality to eventually have parity with Ibis (https://ibis-project.org/reference/expression-numeric). **Generic** https://ibis-project.org/reference/expression-generic...
**Is your feature request related to a problem? Please describe.** When reading certain Azure Blob Store storage accounts that are non-hierarchical, Daft fails with FileNotFound. See: https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-namespace More context: #1849