Daft

Distributed DataFrame for Python designed for the cloud, powered by Rust

Results 272 Daft issues

**Describe the bug** I use daft `0.2.17` on macOS and would like to add a column from a series with the appropriate length to a concatenated data frame. When I...

bug

Now that we have read support for Delta Lake, we should add support for writing new Delta Lake tables with Daft.

- https://delta-io.github.io/delta-rs/api/delta_writer/
- https://github.com/delta-io/delta/blob/master/PROTOCOL.md
- https://docs.delta.io/latest/quick-start.html#create-a-table&language-python

data-catalogs
delta-lake

- https://github.com/ChenghaoMou/text-dedup/blob/main/text_dedup/minhash_spark.py
- https://github.com/phdinds-aim/alis/blob/68c7f56a08fa5cfe10638ea45292914620c9f5cf/notebooks/lsh-for-minhash/05_demo_minhash_lsh.ipynb
- https://github.com/NVIDIA/NeMo-Curator/blob/main/nemo_curator/scripts/fuzzy_deduplication/README.md
- https://xorbits.io/blogs/text-deduplicate

**Is your feature request related to a problem? Please describe.** Currently it is not possible to read or write contents of an [Amazon S3 Express One Zone](https://aws.amazon.com/s3/storage-classes/express-one-zone/) directory bucket. The...

Bumps [pytest-cov](https://github.com/pytest-dev/pytest-cov) from 4.1.0 to 5.0.0. Changelog Sourced from pytest-cov's changelog. 5.0.0 (2024-03-24) Removed support for xdist rsync (now deprecated). Contributed by Matthias Reichenbach in [#623](https://github.com/pytest-dev/pytest-cov/issues/623). Switched docs theme...

dependencies
python

**Describe the bug** If I run `rustup update` to install the toolchain specified in `rust-toolchain.toml`, it does not install `cargo`, which means I can't build the project. **To Reproduce** Steps...

bug
build

We currently have an [optimized multithreaded jsonlines reader](https://github.com/Eventual-Inc/Daft/blob/a0fd6ecaeb1b592fcb0e9cb5b94f3d56d7b73c68/src/daft-json/src/read.rs#L43) for reading from cloud storage that is based on Rust async/await and tokio. This gives us one of the fastest ways to read...

performance

Bumps [pytest](https://github.com/pytest-dev/pytest) from 7.4.3 to 8.2.0. Release notes Sourced from pytest's releases. 8.2.0 pytest 8.2.0 (2024-04-27) Deprecations #12069: A deprecation warning is now raised when implementations of one of the...

dependencies
python