
Apache PyIceberg

Results 402 iceberg-python issues

### Apache Iceberg version None ### Please describe the bug 🐞 As part of the effort to remove `numpy` as a dependency in #1259, we changed the `_combine_positional_deletes` function to use...

### Feature Request / Improvement We expose pyiceberg_core's transform functions (powered by iceberg-rust) for partition transforms. #1074 and #1345 added this functionality for the bucket and truncate transforms. Let's expand to...

### Question Does pyiceberg allow us to enable adaptive clustering when creating a table or enable it on an existing table? The relevant sql would be something like ```sql ALTER...

stale

### Feature Request / Improvement Based on issues described in #1771 1. We'd want to make it clear that the `default` catalog is used by default when no `--catalog` parameter...

good first issue
stale

Used the minimal required schema for V1 manifest list as described in https://iceberg.apache.org/spec/#manifest-lists `make test` stack trace: ``` ============================================================================ short test summary info ============================================================================ FAILED tests/utils/test_manifest.py::test_read_manifest_list - pyiceberg.exceptions.ResolveError: 504: added_files_count:...

### Feature Request / Improvement I noticed pyiceberg doesn't support user-defined metadata in schema [fields](https://github.com/apache/iceberg-python/blob/main/pyiceberg/io/pyarrow.py#L609). Only doc and key id are set. This makes it impossible to use [arrow extension types](https://arrow.apache.org/docs/format/Columnar.html#extension-types)...

stale

This PR resolves #1439 by adding integration tests for the REST Catalog. Functionality testing against the server can be simulated to a certain degree, but some checks are very hard...

### Feature Request / Improvement ## Problem Statement A key problem in distributed Iceberg systems is that commit processes can block each other when multiple workers try to update table...

I'm trying to use pyiceberg within a pod that has access via a role. I've configured the PYICEBERG_CATALOG__DEFAULT__S3__ROLE_ARN and AWS_ROLE_ARN environment variables, but that fails with a HeadObject issue ``` File...

This will work as soon as this is merged: https://github.com/fsspec/adlfs/pull/493