DOCS: Improve Documentation on Write Support

Open sungwy opened this issue 1 year ago • 9 comments

Feature Request / Improvement

We currently have Write support through two modes:

  1. through the Table API
  2. through the Transaction API

We also have support for different modes of writes like:

  1. append
  2. overwrite
  3. delete

Each of these takes different arguments.

It would be great to update our docs to describe each of these operations and API modes explicitly, in detail, as separate subsections, and to fill in any missing details (like the use of overwrite_filter in overwrite).

sungwy avatar Aug 06 '24 14:08 sungwy
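
For readers skimming the thread, the two API modes and three write operations listed in the issue description can be sketched roughly as follows. This is a minimal sketch, not the project's documentation: it assumes `table` is an already-loaded PyIceberg table and `df` is a PyArrow table matching its schema, and the `city` column and the helper function names are hypothetical. The filters are written as strings, which PyIceberg accepts as far as I know; check the API reference for the exact signatures.

```python
def write_with_table_api(table, df):
    """Table API: each call below is committed on its own."""
    # Assumption: `table` is a pyiceberg Table, `df` a pyarrow.Table.
    # append: add the rows in df to the table.
    table.append(df)
    # overwrite: remove rows matching overwrite_filter, then write df.
    # The `city` column is hypothetical.
    table.overwrite(df, overwrite_filter="city == 'Paris'")
    # delete: remove rows matching delete_filter.
    table.delete(delete_filter="city == 'Paris'")


def write_with_transaction_api(table, df):
    """Transaction API: staged operations are committed together on exit."""
    with table.transaction() as txn:
        txn.delete(delete_filter="city == 'Paris'")
        txn.append(df)
```

The Transaction variant is the one to reach for when several operations should become visible together; the Table variant is shorter when a single operation is enough.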

@sungwy When using overwrite, how can we compare fields between source and target? For example:

  source.id = target.id and source.updated < target.updated

guitcastro avatar Aug 07 '24 00:08 guitcastro

Someone asked about merge/upsert use cases in the Slack channel as well, similar to overwrite_filter.

kevinjqliu avatar Aug 09 '24 17:08 kevinjqliu
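
The comparison asked about above relates each source row to a target row, which is a merge-style operation rather than a filter over the target's columns alone. As a rough illustration of the row-level semantics only, here is a plain-Python sketch over in-memory dicts. It does not use any PyIceberg API; the `merge_rows` helper, the `id` and `updated` field names, and the choice to insert unmatched source rows are all assumptions made for the example.

```python
def merge_rows(target_rows, source_rows, condition):
    """Upsert-style merge keyed on `id` (hypothetical helper).

    A matched target row is replaced only when condition(source, target)
    holds; source rows with no matching target row are inserted.
    """
    merged = {row["id"]: row for row in target_rows}
    for src in source_rows:
        tgt = merged.get(src["id"])
        if tgt is None:
            merged[src["id"]] = src  # not matched: insert
        elif condition(src, tgt):
            merged[src["id"]] = src  # matched and condition holds: update
    return list(merged.values())


target = [{"id": 1, "updated": 5}, {"id": 2, "updated": 5}]
source = [{"id": 1, "updated": 3}, {"id": 2, "updated": 9}, {"id": 3, "updated": 1}]

# The condition exactly as written in the question:
# source.id = target.id and source.updated < target.updated
result = merge_rows(target, source, lambda s, t: s["updated"] < t["updated"])
```

Here row 1 is replaced (3 < 5), row 2 is kept (9 < 5 is false), and row 3 is inserted because it has no match in the target.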

Examples of overwrite_filter

  • https://github.com/apache/iceberg-python/issues/402#issuecomment-2271507538
  • https://github.com/apache/iceberg-python/issues/1020#issuecomment-2274610759

kevinjqliu avatar Aug 13 '24 23:08 kevinjqliu

Hi, I am new to open source and would love to contribute to this issue. Can I work on this?

iamvyshnavi avatar Feb 05 '25 16:02 iamvyshnavi

Hi, I would like to work on this. I can update the documentation by adding separate subsections for each write mode (append, overwrite, delete), and I can add some examples, including overwrite_filter usage. Please let me know if there are any formatting details you would like me to include. Please assign this issue to me.

shrutisachan08 avatar Feb 06 '25 03:02 shrutisachan08

This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.

github-actions[bot] avatar Aug 06 '25 00:08 github-actions[bot]

Hi! I am new to contributing to PyIceberg and would like to pick up this documentation task to add sections for append, overwrite, and upsert functionality as mentioned by the previous contributors. I will start setting up my environment now. Please consider assigning this to me.

pramila-bishnoi avatar Nov 15 '25 12:11 pramila-bishnoi

Hi,

I had asked earlier about taking on this documentation issue but didn't hear back, so I went ahead and completed the main task to keep things moving.

The core documentation (adding all write operations to api.md) is now finished.

api.md

(attached file)

I included the full documentation blocks for overwrite(), delete(), dynamic_partition_overwrite(), upsert(), and the Transaction API.

Could you please review the changes I made in the attached api.md file? I want to make sure the content is correct before proceeding.

Proposed final change: I haven't done this yet, but I recommend adding cross-links from the headings (like append()) to the relevant Python API reference documentation. This would make the docs much easier to navigate.

Should I add these cross-links?

Is there anything else you think I should add or change in the file?

Once confirmed, I'll create a new branch and open a formal PR.

Thanks!

pramila-bishnoi avatar Nov 16 '25 10:11 pramila-bishnoi

@kevinjqliu

pramila-bishnoi avatar Nov 16 '25 11:11 pramila-bishnoi

Hey @pramila-bishnoi, feel free to open a PR so it's easier to review the changes.

kevinjqliu avatar Nov 18 '25 00:11 kevinjqliu

Hi @kevinjqliu, just checking in on this PR. Is there anything else I can provide on my end, or any ETA on when it might be reviewed? Thanks!

pramila-bishnoi avatar Nov 23 '25 11:11 pramila-bishnoi