DOCS: Improve Documentation on Write Support
Feature Request / Improvement
We currently have Write support through two modes:
- through the Table API
- through the Transaction API
We also have support for different modes of writes like:
- append
- overwrite
- delete
Each of these takes different arguments.
It would be great to update our docs to describe each of these operations and API modes explicitly, in detail, as separate subsections, and to fill in any missing details (like the use of overwrite_filter in overwrite).
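To make the requested subsections a bit more concrete, here is a minimal sketch of the semantics each write mode is expected to have. It is a plain-Python model, not real PyIceberg calls: `ToyTable` and `ToyTransaction` are hypothetical names invented for illustration. The assumptions are that overwrite behaves like "delete the rows matching overwrite_filter, then append" (with a filter that matches everything by default), and that a transaction stages several operations and applies them together or not at all.

```python
from typing import Callable, Iterable, Optional

Row = dict
Predicate = Callable[[Row], bool]


def always_true(_row: Row) -> bool:
    """Match every row (models an overwrite with no filter given)."""
    return True


class ToyTable:
    """Hypothetical in-memory stand-in for a table; not a PyIceberg class."""

    def __init__(self, rows: Optional[Iterable[Row]] = None):
        self.rows = list(rows or [])

    def append(self, new_rows: Iterable[Row]) -> None:
        # append: existing rows are untouched, new rows are added at the end.
        self.rows = self.rows + list(new_rows)

    def delete(self, delete_filter: Predicate) -> None:
        # delete: drop every row the filter matches.
        self.rows = [r for r in self.rows if not delete_filter(r)]

    def overwrite(self, new_rows: Iterable[Row],
                  overwrite_filter: Predicate = always_true) -> None:
        # overwrite (assumed semantics): delete the matching rows, then append.
        self.delete(overwrite_filter)
        self.append(new_rows)


class ToyTransaction:
    """Hypothetical model of grouping several writes into one commit."""

    def __init__(self, table: ToyTable):
        self._table = table
        self._staged = ToyTable(table.rows)  # work on a copy of the row list

    def __enter__(self) -> ToyTable:
        return self._staged

    def __exit__(self, exc_type, exc, tb) -> bool:
        if exc_type is None:
            # Publish all staged changes at once.
            self._table.rows = self._staged.rows
        return False  # on error, nothing is published and the error propagates


table = ToyTable([{"city": "Paris", "n": 1}, {"city": "Oslo", "n": 2}])

# Single operations applied directly to the table.
table.append([{"city": "Rome", "n": 3}])
table.overwrite([{"city": "Paris", "n": 10}],
                overwrite_filter=lambda r: r["city"] == "Paris")
table.delete(lambda r: r["city"] == "Oslo")

# Several operations applied together through a transaction.
with ToyTransaction(table) as txn:
    txn.delete(lambda r: r["city"] == "Rome")
    txn.append([{"city": "Rome", "n": 30}])

print(table.rows)
```

In the real docs, each subsection would show the corresponding table-level and transaction-level call instead of this model, with its own arguments spelled out.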
@sungwy When using overwrite, how can we compare fields between the source and the target? For example:
source.id = target.id and source.updated < target.updated
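As far as I understand, overwrite_filter is a predicate over the target table's own columns, so it cannot refer to source rows directly. One possible workaround is to do the comparison in memory first and only then write the result. The sketch below shows that comparison step in plain Python. `merge_rows` is a hypothetical helper, not a PyIceberg function, and it assumes the key is unique on both sides and that unmatched source rows should be inserted.

```python
from typing import Callable, Dict, Hashable, List

Row = dict


def merge_rows(target: List[Row], source: List[Row], key: str,
               condition: Callable[[Row, Row], bool]) -> List[Row]:
    """Hypothetical helper: replace a target row with its source row when
    the keys match and condition(source_row, target_row) is true.

    Assumes `key` is unique within both `target` and `source`.
    """
    source_by_key: Dict[Hashable, Row] = {row[key]: row for row in source}
    matched = set()
    merged = []
    for tgt in target:
        src = source_by_key.get(tgt[key])
        if src is None:
            merged.append(tgt)  # no source row for this key: keep target
            continue
        matched.add(tgt[key])
        merged.append(src if condition(src, tgt) else tgt)
    # Assumption: source rows with no matching target row are inserted.
    merged.extend(row for row in source if row[key] not in matched)
    return merged


target = [
    {"id": 1, "updated": 5, "value": "a"},
    {"id": 2, "updated": 1, "value": "b"},
]
source = [
    {"id": 1, "updated": 3, "value": "A"},
    {"id": 2, "updated": 4, "value": "B"},
    {"id": 3, "updated": 9, "value": "C"},
]

# The condition from the question: source.updated < target.updated
merged = merge_rows(target, source, key="id",
                    condition=lambda s, t: s["updated"] < t["updated"])
print(merged)
```

With the condition exactly as written in the question, id 1 is replaced, id 2 is kept, and id 3 is inserted. Flip the comparison if newer source rows should win instead. The merged rows could then be written back with an overwrite whose filter is restricted to the affected ids.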
Someone also asked about merge/upsert use cases in the Slack channel, which are similar to overwrite_filter.
Examples of overwrite_filter:
- https://github.com/apache/iceberg-python/issues/402#issuecomment-2271507538
- https://github.com/apache/iceberg-python/issues/1020#issuecomment-2274610759
Hi, I am new to open source and would love to contribute to this issue. Can I work on this?
Hi, I would like to work on this. I can update the documentation by adding separate subsections for each write mode (append, overwrite, delete). I can also add some examples, including overwrite_filter usage. Please let me know if there are any formatting details you would like me to include. Please assign this issue to me.
This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in the next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.
Hi! I am new to contributing to PyIceberg and would like to pick up this documentation task to add sections for append, overwrite, and upsert functionality as mentioned by the previous contributors. I will start setting up my environment now. Please consider assigning this to me.
Hi,
I had asked earlier about taking on this documentation issue but didn't hear back, so I went ahead and completed the main task to keep things moving.
The core documentation (adding all write operations to api.md) is now finished.
(attached file)
I included the full documentation blocks for overwrite(), delete(), dynamic_partition_overwrite(), upsert(), and the Transaction API.
Could you please review the changes I made in the attached api.md file? I want to make sure the content is correct before proceeding.
Proposed final change: I haven't done this yet, but I recommend adding cross-links from the headings (like append()) to the relevant Python API reference documentation. This would make the docs much easier to navigate.
Should I add these cross-links?
Is there anything else you think I should add or change in the file?
Once confirmed, I'll create a new branch and open a formal PR.
Thanks!
@kevinjqliu
Hey @pramila-bishnoi, feel free to open a PR so it's easier to review the changes.
Hi @kevinjqliu, just checking in on this PR. Is there anything else I can provide on my end, or any ETA on when it might be reviewed? Thanks!