Support snapshot_properties in upsert operation
Feature Request / Improvement
I see that the other operations (append, overwrite, delete, etc.) already support a snapshot_properties argument. I think the upsert operation should accept this argument too, since it internally calls append and overwrite, which both already support it.
One question after looking into the code: should the upsert operation produce 1 snapshot or 2 snapshots?
I looked into the Spark integration test in the iceberg repo and found that Spark produces only 1 snapshot after running MERGE INTO. In iceberg-python, however, the upsert operation may run both an overwrite and an append, and therefore produce 2 snapshots.
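To make the question concrete, here is a toy model of the behavior described above. It is only a sketch, not the actual iceberg-python implementation: `ToyTable`, its methods, and the `job-id` property are hypothetical stand-ins. It shows an upsert that forwards snapshot_properties to both internal calls, so each of the 2 resulting snapshots carries the properties.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Hypothetical stand-in for a table; NOT the iceberg-python Table class."""

    rows: dict = field(default_factory=dict)       # key -> value
    snapshots: list = field(default_factory=list)  # one entry per commit

    def _commit(self, operation, snapshot_properties):
        # Every commit records exactly one snapshot summary.
        self.snapshots.append({"operation": operation, **snapshot_properties})

    def append(self, new_rows, snapshot_properties=None):
        self.rows.update(new_rows)
        self._commit("append", snapshot_properties or {})

    def overwrite(self, changed_rows, snapshot_properties=None):
        self.rows.update(changed_rows)
        self._commit("overwrite", snapshot_properties or {})

    def upsert(self, incoming, snapshot_properties=None):
        # Split the incoming rows into changed existing keys and new keys.
        to_update = {
            k: v for k, v in incoming.items()
            if k in self.rows and self.rows[k] != v
        }
        to_insert = {k: v for k, v in incoming.items() if k not in self.rows}
        # Two separate commits: this is where the 2 snapshots come from.
        if to_update:
            self.overwrite(to_update, snapshot_properties)
        if to_insert:
            self.append(to_insert, snapshot_properties)


table = ToyTable()
table.append({1: "a", 2: "b"})
table.upsert({2: "B", 3: "c"}, snapshot_properties={"job-id": "42"})
print([s["operation"] for s in table.snapshots])
```

In this model the single upsert call adds an overwrite snapshot and an append snapshot, both tagged with the same properties. A single-snapshot upsert, like Spark's MERGE INTO, would need both changes committed together.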
Just ran into this as well. It seems that leaving snapshot_properties out of upsert was likely an oversight; it shouldn't be too difficult to add.
Regarding upsert performance: yes, I agree it's not ideal that it produces two snapshots. It's also currently quite slow for large tables or a large number of upsert rows. I think there's a separate ticket that touches on both of those issues: https://github.com/apache/iceberg-python/issues/2159. I'm considering implementing my own upsert operation using some of the lower-level APIs to get around the performance issues, and to support upsert + delete in a single operation, which currently requires 2 separate operations and generates 3 snapshots.