
Extend HadoopTableOperations to also work with other FS guarantees

Open gallushi opened this issue 5 years ago • 4 comments

Currently, HadoopTableOperations requires atomic rename from the underlying FS. However, other guarantees, such as atomic write, can also suffice. This would allow other storage systems (for example, IBM Cloud Object Storage, which supports atomic write) to be used.

gallushi avatar Oct 25 '20 09:10 gallushi

@gallushi, can you describe what you're suggesting in a bit more detail? How would you use atomic write instead? And how would you detect whether to use atomic write or atomic rename?

In general, I would not recommend using a file system for this guarantee. It's better to use a database transaction for the atomic update operation. That's why we want to have support for a variety of catalog plugins in addition to Hive, like JDBC, Nessie, and Glue.

rdblue avatar Oct 28 '20 23:10 rdblue

Hi @rdblue

> can you describe what you're suggesting in a bit more detail? How would you use atomic write instead?

Currently, when using Hadoop tables, HadoopTableOperations assumes atomic rename guarantees and, when committing, renames the temp snapshot object from its temp name to the final snapshot object name. However, atomic write can also be used: instead of going through a temp object, we can write the snapshot object directly (and atomically); the write succeeds iff the object did not already exist.
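To make this concrete, here is a minimal sketch of the two commit paths against the Hadoop FileSystem API (the class and method names are hypothetical illustrations, not the actual HadoopTableOperations code):

```java
import java.io.IOException;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileAlreadyExistsException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CommitSketch {
  // Current behavior: write to a temp object first, then rename it to the
  // final metadata path. Correct only when the FS guarantees atomic rename.
  static void commitViaRename(FileSystem fs, Path temp, Path finalMetadata) throws IOException {
    if (!fs.rename(temp, finalMetadata)) {
      throw new IOException("Commit failed: rename lost the race to " + finalMetadata);
    }
  }

  // Proposed alternative: skip the temp object and create the final path
  // directly with overwrite=false. On a store with atomic create-if-not-exists
  // semantics, the create succeeds iff the object does not already exist, so a
  // losing concurrent committer surfaces as FileAlreadyExistsException
  // (the exact exception type can vary by FileSystem implementation).
  static void commitViaAtomicWrite(FileSystem fs, Path finalMetadata, byte[] metadata) throws IOException {
    try (FSDataOutputStream out = fs.create(finalMetadata, false /* overwrite */)) {
      out.write(metadata);
    } catch (FileAlreadyExistsException e) {
      throw new IOException("Commit failed: another writer created " + finalMetadata, e);
    }
  }
}
```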

> And how would you detect whether to use atomic write or atomic rename?

I think a config at the scheme level makes sense, since atomic write and atomic rename are FS-level guarantees; that way one can control which file systems use atomic write and which stay with atomic rename. A possible shape for such a switch is sketched below.
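For illustration, one way a per-scheme switch could look (the `iceberg.fs.<scheme>.commit-mode` key is a hypothetical name, not an existing Iceberg or Hadoop property):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

public class CommitModeConfig {
  // Hypothetical per-scheme property, e.g.:
  //   iceberg.fs.cos.commit-mode  = atomic-write
  //   iceberg.fs.hdfs.commit-mode = atomic-rename   (the default)
  static boolean useAtomicWrite(Configuration conf, Path metadataPath) {
    String scheme = metadataPath.toUri().getScheme();
    String mode = conf.get("iceberg.fs." + scheme + ".commit-mode", "atomic-rename");
    return "atomic-write".equals(mode);
  }
}
```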

> In general, I would not recommend using a file system for this guarantee. It's better to use a database transaction for the atomic update operation. That's why we want to have support for a variety of catalog plugins in addition to Hive, like JDBC, Nessie, and Glue.

Yes; however, keeping in mind that this is all in the context of Hadoop tables, which in any case rely on file system guarantees, adding support for atomic write is relatively low-hanging fruit, and storage systems such as IBM Cloud Object Storage could then be used even without an external catalog.

While we discuss this, I'll open a PR with the changes I have in mind.

gallushi avatar Nov 24 '20 11:11 gallushi

@gallushi, that sounds good to me. As long as the file commit is atomic, I think we can work with it.

rdblue avatar Nov 24 '20 18:11 rdblue

This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in the next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.

github-actions[bot] avatar Feb 28 '24 00:02 github-actions[bot]

This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'.

github-actions[bot] avatar Mar 13 '24 00:03 github-actions[bot]