delta-rs icon indicating copy to clipboard operation
delta-rs copied to clipboard

fix: compatible to write to local file systems that do not support hard link

Open RobinLin666 opened this issue 1 year ago • 10 comments

compatible to write to local file systems that do not support hard link.

Description

When we write to the local file system, sometimes hard link is not supported, such as blobfuse, goofys, s3fs, so deal with it with compatibility.

It is important to note that: There is another problem with blobfuse, that is, when it comes to rename, it will report errors. Because rename did not release the file handle before. See here for details: https://github.com/delta-io/delta-rs/issues/1765

Arrow-rs is required to cooperate with the modification, for example: https://github.com/GlareDB/arrow-rs/pull/2/files Because object_store has been upgraded to 0.8, there are a lot of breaking change, so I haven't changed this one for the time being. Will fix it after upgrading to 0.8 https://github.com/delta-io/delta-rs/issues/1858

Related Issue(s)

#1765

#1376

Documentation

RobinLin666 avatar Nov 16 '23 05:11 RobinLin666

Thanks for your suggestion, @roeap.

I've added a new option, FILE_ALLOW_UNSAFE_RENAME, to control whether to enable the rename operation. Additionally, I wanted to confirm that the rename operation is atomic and doesn't alter the file's inode. Could you provide more details on the specific scenarios where safety concerns might arise?

Looking forward to your insights.

RobinLin666 avatar Nov 16 '23 10:11 RobinLin666

Certainly ... std::fs::rename will replace the file at the target location if it already exists. The concurrency mechanics in delta heavily rely on that not happening :).

The basic idea to make commits atomic is to write a temporary file, renaming it to the desired commit file, and relying on the file system to error if the file already exists. So in case of just using regular rename, concurrent writers might overwrite each others commits.

roeap avatar Nov 18 '23 18:11 roeap

Yes, if the target already exists, it will replace it. But in this scenario, we require that the target file does not exist, if it does, it will report an error directly, and the next rename operation will not be performed, so I think it is still atomic.

RobinLin666 avatar Nov 19 '23 02:11 RobinLin666

Well, we went over this several times, also with the core team and general consensus was that an a-priori check without any lock mechanism would not prevent any race conditions.

This is especially true if the fs has significant latencies, like for remote stores.

I think you might be able to validate this by adopting the s3 concurrency tests.

roeap avatar Nov 19 '23 02:11 roeap

Yes! You are right! So, what do you think I add an option FILE_ALLOW_UNSAFE_RENAME to support to write to mounted path which do not support hard link.

RobinLin666 avatar Nov 19 '23 03:11 RobinLin666

Hi @roeap, do you have any other concerns? I can modify them at any time. Please feel free to review, thanks!

RobinLin666 avatar Nov 30 '23 11:11 RobinLin666

@RobinLin666 what's the impact if multiple people write at the same time to a mounted storage?

ion-elgreco avatar Nov 30 '23 12:11 ion-elgreco

@RobinLin666 what's the impact if multiple people write at the same time to a mounted storage?

Hi @ion-elgreco Here is the doc of blobfuse, image

writing to different directories only from different instances" will work if you set --use-attr-cache=false and --file-cache-timeout-in-seconds=0. But cannot write to a same blob/file at same time.

Here is some detail: https://github.com/Azure/azure-storage-fuse/issues/366 .

RobinLin666 avatar Dec 01 '23 02:12 RobinLin666

kindly ping

Lyndon1994 avatar Dec 06 '23 13:12 Lyndon1994

Hi team, just wanted to kindly bump this PR for review. Thank you!

RobinLin666 avatar Dec 19 '23 05:12 RobinLin666

it will be really nice to have this functionality as it is very useful in Fabric notebook please :)

djouallah avatar Jan 03 '24 23:01 djouallah

Since delta-rs has been refactored, I have also rearranged the code here. Please review it again. Thank you!

Note: To completely fix this problem, you need to upgrade arrow-rs object_store to version 0.9. I wrote delta table to mounted path in my local test and it passed.

RobinLin666 avatar Jan 06 '24 09:01 RobinLin666

Hi @MrPowers, could you please help to review the PR, since you refined the code last week. Thanks

RobinLin666 avatar Jan 08 '24 06:01 RobinLin666

Hi @roeap / @wjones127 / @rtyler / @fvaleye / @ion-elgreco / @MrPowers Could you please help to review the PR?

it will be really nice to have this functionality as it is very useful in Fabric notebook and in Databricks Notebook (dbfs://) please :)

RobinLin666 avatar Jan 15 '24 01:01 RobinLin666

Hi guys, we are preparing a new Fabric Notebook which has a pure python environment, so we want to depend on this library to provide users with an enjoyable experience to operate Lakehouse Tables, whether through abfss path or local mounted path. So, can you sign off this PR?

RobinLin666 avatar Jan 19 '24 02:01 RobinLin666

This would be a really useful PR for other file systems like HDFS.

jimdowling avatar Jan 19 '24 07:01 jimdowling

Excuse me, Hi @roeap / @wjones127 / @rtyler / @fvaleye / @ion-elgreco / @MrPowers Could you please help to review the PR? Thank you very much! If there is anything that needs to be modified, I will make it promptly

RobinLin666 avatar Jan 24 '24 05:01 RobinLin666

Excuse me, Hi @roeap / @wjones127 / @rtyler / @fvaleye / @ion-elgreco / @MrPowers Does this PR still have a chance to join? If it has a chance, can you please help review, because I have resolved many conflicts, it can't go on like this, right? If it doesn't have a chance, please tell me why, or is there any other plan? So I can do it in time. Because our new project is very dependent on this open-source project. Thanks!

RobinLin666 avatar Feb 02 '24 01:02 RobinLin666

😊

RobinLin666 avatar Feb 07 '24 04:02 RobinLin666

Hi @ion-elgreco can you help to sign off the PR? Thanks!

RobinLin666 avatar Feb 18 '24 05:02 RobinLin666

@RobinLin666 can you update the correct docs pages? The .rst file are the old docs

ion-elgreco avatar Feb 18 '24 08:02 ion-elgreco

@RobinLin666 can you update the correct docs pages? The .rst file are the old docs

Nice to see your reply. I just updated it. Thank you!

RobinLin666 avatar Feb 18 '24 09:02 RobinLin666

Hi @ion-elgreco Have a good day! Is there any problem/ question for now?

RobinLin666 avatar Feb 22 '24 09:02 RobinLin666

HEy @RobinLin666 @ion-elgreco, sorry for being MIA for a while.

Since blob-fuse is a valid use case, I guess we have to do simething about it. generally I was hoping to get rind of all custom file system implementation in delta-rs, but it seems we will have to retain something after all.

One thing though - could we make the config handling analogus to how we handle configuration in object store and also our itnernal config - e.g. table/config.rs?

roeap avatar Feb 22 '24 18:02 roeap

HEy @RobinLin666 @ion-elgreco, sorry for being MIA for a while.

Since blob-fuse is a valid use case, I guess we have to do simething about it. generally I was hoping to get rind of all custom file system implementation in delta-rs, but it seems we will have to retain something after all.

One thing though - could we make the config handling analogus to how we handle configuration in object store and also our itnernal config - e.g. table/config.rs?

Hi @roeap thank you for your suggestion, I wrote it with reference to the implementation of aws, supporting environment variables or passing parameters to pass the configuration. I don't fully understand the effect you want, maybe I'm not very familiar with delta-rs code. I hope you can explain it in more detail. Thank you. If possible, I'd like to go in this time and reconstruct it later.

RobinLin666 avatar Feb 23 '24 08:02 RobinLin666

Hi @roeap ?

RobinLin666 avatar Mar 01 '24 07:03 RobinLin666

Hello @roeap is there any concern ?

RobinLin666 avatar Mar 06 '24 06:03 RobinLin666

Hi @roeap , we greatly rely on this PR as many users strongly demand the ability to write data to the mount point, not only with blobfuse but also potentially with tools like s3fs, Databricks' dbfs, and others. If you believe that the code needs refactoring, could you provide more detailed guidance on how it should be refactored? Alternatively, it would be ideal if we could proceed with this PR first, and then I can prepare a refactoring PR. I prefer not to privately compile a delta-rs installation for our product, as it would be cumbersome to maintain and not practical. Thank you.

RobinLin666 avatar Mar 07 '24 05:03 RobinLin666

@RobinLin666 I think what Robert means it to make this akin to crates/azure/src/config.rs, in parsing and setting configuration. Does that help?

ion-elgreco avatar Mar 09 '24 15:03 ion-elgreco

Hi @roeap / @ion-elgreco , I have refined the code, please help to review, thanks!

RobinLin666 avatar Mar 11 '24 04:03 RobinLin666