filesystem_spec

DirFileSystem does not propagate transaction context to underlying filesystem

Open patrickwolf opened this issue 10 months ago • 0 comments

When a file:// backend is wrapped in DirFileSystem (e.g. filesystem("dir", …, fs=filesystem("file"))), entering with fs.transaction: on the dir:// wrapper sets only the wrapper’s _intrans flag. DirFileSystem never passes its transaction context down to the wrapped LocalFileSystem, so every write (including writes made via fs.open(..., "wb")) is committed immediately instead of being written to a temp file and renamed into place on commit.

Steps to Reproduce

Create a transactional file:// filesystem and wrap it in dir://:

import os, tempfile, fsspec

tmp = tempfile.mkdtemp()
base = fsspec.filesystem("file")  
fs   = fsspec.filesystem("dir", path=tmp, fs=base)

Enter a transaction on the dir:// wrapper and write a file:

with fs.transaction:
    with fs.open("data.txt", "wb") as f:
        f.write(b"hello")
    exists_inside = os.path.exists(os.path.join(tmp, "data.txt"))
    print("Exists inside transaction?", exists_inside)

Observe that exists_inside is True, even though no commit has occurred yet.

Expected Behavior

  • Inside the with fs.transaction: block, no file should appear on disk.
  • After exiting the block without errors, the file should be atomically renamed into place.
  • On error, no partial files should remain.
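The three expectations above can be illustrated with the standard library alone. This is only a sketch of the intended semantics, not fsspec's implementation: deferred_write is a hypothetical helper, and placing the temp file next to the target is an assumption made here so that the final rename stays on one filesystem.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def deferred_write(path):
    """Hypothetical helper: write to a temp file, rename onto `path` on success."""
    directory = os.path.dirname(os.path.abspath(path))
    # Temp file lives beside the target so os.replace stays on one filesystem.
    fd, temp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            yield f
        # Reached only if the block raised nothing: publish the file.
        os.replace(temp_path, path)
    except BaseException:
        # On any error, drop the temp file so no partial data remains.
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise


workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "data.txt")

with deferred_write(target) as f:
    f.write(b"hello")
    visible_inside = os.path.exists(target)  # target not published yet

visible_after = os.path.exists(target)

try:
    with deferred_write(os.path.join(workdir, "broken.txt")) as f:
        f.write(b"partial")
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

leftovers = sorted(os.listdir(workdir))
print(visible_inside, visible_after, leftovers)
```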

Actual Behavior

  • The file is created on disk immediately, inside the transaction block.
  • There is no deferral to a temp file, and no atomic rename on commit.
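The diagnosis can be checked from the other direction. If the wrapped filesystem's _intrans flag is what decides whether a write is deferred, then entering the transaction on base instead of on the wrapper should defer writes made through the wrapper. The snippet below is a sketch of that check, based on my reading of how the wrapper forwards open() to the wrapped filesystem. It is not a confirmed or recommended workaround.

```python
import os
import tempfile

import fsspec

tmp = tempfile.mkdtemp()
base = fsspec.filesystem("file")
fs = fsspec.filesystem("dir", path=tmp, fs=base)
target = os.path.join(tmp, "data.txt")

# Enter the transaction on the wrapped filesystem, not on the dir:// wrapper.
with base.transaction:
    # The write still goes through the wrapper, which forwards it to base.
    with fs.open("data.txt", "wb") as f:
        f.write(b"hello")
    deferred_inside = not os.path.exists(target)

committed_after = os.path.exists(target)
print("Deferred inside transaction?", deferred_inside)
print("Present after commit?", committed_after)

# Clean up
os.remove(target)
os.rmdir(tmp)
```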

Proposed Fix

Override DirFileSystem.transaction to delegate to self.fs.transaction, so that the wrapper and wrapped FS share the same transaction context and _intrans flag.
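A minimal sketch of that override, written as a subclass so it can be tried without patching fsspec. DelegatingDirFileSystem is a hypothetical name and not part of fsspec. The sketch covers only the with fs.transaction: path. It does not touch start_transaction() or end_transaction(), and the wrapper's own _intrans flag is left unchanged, so a real fix would likely need more than this.

```python
import os
import tempfile

import fsspec
from fsspec.implementations.dirfs import DirFileSystem


class DelegatingDirFileSystem(DirFileSystem):
    """Hypothetical subclass (not in fsspec) that shares the wrapped fs's transaction."""

    @property
    def transaction(self):
        # Return the wrapped filesystem's Transaction, so entering it sets
        # _intrans on the filesystem that actually performs the writes.
        return self.fs.transaction


tmp = tempfile.mkdtemp()
base = fsspec.filesystem("file")
fs = DelegatingDirFileSystem(path=tmp, fs=base)
target = os.path.join(tmp, "data.txt")

with fs.transaction:
    with fs.open("data.txt", "wb") as f:
        f.write(b"hello")
    exists_inside = os.path.exists(target)

exists_after = os.path.exists(target)
print("Exists inside transaction?", exists_inside)
print("Exists after commit?", exists_after)

# Clean up
os.remove(target)
os.rmdir(tmp)
```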

Minimal Repro Code

import os, tempfile, fsspec

tmp  = tempfile.mkdtemp()
base = fsspec.filesystem("file")
fs   = fsspec.filesystem("dir", path=tmp, fs=base)

print("Before transaction:", base._intrans, fs._intrans)

with fs.transaction:
    print("Inside transaction:", base._intrans, fs._intrans)
    with fs.open("data.txt", "wb") as f:
        f.write(b"hello")
    # Should be False, but is True
    print("Exists inside?", os.path.exists(os.path.join(tmp, "data.txt")))

print("Clean up")
os.remove(os.path.join(tmp, "data.txt"))
os.rmdir(tmp)

Output

Before transaction: False False
Inside transaction: False True
Exists inside? True
Clean up

Environment

fsspec version: 2025.3.2

patrickwolf, Apr 23 '25 03:04