
corrupt archive file (gzip, filesystem)

Open cal101 opened this issue 3 years ago • 5 comments

Describe the bug

I am using the cache plugin to cache an .m2 directory, and the cache gets corrupted too often. Assuming GNU tar should be able to read the archive, it looks to me like the archive being created is corrupt:

$ tar ztvf m2p2-cache/.m2 | grep xyz
tar: Removing leading `/' from member names
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now

To Reproduce

It's not clear when and how this happens.

The configuration to store the cache is

  - name: rebuild-cache
    image: meltwater/drone-cache:dev
    pull: true
    settings:
      backend: "filesystem"
      rebuild: true
      #      debug: true
      cache_key: "m2p2-cache"
      archive_format: "gzip"
      mount:
        - '.m2'
        - '.p2'
    volumes:
      - name: cache
        path: /tmp/cache
    when:
      status:
        - success
        - failure

Expected behavior

Don't create corrupted files.

cal101 avatar Feb 15 '21 11:02 cal101

I ran into this a few times. I believe it happens under two specific conditions:

  1. you're using local filesystem storage
  2. multiple builds try to rebuild caches using the same key at the same time

The problem went away after switching the storage method to minio (S3).

hg avatar Feb 28 '21 12:02 hg

Both conditions were met by my usage.

Smells like the filesystem backend writes directly to the target file instead of writing to a tmp file first and then doing an atomic rename.

I fixed it by writing a simple sftp/scp based caching solution instead.

cal101 avatar Feb 28 '21 15:02 cal101

You are totally right, this is something I have overlooked. This should be easily achievable. Does anyone want to give it a try?

kakkoyun avatar Mar 17 '22 07:03 kakkoyun

@cal101 if you are able to share insight on your solution to this problem I would be happy to take a look at implementing this within drone-cache itself to get this resolved!

bdebyl avatar Jul 19 '22 14:07 bdebyl

@bdebyl Sorry, I didn't see the notification.

My solution was to not use the cache plugin but write my own solution based on sftp/scp. My assumption about the bug in the cache plugin is that the new archive file is not written atomically and multiple writers write to the same file. A typical solution to that is to let each writer write to its own file, close it, and then finally rename the file to the target name, which is typically atomic.

cal101 avatar Nov 21 '22 16:11 cal101