drone-cache
corrupt archive file (gzip, filesystem)
Describe the bug
I am using the cache plugin to cache an .m2 directory, and the cache gets corrupted too often. Assuming GNU tar should be able to read the archive, it looks to me like the created archive itself is corrupt:
$ tar ztvf m2p2-cache/.m2 | grep xyz
tar: Removing leading `/' from member names
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
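For anyone who wants to run the same check from a script rather than by hand with tar, here is a minimal Python sketch. It assumes the cache file is a gzip-compressed tar archive, as in the command above. The function name `archive_is_readable` is made up for illustration and is not part of drone-cache.

```python
import tarfile
import zlib


def archive_is_readable(path):
    """Return True if the gzip-compressed tar at `path` can be read to the end.

    Hypothetical helper for illustration only; not part of drone-cache.
    """
    try:
        with tarfile.open(path, mode="r:gz") as archive:
            for member in archive:
                if not member.isfile():
                    continue
                extracted = archive.extractfile(member)
                # Read the member in 1 MiB chunks so that a truncated or
                # damaged stream fails here instead of going unnoticed.
                while extracted.read(1024 * 1024):
                    pass
        return True
    except (tarfile.TarError, EOFError, OSError, zlib.error):
        # A truncated gzip stream typically surfaces as EOFError or ReadError.
        return False
```

Calling `archive_is_readable("m2p2-cache/.m2")` would then return False for an archive like the one above. A True result only means the archive could be read end to end, not that its contents are what the build expects.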
To Reproduce
It's not clear when and how this happens.
The configuration to store the cache is
- name: rebuild-cache
  image: meltwater/drone-cache:dev
  pull: true
  settings:
    backend: "filesystem"
    rebuild: true
    # debug: true
    cache_key: "m2p2-cache"
    archive_format: "gzip"
    mount:
      - '.m2'
      - '.p2'
  volumes:
    - name: cache
      path: /tmp/cache
  when:
    status:
      - success
      - failure
Expected behavior
Don't create corrupted files.
I ran into this a few times. I believe it happens under two specific conditions:
- you're using local filesystem storage
- multiple builds try to rebuild caches using the same key at the same time
The problem went away after switching the storage method to minio (S3).
Both conditions were met by my usage.
Smells like the filesystem backend writes directly to the target file instead of writing to a temporary file first and then doing an atomic rename.
I fixed it by writing a simple sftp/scp based caching solution instead.
You are totally right, this is something I have overlooked. This should be easily achievable. Does anyone want to give it a try?
@cal101 if you are able to share insight on your solution to this problem I would be happy to take a look at implementing this within drone-cache itself to get this resolved!
@bdebyl Sorry, I didn't see the notification.
My solution was to not use the cache plugin and instead write my own solution based on sftp/scp. My assumption about the bug in the cache plugin is that the new archive file is not written atomically and multiple writers write to the same file. A typical solution is to let each writer write to its own file, close it, and then rename that file to the target name, which is typically atomic.
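The write-then-rename pattern described above can be sketched roughly as follows. This is a minimal Python illustration of the general technique, not drone-cache's actual code (the plugin itself is written in Go). The function name `write_archive_atomically` and the temporary file naming are assumptions made for the example. It also assumes a POSIX filesystem, where a rename within one filesystem is atomic.

```python
import contextlib
import os
import tempfile


def write_archive_atomically(target_path, data):
    """Write `data` (bytes) to `target_path` without exposing a partial file.

    Hypothetical helper for illustration only; not part of drone-cache.
    """
    target_dir = os.path.dirname(os.path.abspath(target_path))
    # Each writer gets its own uniquely named temporary file. It is created
    # in the target directory because a rename is only atomic when source
    # and destination live on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=target_dir, prefix=".tmp-", suffix=".partial")
    try:
        with os.fdopen(fd, "wb") as tmp_file:
            tmp_file.write(data)
            tmp_file.flush()
            os.fsync(tmp_file.fileno())  # push the bytes to disk before renaming
        # Swap the finished file into place in a single step.
        os.replace(tmp_path, target_path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_path)
        raise
```

With this approach, concurrent rebuilds that use the same cache key can still overwrite each other, but the last rename wins with a complete archive, and a reader never sees a half-written file.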