beets
Preserve mtimes of files extracted from zip archives
Importing a disc from a zip file with the importadded plugin enabled results in file times that do not correspond to the contents of the zip archive: the file times are set to the current time instead. Repeating the same import on a folder extracted from the zip file with the unzip tool works fine.
I'm using 1.6.0 on Fedora 36 (self-compiled, as the distro still has 1.4.9).
I can provide more information, but the issue is quite easy to reproduce as I explained above.
This is the relevant section of my config file:
importadded:
    preserve_mtimes: yes
    preserve_write_mtimes: yes
Interesting! I think the first step to understanding this would be to know whether the library we use to extract these files, namely the zipfile.extractall function, can preserve mtimes. If it does, then we would need to find where we're discarding those. If it doesn't, then we may be out of luck. Any chance you'd be able to investigate?
Mmmh. It doesn't seem to work properly. I tested it by running:
python -m zipfile -e zz.zip zz/
and the extracted files have the current time as their modification time.
I found the following, so it seems to be a limitation of Python's implementation of zip extraction:
https://stackoverflow.com/questions/9813243/extract-files-from-zip-file-and-retain-mod-date
Would it be possible to adopt one of the proposed solutions? Since it's for internal use, I don't see much that could go wrong by changing the time by hand after the extraction.
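The command-line test above can also be reproduced from Python. The following is a minimal sketch, not beets code: the helper name `stored_and_extracted_mtime` and the member name `track.txt` are made up for illustration. It writes a zip member with a fixed date in 2001, extracts it with `extractall`, and returns both the timestamp stored in the archive and the mtime of the extracted file so that the two can be compared.

```python
import os
import tempfile
import time
import zipfile


def stored_and_extracted_mtime():
    """Return (mtime stored in the zip, mtime of the extracted file)."""
    with tempfile.TemporaryDirectory() as tmp:
        zip_path = os.path.join(tmp, "zz.zip")
        # Store one member with a fixed timestamp far in the past.
        info = zipfile.ZipInfo("track.txt", date_time=(2001, 2, 3, 4, 5, 6))
        with zipfile.ZipFile(zip_path, "w") as zf:
            zf.writestr(info, "data")

        out_dir = os.path.join(tmp, "zz")
        with zipfile.ZipFile(zip_path) as zf:
            zf.extractall(out_dir)
            # date_time is a naive local-time tuple; -1 lets mktime() guess DST.
            stored = time.mktime(zf.getinfo("track.txt").date_time + (0, 0, -1))

        extracted = os.path.getmtime(os.path.join(out_dir, "track.txt"))
        return stored, extracted


if __name__ == "__main__":
    stored, extracted = stored_and_extracted_mtime()
    print("stored in archive:", time.ctime(stored))
    print("extracted file:   ", time.ctime(extracted))
```

If the two printed times differ, the extraction did not carry the archive's timestamp over, which matches what was observed in this thread.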
Nice find, @arogl! It seems technically possible, but somewhat annoying to implement because we can no longer just use that extractall function… but I'll mark this as a feature request in case anyone is interested in giving it a shot.
I haven't tested whether the functionality works for RAR files, but since RAR unpacking is implemented by directly calling an external utility (unrar), the timestamps may already be preserved in that case.
I will try to look at all file extraction from archives over the weekend.
I was thinking of applying the time setting only when the preserve options are enabled.
@sampsyo
Could this work?
In importer.py#L1080 add
import os
import time

# From here:
# https://stackoverflow.com/questions/9813243/extract-files-from-zip-file-and-retain-mod-date
# fixing #4392
def RestoreTimestampsOfArchiveContents(archive, extract_dir):
    # `archive` is the open archive object (it must provide infolist())
    for f in archive.infolist():
        # path to this extracted f-item
        fullpath = os.path.join(extract_dir, f.filename)
        # still need to adjust the dt o/w item will have the current dt
        date_time = time.mktime(f.date_time + (0, 0, -1))
        # update dt
        os.utime(fullpath, (date_time, date_time))
Then at importer.py#L1093:
if config['preserve_mtimes'].get(bool):
    RestoreTimestampsOfArchiveContents(archive, extract_to)
I have not thought much yet about whether the Python 2 example needs changes for Python 3, nor about the exact form of the config check.
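For reference, the same idea can be written as a self-contained Python 3 sketch that can be tried outside beets. This is only an illustration under stated assumptions: the function name `extract_zip_preserving_mtimes` is hypothetical, it handles zip archives only, and it is not wired into the importer.

```python
import os
import time
import zipfile


def extract_zip_preserving_mtimes(zip_path, extract_dir):
    """Extract zip_path into extract_dir, then restore each member's mtime."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(extract_dir)
        for info in archive.infolist():
            fullpath = os.path.join(extract_dir, info.filename)
            if not os.path.exists(fullpath):
                # extractall() may sanitize unusual member names; skip those.
                continue
            # date_time is a naive local-time tuple; -1 lets mktime() guess DST.
            mtime = time.mktime(info.date_time + (0, 0, -1))
            os.utime(fullpath, (mtime, mtime))
```

The timestamps are applied only after `extractall` has finished, so directory entries are not touched again by later file writes. Note that the zip format stores times with two-second resolution and without a time zone, so the restored values are interpreted as local time.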
Yes, something like this could work! With the caveat that the preserve_mtimes option is located within the importadded configuration, not at the top level of config.
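To illustrate the difference between the nested and the top-level lookup, here is a small sketch. It uses a plain dict as a stand-in for the parsed configuration (an assumption made so the snippet runs on its own; the real beets config object is not a dict), and `should_restore_mtimes` is a hypothetical helper name.

```python
# A plain dict stands in for the parsed configuration (hypothetical stand-in,
# not the beets config object).
config = {
    "importadded": {
        "preserve_mtimes": True,
        "preserve_write_mtimes": True,
    }
}


def should_restore_mtimes(cfg):
    """Return True only if importadded.preserve_mtimes is enabled."""
    return bool(cfg.get("importadded", {}).get("preserve_mtimes", False))


if __name__ == "__main__":
    print(should_restore_mtimes(config))
    # A top-level key is ignored, because the option lives under importadded.
    print(should_restore_mtimes({"preserve_mtimes": True}))
```

In beets itself the check would presumably follow the same nesting, using the `.get(bool)` pattern from the snippet above on the `importadded` section rather than on the top-level config.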
By doing it this way are we setting the times also in the cases where they are already set by the unarchiver? Just curious.
At the moment, yes: the times are set on every extraction, regardless of archive type.
Further testing is still to be done.
Initial change pushed in #4396.