UnzipKit

File deletion should happen in-place

Open abbeycode opened this issue 10 years ago • 4 comments

The -deleteFile:error: method currently copies all files from the archive to a new archive, excluding the file being deleted. It would be better to update the archive in-place, removing the data of the deleted file, and updating the central directory entry.

The copy-based approach is much slower. Not only does it copy every remaining file, but because that copy can take a while, it only starts after checking that the file exists in the archive, which can itself take a while for an archive with many files.

Deleting in place is the right way to do it, but it's complicated to achieve, since it can't be done through the MiniZip wrapper.

The ZIP specification and a hex editor are invaluable resources for making this change.
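To make the in-place idea concrete, here is a minimal sketch in Python rather than Objective-C. It is not UnzipKit's implementation. The function name `delete_in_place` is hypothetical, and the sketch assumes a single-disk archive without ZIP64 or encryption, local records stored back to back, and an archive comment that does not contain the end-of-central-directory signature. It slides the data that follows the deleted record down over it, rewrites the central directory with adjusted offsets, and truncates the file.

```python
import struct

EOCD = struct.Struct("<4sHHHHIIH")  # end-of-central-directory record (22 bytes)
CHUNK = 64 * 1024


def delete_in_place(path, name):
    """Remove entry `name` from the ZIP at `path` without copying the archive."""
    target = name.encode("utf-8")
    with open(path, "r+b") as f:
        size = f.seek(0, 2)
        # The EOCD record sits at the end, followed only by an optional comment.
        f.seek(max(0, size - (EOCD.size + 0xFFFF)))
        tail = f.read()
        pos = tail.rfind(b"PK\x05\x06")
        if pos < 0:
            raise ValueError("end-of-central-directory record not found")
        sig, disk, cd_disk, _, _, cd_size, cd_off, c_len = EOCD.unpack_from(tail, pos)
        comment = tail[pos + EOCD.size:pos + EOCD.size + c_len]

        # Parse the central directory into (raw header, file name, local offset).
        f.seek(cd_off)
        cd = f.read(cd_size)
        entries, i = [], 0
        while i < len(cd):
            if cd[i:i + 4] != b"PK\x01\x02":
                raise ValueError("corrupt central directory")
            n_len, x_len, k_len = struct.unpack_from("<HHH", cd, i + 28)
            (local_off,) = struct.unpack_from("<I", cd, i + 42)
            nxt = i + 46 + n_len + x_len + k_len
            entries.append((cd[i:nxt], cd[i + 46:i + 46 + n_len], local_off))
            i = nxt

        matches = [e for e in entries if e[1] == target]
        if not matches:
            raise KeyError(name)
        start = matches[0][2]
        # Assumption: records are contiguous, so this one ends where the next
        # record (or the central directory) begins.
        later = [off for _, _, off in entries if off > start]
        stop = min(later) if later else cd_off
        removed = stop - start

        # Slide the data after the deleted record down over it. Copying forward
        # is safe because the destination is always below the source.
        src, dst = stop, start
        while src < cd_off:
            f.seek(src)
            block = f.read(min(CHUNK, cd_off - src))
            if not block:
                raise ValueError("archive is shorter than its directory claims")
            f.seek(dst)
            f.write(block)
            src += len(block)
            dst += len(block)

        # Rewrite the central directory without the deleted entry, shifting the
        # local header offsets of every record that moved.
        new_cd = bytearray()
        kept = 0
        for raw, _, off in entries:
            if off == start:
                continue
            raw = bytearray(raw)
            if off > start:
                struct.pack_into("<I", raw, 42, off - removed)
            new_cd += raw
            kept += 1
        f.seek(dst)  # dst now equals cd_off - removed
        f.write(new_cd)
        f.write(EOCD.pack(sig, disk, cd_disk, kept, kept, len(new_cd), dst, len(comment)))
        f.write(comment)
        f.truncate()
```

Only the bytes after the deleted record are rewritten, so deleting an entry near the end of a large archive touches very little data. The sketch does nothing to protect against interruption partway through.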

abbeycode, Dec 23 '14 19:12

It's slower, but atomicity is a problem, especially on mobile platforms. An interruption may result in a corrupt file.

amosavian, Feb 14 '16 08:02

That's an interesting point. I think that it's a concern for any sort of file I/O, though – not only deletions, right? If you begin inserting into an archive, and quit midway through, there'll still be corrupt data (a header that points to an incomplete file). Wouldn't it be on the developer to make sure the process runs long enough to finish?

The first (and easiest) way I could see around this would be to have two methods (or perhaps a flag) that would allow a consumer of the library to make a choice about implementation (existing or delete-in-place), but I'm not sure this is a common enough concern to warrant the added complexity.

The other way I could see would be to keep some sort of log during the operation that a consumer could then pass back in to finish when the app picks back up. The log might look like this (in human terms, I think the actual log would be more geared toward easy/quick reads and writes):

UnzipKit v1.8.3
Deleting File A from path/to/archive.zip
File A header address 0x0456A2
File A size 123
Copying 128 bytes from 0x0456B2 to 0x0456A2
Copied 128 bytes from 0x0456B2 to 0x0456A2
Copying 128 bytes from 0x0456B3 to 0x0456A3
...
Finished deleting File A from path/to/archive.zip
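A rough sketch of how such a journal could drive a resumable byte shift, again in Python for brevity. Everything here is hypothetical: the names `remove_range` and `_save`, the JSON journal format, and the `fail_after` parameter, which exists only to simulate an interruption. One assumption matters for correctness: each chunk is no larger than the removed range, so a write never lands on its own source bytes and a chunk that was copied but not yet journaled can simply be copied again. Updating the central directory would need to be journaled too and is left out.

```python
import json
import os

CHUNK = 128  # upper bound per step, as in the sample log


def _save(journal, state):
    """Persist the journal atomically: write a temp file, then rename over it."""
    tmp = journal + ".tmp"
    with open(tmp, "w") as j:
        json.dump(state, j)
        j.flush()
        os.fsync(j.fileno())
    os.replace(tmp, journal)


def remove_range(path, journal, start, stop, fail_after=None):
    """Remove bytes [start, stop) from `path`, recording progress in `journal`.

    If `journal` already exists, the operation resumes from it and the
    `start`/`stop` arguments are ignored.
    """
    if os.path.exists(journal):
        with open(journal) as j:
            state = json.load(j)
    else:
        if stop <= start:
            raise ValueError("empty range")
        state = {"start": start, "stop": stop,
                 "size": os.path.getsize(path), "done": 0}
        _save(journal, state)
    start, stop, size = state["start"], state["stop"], state["size"]
    # Never copy more than the gap at once, so source and destination of a
    # single step cannot overlap and repeating a step is harmless.
    step = min(CHUNK, stop - start)
    with open(path, "r+b") as f:
        copied = 0
        while stop + state["done"] < size:
            if fail_after is not None and copied == fail_after:
                raise InterruptedError("simulated interruption")
            f.seek(stop + state["done"])
            block = f.read(min(step, size - stop - state["done"]))
            f.seek(start + state["done"])
            f.write(block)
            f.flush()
            os.fsync(f.fileno())  # data reaches disk before the journal says so
            state["done"] += len(block)
            _save(journal, state)
            copied += 1
        f.truncate(size - (stop - start))
    os.remove(journal)
```

Calling `remove_range` again with the same journal path after an interruption continues from the last recorded offset. Syncing after every small chunk is slow, so a real implementation would probably batch journal updates, at the cost of redoing more work on resume.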

As I write about it, I'm starting to like this approach. Make the delete calls asynchronous, return the NSURL of the log file for the operation, and then indicate success in a callback block. If the calling application never receives that callback, it should call resumeDelete: on next launch, passing the NSURL returned from the initial call.

I'd love to hear feedback from @amosavian and anyone else who's interested in a faster delete operation. I'm also getting familiar enough with the ZIP spec that I might be ready to tackle this sometime.

abbeycode, Feb 14 '16 16:02

Unfortunately, I'm familiar with neither the ZIP spec nor Objective-C; otherwise I could help you with this milestone. Journaling is a fascinating idea. It could help a lot with the functions that modify an archive, but I think it would be hard to implement.

amosavian, Feb 15 '16 17:02

@amosavian No worries, it's something I'll try to get to at some point. I added a link to the ZIP spec to the top post for the issue though, in case anyone gets curious.

abbeycode, Feb 18 '16 19:02