addData requires that data remains valid until close

Open flomnes opened this issue 1 year ago • 4 comments

Short description

When adding data through ZipArchive::addData, the buffer is not read and nothing is written to the file until ZipArchive::close is called. This is a problem if the user wants to write the data immediately and free the underlying memory.

I couldn't find any workaround.

Example from my project

    // Add data to an existing ZipArchive
    filename.clear() << folder << SEP << "areas.txt";
    study.pZipArchive->addData(filename.c_str(),
                               out.c_str(),
                               out.size());
    // out is freed here, before the archive is closed
    // Do some more stuff not related to out & libzippp
    study.pZipArchive->close(); // <= Valgrind reports an invalid read here, and the corresponding entry contains garbage characters

flomnes avatar Jul 17 '22 20:07 flomnes

I found a workaround: calling close() followed by open() forces the flush. Am I doing it right? If so, this mechanism could be encapsulated in a new flush function.
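For readers who want to try the close() + open() workaround, here is a minimal sketch of what such a helper could look like. The name `flushByReopening` is hypothetical and not part of libzippp. The helper is written as a template so that it makes only two assumptions about the archive type: it has a `close()` member, and an `open(mode)` member whose result converts to `bool`.

```cpp
// Hypothetical helper, NOT part of libzippp: emulates a "flush" by
// closing the archive and reopening it.
// Assumptions: Archive has close() and open(mode), and the result of
// open(mode) is convertible to bool.
template <typename Archive, typename OpenMode>
bool flushByReopening(Archive& archive, OpenMode mode)
{
    // close() is the point where the buffers passed to addData are
    // actually read and written out, so they can be freed afterwards.
    // The result of close() is ignored here for brevity.
    archive.close();

    // Reopen so that further entries can be added later.
    return static_cast<bool>(archive.open(mode));
}
```

With a `libzippp::ZipArchive` named `zf`, the call would presumably be `flushByReopening(zf, libzippp::ZipArchive::Write)`. Buffers handed to addData before that call can be released once it returns. Keep in mind that every such flush pays the full cost of closing the archive.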

flomnes avatar Jul 17 '22 20:07 flomnes

Hi @flomnes,

Unfortunately, that is how the underlying libzip library works: the data is written when the zip is closed, and libzippp reflects this behavior. I don't see any problem with closing and reopening the ZipArchive; however, I'll consider adding a flush method.

ctabin avatar Jul 25 '22 20:07 ctabin

@ctabin The problem is that in order to keep a valid archive, when adding data to an existing archive libzip creates a copy, writes to that copy and replaces the original. It operates that way to ensure that in case of error, the original will not be corrupted.

If the original is 5 GB and you want to add a few files, it becomes very expensive in terms of disk I/O. I haven't found a way to disable it. On the other hand, minizip-ng seems to write files immediately to the existing archive.

flomnes avatar Jul 26 '22 06:07 flomnes

@flomnes I ran into the same problem in my work, where the total size may reach 10 GB, making the fake flush unacceptable. A possible workaround I adopted is to write the data to a temporary file on disk, then call addFile with the file name as a string parameter.
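A minimal sketch of this temp-file approach follows. The helper names `addDataViaTempFile` and `removeTempFile` are hypothetical. The only assumption about the archive type is an `addFile(entryName, path)` member whose result converts to `bool`, which is how this thread describes addFile. Since libzip defers its reads until close, the temporary file should presumably stay on disk until the archive has been closed.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper, NOT part of libzippp: writes the data to a
// temporary file, then registers that file with the archive instead of
// registering the in-memory buffer.
// Assumption: Archive has addFile(entryName, path) convertible to bool.
template <typename Archive>
bool addDataViaTempFile(Archive& archive,
                        const std::string& entryName,
                        const std::string& data,
                        const std::string& tempPath)
{
    assert(!entryName.empty());
    {
        std::ofstream out(tempPath.c_str(), std::ios::binary | std::ios::trunc);
        if (!out)
            return false;
        out.write(data.data(), static_cast<std::streamsize>(data.size()));
        if (!out)
            return false;
    } // the stream is closed here, so the bytes have reached the file

    // The in-memory buffer can be freed after this call; the temporary
    // file takes over the role of "data that must outlive close()".
    return static_cast<bool>(archive.addFile(entryName, tempPath));
}

// Call this only after the archive has been closed.
inline bool removeTempFile(const std::string& tempPath)
{
    return std::remove(tempPath.c_str()) == 0;
}
```

This trades memory for disk space: the data is written once to the temporary file and once more into the archive at close, but the archive is rewritten only once, however many entries are added.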

Xiangze-Li avatar Nov 21 '23 02:11 Xiangze-Li