Zip the documents
It could be useful to actually zip each document:
- It would bring Paperwork closer to the way OpenDocument files work
- It would make transferring documents easier
- It would reduce the stress on the filesystem
Beware of #124
Will be redundant with #124
Yeah, #124 won't happen for a very long time, so let's go with this for now.
Should only be used on small documents (< 20 pages I guess). It would make image modifications really CPU-intensive / disk-intensive on big documents.
or maybe .tar.gz :)
I’m not sure this is a good idea:
- How will this put less stress on the filesystem? The overall size should be almost the same since PDF and JPEG images don’t compress very well.
- When part of a document is read, the whole zip/tgz will have to be read.
- Same for writing only a part (rotating a page, changing the labels…)
- It will make Paperwork even slower than it is today.
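To illustrate the write concern, here is a minimal sketch of what changing one page inside a zip involves. As far as I know, Python's `zipfile` module can append entries but has no call to replace one in place, so the sketch rebuilds the archive. The entry names (`paper.N.jpg`) and the helper `replace_member` are made up for the example; this is not Paperwork code.

```python
import io
import zipfile


def replace_member(zip_bytes, name, new_data):
    """Rebuild the archive with one entry replaced.

    Returns (new archive bytes, number of entries copied unchanged).
    """
    out = io.BytesIO()
    copied = 0
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as src:
        with zipfile.ZipFile(out, "w", zipfile.ZIP_STORED) as dst:
            for info in src.infolist():
                if info.filename == name:
                    # The modified page (e.g. after a rotation).
                    dst.writestr(name, new_data)
                else:
                    # Every other page has to be copied over as well.
                    dst.writestr(info.filename, src.read(info.filename))
                    copied += 1
    return out.getvalue(), copied


# Hypothetical 3-page document.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as archive:
    for i in range(1, 4):
        archive.writestr("paper.%d.jpg" % i, b"page %d" % i)

new_zip, copied = replace_member(buf.getvalue(), "paper.2.jpg", b"rotated")
```

The cost of a single-page change grows with the number of pages copied, which is why this matters most for big documents.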
> How will this put less stress on the filesystem? The overall size should be almost the same since PDF and JPEG images don’t compress very well.
Fewer files means fewer inodes, plus fewer modification times to check when Paperwork starts.
> When part of a document is read, the whole zip/tgz will have to be read.
I need to check, but I believe zip or tgz (or both) have indexes.
> Same for writing only a part (rotating a page, changing the labels…)
Yes, I know, this is the main problem :/
> It will make Paperwork even slower than it is today.
Yes and no. It would actually make the start time much faster.
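A rough sketch of the start-up argument, assuming the start-up scan checks the modification time of every file in the work directory. The layout (two files per page, named `paper.N.jpg` and `paper.N.words`) and the helper names are assumptions made for the example.

```python
import os
import tempfile


def count_stat_targets(root):
    """Count the files whose modification time a start-up scan would check."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total


def make_loose_layout(root, docs, pages_per_doc):
    """One directory per document, two files per page (hypothetical layout)."""
    for d in range(docs):
        doc_dir = os.path.join(root, "doc%d" % d)
        os.makedirs(doc_dir)
        for p in range(1, pages_per_doc + 1):
            for ext in ("jpg", "words"):
                open(os.path.join(doc_dir, "paper.%d.%s" % (p, ext)), "wb").close()


def make_zipped_layout(root, docs):
    """One (empty placeholder) archive per document."""
    for d in range(docs):
        open(os.path.join(root, "doc%d.zip" % d), "wb").close()


with tempfile.TemporaryDirectory() as loose_root:
    make_loose_layout(loose_root, docs=10, pages_per_doc=5)
    loose_count = count_stat_targets(loose_root)

with tempfile.TemporaryDirectory() as zipped_root:
    make_zipped_layout(zipped_root, docs=10)
    zipped_count = count_stat_targets(zipped_root)
```

With 10 documents of 5 pages each, the loose layout has 100 files to check and the zipped one has 10.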
I had a look:
- .zip files have an index (the central directory) at the end of the file: https://en.wikipedia.org/wiki/Zip_(file_format)#Structure
- .tar.gz files don't have any index
So .zip might be better suited here. I guess it also explains why LibreOffice and Office both use it too.
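For what it's worth, a minimal sketch of reading a single page out of a zip with Python's standard `zipfile` module, which locates entries through that index. The entry names are placeholders, and `ZIP_STORED` (no compression) is my assumption, chosen because JPEG and PDF data is already compressed.

```python
import io
import zipfile


def build_document_zip(pages):
    """Return the bytes of a zip archive holding one entry per page."""
    buf = io.BytesIO()
    # ZIP_STORED: entries are stored without compression.
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as archive:
        for name, data in pages.items():
            archive.writestr(name, data)
    return buf.getvalue()


def list_pages(zip_bytes):
    """List the entries; this only needs the index, not the page data."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return sorted(archive.namelist())


def read_page(zip_bytes, name):
    """Read one entry, looked up by name in the archive's index."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return archive.read(name)


# Placeholder content standing in for real page images.
doc = build_document_zip({
    "paper.1.jpg": b"fake jpeg 1",
    "paper.2.jpg": b"fake jpeg 2",
})
second_page = read_page(doc, "paper.2.jpg")
```

A .tar.gz would need to be decompressed from the start to reach the same entry.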
Note that it would also help reduce fragmentation, which could improve documents load time:
- Ext* file systems try to keep each file contiguous on disk, but separate files may end up anywhere on the hard drive
- When opening a multi-page document, Paperwork usually loads the pages sequentially at first
However, I guess keeping the labels out of the .zip file could be a good idea. No need to rewrite X MB when all you want is to fix the labels that have been guessed.
That would be a reasonable compromise indeed.
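A possible shape for that compromise, as a sketch only: page data goes into a `pages.zip`, while the labels stay in a small plain-text file next to it. The file names `pages.zip` and `labels`, the directory name, and the one-label-per-line format are all assumptions made for the example.

```python
import os
import tempfile
import zipfile


def write_labels(doc_dir, labels):
    """Rewrite only the small labels file (one label per line)."""
    with open(os.path.join(doc_dir, "labels"), "w", encoding="utf-8") as fd:
        for label in labels:
            fd.write(label + "\n")


def read_labels(doc_dir):
    with open(os.path.join(doc_dir, "labels"), encoding="utf-8") as fd:
        return [line.rstrip("\n") for line in fd]


def create_document(doc_dir, pages, labels):
    """Store the pages in pages.zip and the labels beside it."""
    os.makedirs(doc_dir, exist_ok=True)
    zip_path = os.path.join(doc_dir, "pages.zip")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_STORED) as archive:
        for name, data in pages.items():
            archive.writestr(name, data)
    write_labels(doc_dir, labels)


def read_bytes(path):
    with open(path, "rb") as fd:
        return fd.read()


with tempfile.TemporaryDirectory() as work_dir:
    doc_dir = os.path.join(work_dir, "example_doc")
    create_document(doc_dir, {"paper.1.jpg": b"fake jpeg"}, ["invoice"])
    zip_path = os.path.join(doc_dir, "pages.zip")

    zip_before = read_bytes(zip_path)
    write_labels(doc_dir, ["invoice", "bank"])  # fix the guessed labels
    zip_after = read_bytes(zip_path)
    labels_now = read_labels(doc_dir)
```

Fixing the labels then touches a few bytes and leaves the archive untouched.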