Is there any option to improve performance in multi-core environments?

Open nullptr-leo opened this issue 9 years ago • 25 comments

How can I make use of all the CPU cores? Is there any simple way?

nullptr-leo avatar Jan 20 '15 12:01 nullptr-leo

Not for now, but this is definitely on my list. The current bottleneck is the zopfli algorithm, and zopfli itself does not support multiple cores. What I could do is process the files in parallel when more than one file is given, but then the output would be a mess.

JayXon avatar Jan 20 '15 22:01 JayXon

Leanifying PNG takes a long time, as does leanifying DOCX files that contain PNGs. So is there any option to skip leanifying PNGs?

nullptr-leo avatar Jan 21 '15 05:01 nullptr-leo

docx is slow because it's a zip file, and leanify uses zopfli to optimize it. You can speed up leanify by decreasing iterations or using fast mode, but the result will be larger than with the default options.

JayXon avatar Jan 21 '15 05:01 JayXon

@JayXon Exactly. Can I leanify all files in a DOCX (ZIP) except PNG? I created a DOCX document and inserted 3 images: 1.JPG (190K), 2.PNG (595K), 3.JPG (570K). When I leanify this DOCX, the step that leanifies "word/media/image2.png" takes most of the time (~80%). That is to say, leanifying PNG inside a DOCX or ZIP may be the bottleneck, so ignoring PNG is an easy way to improve performance.

Another issue: interrupting the process while it is leanifying a DOCX may damage the file.
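Since a DOCX is a zip archive, one quick way to check how much of it is PNG data is to total the member sizes by extension. The following is only a minimal sketch using Python's standard `zipfile` module; the function name `size_by_extension` is hypothetical, and it only inspects the archive, it does not change what leanify does.

```python
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath


def size_by_extension(docx):
    """Return {extension: total uncompressed bytes} for a DOCX/ZIP.

    `docx` may be a file path or a binary file-like object.
    """
    totals = defaultdict(int)
    with zipfile.ZipFile(docx) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # Zip member names always use forward slashes.
            ext = PurePosixPath(info.filename).suffix.lower()
            totals[ext] += info.file_size
    return dict(totals)
```

A large `.png` total compared with the other extensions would be consistent with the PNG step dominating the run time, although size alone does not prove it.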

nullptr-leo avatar Jan 21 '15 06:01 nullptr-leo

It still makes the result larger though. If I implement this, I could add options like --disable-png. I'll think about it.

Yes, interrupting leanify may damage the file because it's using file mapping.
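Until that changes, one workaround is to let the tool work on a copy and only replace the original once it has finished. This is a sketch under assumptions, not something leanify does itself: `optimize_safely` and the `optimize` callback are hypothetical names, and the callback would typically wrap a call such as `subprocess.run(["leanify", tmp_path], check=True)`.

```python
import os
import shutil
import tempfile


def optimize_safely(path, optimize):
    """Run `optimize(tmp_path)` on a copy of `path`, then swap it in."""
    directory = os.path.dirname(os.path.abspath(path))
    # Keep the temp file next to the original so os.replace stays on one
    # filesystem; keep the extension in case the tool looks at it.
    fd, tmp_path = tempfile.mkstemp(dir=directory,
                                    suffix=os.path.splitext(path)[1])
    os.close(fd)
    try:
        shutil.copy2(path, tmp_path)
        optimize(tmp_path)          # may be interrupted or raise
        os.replace(tmp_path, path)  # only reached if optimize() finished
    finally:
        # After a successful replace the temp file is gone; otherwise
        # clean up the possibly damaged copy.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
```

If the run is interrupted with Ctrl+C, only the temporary copy is affected and the original stays untouched. The cost is one extra copy of the file on disk while the tool runs.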

JayXon avatar Jan 21 '15 07:01 JayXon

The more general solution is to use a file type filter. The option could be -i png. Anyway, there is no need to add these options, if the PNG leanifying performance can be improved :)

nullptr-leo avatar Jan 21 '15 08:01 nullptr-leo

If I add --disable-png, I'll add other types like --disable-xml too. -i is already taken. Currently PNG performance depends heavily on ZopfliPNG, but its author is very busy.

JayXon avatar Jan 21 '15 09:01 JayXon

I think processing several files in parallel (multi-process) would be the easiest way to improve performance without rewriting the internal algorithms.

ertyz avatar May 03 '17 18:05 ertyz

I considered processing multiple files at the same time when more than one file is given to leanify, or when a zip file has multiple files inside, but the problem with that is stdout will be a mess. It's actually easier to just call leanify multiple times on different files.
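Calling leanify once per file from a small wrapper could look roughly like the sketch below. It captures each process's output and returns it in input order, which sidesteps the interleaved stdout problem. The helper name `run_per_file` and the `build_command` callback are hypothetical; the only assumption about leanify is that it accepts a file path as an argument.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_per_file(files, build_command, max_workers=None):
    """Run one process per file in parallel.

    Returns a list of (file, returncode, output) tuples in input order.
    """
    def run_one(path):
        done = subprocess.run(build_command(path),
                              stdout=subprocess.PIPE,
                              stderr=subprocess.STDOUT,
                              text=True)
        return path, done.returncode, done.stdout

    # Threads only wait on the child processes; the heavy work happens
    # in the children, so the cores are actually used in parallel.
    workers = max_workers or os.cpu_count()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_one, files))


# Possible usage (assumes leanify is on PATH):
#   for path, code, output in run_per_file(paths, lambda p: ["leanify", p]):
#       print(f"== {path} (exit {code})")
#       print(output, end="")
```

Because the output is printed only after it has been collected, each file's report stays in one piece, at the price of seeing nothing until the whole batch is done.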

JayXon avatar May 03 '17 19:05 JayXon

> The current bottleneck is zopfli algorithm, but zopfli itself does not support multi core.

@JayXon An optimized version like @MrKrzYch00's fork is multi-threaded & would solve that for you.

TPS avatar May 26 '17 02:05 TPS

His fork uses pthread, which is not standard C/C++, so it won't compile in Visual Studio.

JayXon avatar May 26 '17 03:05 JayXon

https://stackoverflow.com/questions/28975700/how-to-add-pthread-library-to-c-project-in-visual-studio-community-edition seems to have a solution for that, but I'm unsure if that would introduce the need for a DLL dependency, or could be done for leanify's monolithic executable.…

TPS avatar May 26 '17 05:05 TPS

Maybe leanify.exe could run as a host or dispatcher when several file path arguments are passed to it: it would start several console sub-processes, each processing one file, to make full use of the CPU cores.

nullptr-leo avatar May 27 '17 13:05 nullptr-leo

@JayXon now it uses WIN32 threads. However, I'm not sure whether HANDLE, DWORD WINAPI, CreateThread and SetThreadAffinityMask are supported in Visual Studio the same way as in GCC's windows.h. If they are, and VS handles "#ifdef _WIN32" the same way GCC does, then it may be able to compile. Also, not many changes were needed to add the Windows-specific path, at least in GCC.

That said, my multi-threading may be kind of messy... but it works! :)

Also beware of --all: it's actually not supported by my fork and may be removed, as I think it is the cause of some memory leaks. --testrec and using all modes separately might be superior (needs verification), which is also multi-threaded starting from v18.2.13.

MrKrzYch00 avatar Feb 27 '18 16:02 MrKrzYch00

At least, you could fork() for the PNGs in .docx|.pptx|(.pdf). Every processed PNG only outputs one line to stdout, so it won't make a mess, and you would waitpid() for all child processes before finishing up the file.
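The fork()/waitpid() pattern described above can be sketched as follows. This is a POSIX-only illustration with hypothetical names (`run_in_children`, `work`), not leanify's code. Note that a forked child cannot hand its result back through ordinary memory, so a real implementation would also need a pipe or shared mapping to return the optimized PNG; here only the exit code comes back.

```python
import os


def run_in_children(items, work):
    """Fork one child per item; return {item: exit_code} (POSIX only)."""
    pids = {}
    for item in items:
        pid = os.fork()
        if pid == 0:  # child process
            code = 1
            try:
                line = f"{item}: {work(item)}\n"
                # One small write per child, so lines normally stay intact.
                os.write(1, line.encode())
                code = 0
            finally:
                # Exit immediately without running the parent's cleanup
                # handlers or flushing its inherited buffers.
                os._exit(code)
        pids[pid] = item

    results = {}
    for pid, item in pids.items():
        # Block until this particular child has finished.
        _, status = os.waitpid(pid, 0)
        results[item] = os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
    return results
```

The parent returns only after every child has been waited for, which is the point where the containing archive could be finished up.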

vitexikora avatar May 11 '18 16:05 vitexikora

I'm working on multicore leanifying using the https://github.com/taskflow/taskflow library.

Everything is working fine now, but I need more time to clean up the code and fix the console output, which is a mess because of parallel writes. I hope to prepare a pull request before the weekend.

doterax avatar Mar 10 '21 11:03 doterax

Everything is merged. You can use it with the -p option on the command line. Fresh builds are available in the nightly builds section.

doterax avatar Mar 15 '21 23:03 doterax

Anyone got any benchmarks?

@JayXon Any plans for official binaries for such a monumental release?

TPS avatar Mar 16 '21 00:03 TPS

I still need to update some libraries and I'll cut a new release after that.

Do note that this is only parallel on the file level, so it only works if you pass leanify multiple files or a directory.

JayXon avatar Mar 16 '21 02:03 JayXon

@TPS did you try nightly builds for parallel leanifying?

doterax avatar Mar 28 '21 15:03 doterax

@doterax No, I use leanify solely via various FileOptimizer toolchains, which run 1 file @ a time, so this improvement would do nothing there currently.

TPS avatar Mar 28 '21 18:03 TPS

@JayXon you can use external oxipng for very fast multi threaded PNG optimization:

https://github.com/shssoichiro/oxipng

murlakatamenka avatar Apr 30 '22 04:04 murlakatamenka

> @JayXon you can use external oxipng for very fast multi threaded PNG optimization:
>
> https://github.com/shssoichiro/oxipng

Optipng and its Rust port oxipng use deflate as the compression algorithm. Leanify, on the other hand, uses zopfli as the compression algorithm, which gives better compression results.

doterax avatar May 02 '22 12:05 doterax

Okay, as I understand it, the minimal size of the produced artifact is more important than the time spent on optimization, so zopfli would be better despite taking drastically more time.

murlakatamenka avatar Jun 01 '22 05:06 murlakatamenka

> Okay, as I understand the minimal size of the produced artifact is more important than time spent for optimization, thus zopfli would be better despite taking drastically more time.

You can use the --parallel switch to distribute the processing of files across CPUs. But this doesn't work for a single file; you need to provide a batch of files to process.

doterax avatar Oct 13 '22 19:10 doterax