rpmlint icon indicating copy to clipboard operation
rpmlint copied to clipboard

Support skipping warning checks and only evaluate checks that result in errors

Open Conan-Kudo opened this issue 7 years ago • 7 comments

In cases where rpmlint is operating as a post-build check, sometimes the load from warnings is too much to consume. We should have a capability to just only run critical checks that result in errors so that rpmlint is faster and gives more focused feedback in those cases.

Conan-Kudo avatar Jun 07 '18 12:06 Conan-Kudo

I tried to kinda draw the logic how it could behave in order for the threading to work:

20180712_145415

Basic logic is to have testIDs prior we start doing anything, then load all data to memory and in parallel run all area checks and in them all individual tests that are enabled in threaded mode.

scarabeusiv avatar Jul 12 '18 12:07 scarabeusiv

That sounds pretty solid and should speed things up considerably. I'm guessing there'd be something to queue up the results so they can be printed in some kind of ordered fashion?

Conan-Kudo avatar Jul 12 '18 15:07 Conan-Kudo

That could be done in a buffer you store all the errors and results to be printed just at the end.

scarabeusiv avatar Jul 12 '18 17:07 scarabeusiv

Also with this layout we can have every distro related thing in here while it can be disabled/adopted by the other distributions :)

scarabeusiv avatar Jul 12 '18 17:07 scarabeusiv

We should also optimize the extractor as we spent a lot of time unpacking content from cpio archives.

scarabeusiv avatar Jul 12 '18 19:07 scarabeusiv

I'm trying to understand the primary objective here. It seems it is performance, e.g. reducing runtime in certain contexts, e.g. in our post build validation.

In my python profile runs, the runtime of more than >90% of the individual tests is in the noise. The bulk of time is spent on rpm unpacking, file(1)/magic determination and all the binariesdump stuff like the objdump madness. My suggestion is that we separate those in a category (currently implementation that would be an AbstractCheck inherited class that does one or more checks that are based on a similar class of errors ). Some of the runtime intensive checks are fairly obscure (like the exit/getaddrinfo) and could be removed in the current model by breaking them out and making them removable. I'd be in favor of removing them then from the openSUSE build.

My concern with this is that it is less flexible. For example ignoring 'i want to ignore static files in packages called ocaml- if they're hitting for files under /usr/lib/ocaml' is possible with the current filtering system.

In normal source code linters such exceptions can be done by annotating the source code. With binaries that isn't possible. Hence the need for a very flexible filtering, with some downsides on performance, which we have to accept imho.

What could be done instead is specifying that certain large files or dirs should be skipped from execution on checks or that we run each AbstractCheck instance in a separate the thread. I tried that a few years ago and didn't succeed in a runtime improvement but it's worth trying it out again.

dirkmueller avatar Jul 12 '18 20:07 dirkmueller

An area worth investigating is whether we can parallelize the rpm unpacking or file(1) determination. That would have quicker return on investment.

dirkmueller avatar Jul 12 '18 20:07 dirkmueller