cppcheck fix #12771: Progress value is not increased

Nov 11 '24 09:11 ludviggunne

I have similar code which deals with the language. I guess the basic logic can be re-used/combined from the changes.

Nov 12 '24 13:11 firewave

This also needs tests for the cases the progress was not correctly reported.

Nov 13 '24 10:11 firewave

I have similar code which deals with the language. I guess the basic logic can be re-used/combined from the changes.

Hm, I'm not sure what you mean, do you have a PR for it?

Nov 14 '24 13:11 ludviggunne

Hm, I'm not sure what you mean, do you have a PR for it?

Not yet - #6978 is in preparation of the change I referred to. But I am not too sure about it - the context-less usages (i.e. no file, no language) which will only lead to issue with specific usage are giving me headaches. Maybe I use your use case to proceed with this.

Nov 16 '24 15:11 firewave

FYI when we need to test something related to inputs there is three different cases to consider:

files specified via CLI
files specified via a .cppcheck project (basically emulates the CLI options but sometimes has its own logic)
any other project (for ease that would be a compilation database)

Nov 20 '24 12:11 firewave

FYI when we need to test something related to inputs there is three different cases to consider:

Okay, thanks! I'll be focusing on other tickets for a while so I'll pick this up at a later time.

Nov 23 '24 16:11 ludviggunne

I'll be focusing on other tickets for a while so I'll pick this up at a later time.

Don't end up like me... 🙄

Nov 25 '24 14:11 firewave

I would prefer to have the code which gets the filesize in path.cpp.

Dec 20 '24 21:12 firewave

I noticed the percentage reported in SingleExecutor for FileSettings is based on the number of files instead of the file sizes, is this intentional?

https://github.com/danmar/cppcheck/blob/85eadd8572a05c320ab972668fc2765edd6447be/cli/singleexecutor.cpp#L67

Dec 22 '24 17:12 ludviggunne

I noticed the percentage reported in SingleExecutor for FileSettings is based on the number of files instead of the file sizes, is this intentional?

No idea - I would need to look at the history.

But I wonder why we calculate it based on on the file size anyways. That makes no sense because we are using the size of the actual input file. A file with just a few lines itself might end up with much more code which is being analyzed. So that would only make sense if we used the size of preprocessed code.

I think we should just consider the amount of files.

Dec 22 '24 17:12 firewave

I think we should just consider the amount of files.

That makes sense. I'm not sure how the workflow would look for that, should it be it's own ticket?

Dec 22 '24 18:12 ludviggunne

But I wonder why we calculate it based on on the file size anyways. That makes no sense because we are using the size of the actual input file. A file with just a few lines itself might end up with much more code which is being analyzed. So that would only make sense if we used the size of preprocessed code.

Originally it was number of files..

Then at some point we were discussing some project in the chat. I was running Cppcheck on the project. I had checked a number of the files. So the report said "X% done". And in the chat I said something like "I have checked X% already but well there are big files remaining so probably it's going to take longer time." Then somebody in the chat suggested that we would base "X% done" on the file sizes.

It's still a very rough estimate. As you pointed out it depends a lot on what headers are included. And it can depend a lot on how many branches etc there are. A huge file with trivial code can run quickly.

I don't have super strong opinion.. but generally.. based on number of files the guess will be off more imho. A small file would generally go faster to analyse than a large file.

Dec 27 '24 09:12 danmar

A small file would generally go faster to analyse than a large file.

That is my "best guess".. I can't prove it. :-)

Dec 27 '24 10:12 danmar

I don't have super strong opinion.. but generally.. based on number of files the guess will be off more imho. A small file would generally go faster to analyse than a large file.

Would it be possible to base it on the length of the token list? Anyway, I suppose we can stick to filesizes in this PR?

Dec 27 '24 11:12 ludviggunne

Would it be possible to base it on the length of the token list?

nah.. I believe it would be more accurate but that would increase the analysis time. we would need to open all the files and count all the tokens to know the total number of tokens.

Dec 27 '24 13:12 danmar

The value is arbitrary and inaccurate either way - so I would go with the simpler implementation i.e. file count only. It would also match the default ('-j1') behavior.

Dec 27 '24 13:12 firewave

I guess we actually have both right now, with 1/3 files checked 42%.

Dec 29 '24 15:12 ludviggunne

I guess we actually have both right now, with 1/3 files checked 42%.

Fair enough.

Dec 30 '24 16:12 firewave

I guess we actually have both right now, with 1/3 files checked 42%.

Fair enough.

I still think we should drop the file size percentage. Looking at how much time and discussion has already been put into this and it is something a high percentage of the users will probably never see as they use headless systems (i.e. CI/IDE integrations). It seems the time might be better spent on something else than making an arbitrary value arbitrary in a different way...

Jan 09 '25 12:01 firewave

I still think we should drop the file size percentage

I don't want to drop it.

Jan 10 '25 19:01 danmar

A different PR for this ticket has been merged.

Nov 15 '25 12:11 ludviggunne

Ooooooooh. This was never merged - no wonder I couldn't find the changes which I was referring to in https://github.com/danmar/cppcheck/pull/7956#issuecomment-3521535684.

BTW the other PR is #7956.

Nov 17 '25 13:11 firewave