cppcheck
Fix #10412 FN useStlAlgorithm with iterators
Sorry for going off-topic but we really need to take a strong look at this check. We currently do not apply these findings to our code and if we are not able to dogfood it we cannot expect others to use it.
IIRC there are various issues (these also apply to readability-use-anyofallof and related checks from clang-tidy):
- the readability might be worse
- a conversion is not straightforward or not even possible at all
- it might actually perform worse (unfortunately several C++ implementations are slower than the raw or C ones)
Sorry for going off-topic but we really need to take a strong look at this check. We currently do not apply these findings to our code and if we are not able to dogfood it we cannot expect others to use it.
Then maybe we should enable it? There are some pretty verbose and ugly loops in our code base...
IIRC there are various issues (these also apply to readability-use-anyofallof and related checks from clang-tidy):
- the readability might be worse
I agree, especially in relation to std::transform.
- a conversion is not straightforward or not even possible at all
Then that's a FP.
- it might actually perform worse (unfortunately several C++ implementations are slower than the raw or C ones)
This could also be called an optimization/implementation bug.
We currently do not apply these findings to our code and if we are not able to dogfood it we cannot expect others to use it.
We should. I am not sure why we haven't. Perhaps we need generic lambdas or something?
I use it on my code bases and it works quite well. And I have had emails from users that found it quite useful. It helped to improve readability and even found some bugs.
these also apply to readability-use-anyofallof and related checks from clang-tidy
readability-use-anyofallof just checks for escape and non-mutation, which can find loops that might not be straightforward to transform.
the readability might be worse
It is confusing when there is transform_reduce, since older C++ versions do not provide it, and technically you can use std::accumulate.
However, overall it does improve readability, and you can fix the other issues by providing your own transform_reduce.
a conversion is not straight forward or not even possible at all
That may be applicable to clang-tidy, but I don't think that is the case here. Plus, I haven't seen an issue like this either.
Then maybe we should enable it? There are some pretty verbose and ugly loops in our code base...
+1 We should also enable the clang-tidy checks and take a look at them. And profile the version with the changes to see if there is some impact at all. After the past few months I don't trust C++ (convenience) anymore...
I agree, especially in relation to std::transform.
Maybe we should drop these or make them "inconclusive"?
Then that's a FP.
From experience (and faint memory) the findings make sense, but trying to fix some of them will drive you mad, or the code is simply not the same, or it just doesn't work. That could just be the ugly std::transform ones though... or ones from clang-tidy - it's been a while...
- it might actually perform worse (unfortunately several C++ implementations are slower than the raw or C ones)
This could also be called an optimization/implementation bug.
Yes.
std::clamp() (also a pattern to detect) - https://trac.cppcheck.net/ticket/10693 / https://github.com/llvm/llvm-project/issues/55098
And no.
std::string::find_first_of(): - https://trac.cppcheck.net/ticket/10777
std::copy() - https://trac.cppcheck.net/ticket/10706
It seems there are certain restrictions for C++ algorithms which do not apply to C implementations and obviously not to self-rolled implementations.
readability-use-anyofallof just checks for escape and non-mutation, which can find loops that might not be straightforward to transform.
Probably something to report upstream if we come across it in our code.
That could just be the ugly std::transform ones though
What's the problem with std::transform?
std::string::find_first_of(): - https://trac.cppcheck.net/ticket/10777
We don't suggest this.
std::copy() - https://trac.cppcheck.net/ticket/10706
We suggest std::copy as it is the safest choice, and it will work functionally the same as the loop. Users can still replace it with unsafe versions if it proves to be faster.
That doesn't seem to be reason enough to stop suggesting std::copy.
std::string::find_first_of(): - https://trac.cppcheck.net/ticket/10777
We don't suggest this.
This was a list of examples where modern code is not an improvement.
std::regex (unfortunately no simple C or raw counterpart) and std::stringstream also come to mind.
We suggest std::copy as it is the safest choice, and it will work functionally the same as the loop. Users can still replace it with unsafe versions if it proves to be faster.
We do suggest that? I wasn't aware of that.
That doesn't seem to be reason enough to stop suggesting std::copy.
In the past I had a slow function in the red-hot path which used that function several times (we had modernized it from memcpy to std::copy at some point) - there were bigger offenders. Somehow I didn't see, or missed, that it made it slower.
I think if there is a known trade-off (like security for performance) it should be documented. But in the case of std::copy it also changes the behavior, since you essentially replace memcpy with memmove.
Or if you use an algorithm that improves readability but suddenly uses double loops (like std::string::find_first_of()) or goes from linear to exponential growth, it is not a "safe" replacement, since it also changes the behavior.
Similar to range-based for loop usage from modernize-loop-convert when a const_iterator is involved. The resulting loop will still have a const object but will utilize a regular iterator, which might break if that is not available, or it might change the behavior.
modernize-use-equals-default is another example as = default changes the behavior and can lead to code breakage since it is not the same as an empty implementation.
Unlike clang-tidy there's no way to fine-tune such checks with Cppcheck - which I think is a good thing but we also need to be careful what we report.
If we need to have performant code (which you always should have) and suddenly need to plaster the code with suppressions, you won't be happy about it. clang-tidy recently added several checks which you simply cannot enable, since they fire for generic patterns which might cause problems but are actually extremely unlikely to do so, and they need very careful review and possibly lots of suppressions. They seem more like some very specific internal requirement built as a clang-tidy check.
You should never blindly just change your code but if a finding results in a major headscratcher even for quite experienced devs it might be the best to report it.
We really need to open the discussion section of the repo so tickets are not being hijacked for discussions... @danmar
That could just be the ugly std::transform ones though
What's the problem with std::transform?
for (auto& i : v)
i += 5;
vs.
std::transform(v.begin(), v.end(), v.begin(), [](auto i) {
return i + 5;
});
doesn't seem to improve readability.
std::transform(v.begin(), v.end(), v.begin(), [](auto i) { return i + 5; }); doesn't seem to improve readability.
This also seems prone to messing up the parameter order - especially in the four parameter version.
The first example on https://en.cppreference.com/w/cpp/algorithm/transform is particularly cringeworthy...
In the past I had a slow function which was in the red hot path which was using that function several times (we modernized it from memcpy to std::copy at some point)
We don't suggest using std::copy instead of memcpy, so I am not sure how this is relevant.
clang-tidy recently added several checks which you simply cannot enable
Also, I don't see how all these clang-tidy issues are relevant here at all, since we do not have these equivalent "modernize" checks.
This also seems prone to messing up the parameter order - especially in the four parameter version.
Which is not relevant here since we never suggest std::transform when the four parameter version is needed.
The first example on https://en.cppreference.com/w/cpp/algorithm/transform is particular cringeworthy...
I don't find it cringeworthy. It would be better if it were more explicit that it is doing the transform in place; that could improve readability. However, it still makes the intention of the code quite clear. Plus it's easy to make it run in parallel.
I don't find it cringeworthy. It would be better if it were more explicit that it is doing the transform in place; that could improve readability. However, it still makes the intention of the code quite clear. Plus it's easy to make it run in parallel.
A parallel_for would have been sufficient...
Imho the STL algorithms are not that readable. It depends on the reader.
for (const std::string & p: suffixes) {
if (isPrefixStringCharLiteral(str, q, p))
return true;
}
=>
return std::any_of(suffixes.begin(), suffixes.end(),
[q](const std::string& p) {
return isPrefixStringCharLiteral(str, q, p);
});
It would be more readable if the replacement was more like:
return any_of(suffixes, p, isPrefixStringCharLiteral(str,q,p));
It would be more readable if the replacement was more like:
return any_of(suffixes, p, isPrefixStringCharLiteral(str,q,p));
LLVM has such helpers for many algorithms. Probably because of the readability among other things.
C++20 improves things a little:
bool f(const std::vector<std::string>& v) {
return std::ranges::any_of(v, [](const auto& s) { return s.empty(); });
}
But this PR is still about flagging iterators, which also make for unreadable code.
It would be more readable if the replacement was more like:
return any_of(suffixes, p, isPrefixStringCharLiteral(str,q,p));
I don't think that would be more readable, as it's not clear that you are declaring p as a variable.
Do you think a range-based any_of is less readable?
return any_of(suffixes, [&](const std::string& p) {
return isPrefixStringCharLiteral(str, q, p);
});
This doesn't seem any more verbose than the original for loop, plus it makes the intention clearer.
HOFs could make it even more succinct, such as boost::hof::partial:
return any_of(suffixes, partial(&isPrefixStringCharLiteral)(str, q));
Or boost::hof::lazy:
return any_of(suffixes, lazy(&isPrefixStringCharLiteral)(str, q, _1));
Which is like std::bind so it could be written like this in C++11:
return any_of(suffixes, std::bind(&isPrefixStringCharLiteral, str, q, _1));
Since we don't have C++20 yet to use a range-based any_of, we could add simple wrappers around it:
template <class C, class Predicate>
bool any_of(const C& c, const Predicate& p)
{
return std::any_of(c.begin(), c.end(), p);
}
Although, the goal of this check is not just readability. It's to find bugs in code. If the intention of the loop is not any_of then there is a mistake in the code (and I have made this mistake in the past). Of course it could be a little more verbose (for iterator-based algorithms at least), but at the same time it's more verbose to use const or explicit (which we warn about). The readability improvement comes from making the intentions of the code clear, and if that wasn't the intention of the code then cppcheck has found a bug.
But this PR is still about flagging iterators, which also make for unreadable code.
That's true; using the algorithm in this case is not any more verbose than the original loop.
Do you think a range-based any_of is less readable?
That is much better. I guess the readability for that code is similar to the readability of the for loop to my eyes. It is annoying that the lambda stuff is needed.
Those boost::hof and std::bind examples are confusing and not readable at all to my eyes. :-) So the lambda is better than those options.
That is much better. I guess the readability for that code is similar to the readability of the for loop to my eyes.
Sure, I guess it's a matter of opinion, although I disagree. I think it's fine if we disable this for the cppcheck code then, but I don't see that as reason enough to remove it from cppcheck, since it can find bugs and is highly recommended by many prominent C++ developers.
It is annoying that the lambda stuff is needed.
Yeah, I don't see it as annoying. In fact, since I have started using STL algorithms on a regular basis, especially with cppcheck scolding me about it, it has become quite natural to write it this way. Perhaps others may not be as familiar with algorithms or lambdas, so they would not see this as better.
We have flags about which C++ version to use, and we only warn about the loops with C++14 or later. Perhaps we could have flags for certain features users would like to avoid or are less familiar with, so we could adjust the checks based on that.
Those boost::hof and std::bind examples are confusing and not readable at all to my eyes.
Sure, another idea is to assign the lambda to a variable:
auto hasStrPrefix = [&](const std::string& p) {
return isPrefixStringCharLiteral(str, q, p);
};
return any_of(suffixes, hasStrPrefix);
I think if there is a known trade-off (like security for performance) it should be documented. But in the case of std::copy it also changes the behavior, since you essentially replace memcpy with memmove.
I assume it's easier to find places to replace std::copy with memcpy than trying to find raw loops. Of course, they can't be blindly replaced, so an advanced developer that understands the constraints can make the memcpy replacement (we could add a check for this as well in the future).
If it's very important to use memcpy or strbk, then it would be important to warn about raw loops like we do, so an advanced developer who understands the code and constraints can replace the raw loop with the unsafe version.
I am sure if you see cppcheck complaining about a raw loop needing to use std::copy then you will consider memcpy if it's possible. So ultimately, it seems like an argument to keep this check.
If it's very important to use memcpy or strbk, then it would be important to warn about raw loops like we do, so an advanced developer who understands the code and constraints can replace the raw loop with the unsafe version.
There are some important keywords in here: "advanced", "constraints" and "unsafe".
I personally was not aware of some of these constraints at all, nor that those might actually exist for safety reasons (and it seems some of my former colleagues weren't either). It just seemed like a modern replacement.
And making code easier/safer by default is fine by me. The trade-off should be reasonable though. Maybe some notes/clarification about this in the documentation would not be the worst thing.
We should also consider https://github.com/danmar/cppcheck/blob/main/philosophy.md. Still, having "raw" code (the C version of something available in C++, or a self-rolled algorithm) which is by "design" (e.g. more performant) "safe" (it has been peer reviewed) being flagged might be an annoyance - but it can be suppressed, which would also serve as an annotation for this "unsafe" code. So that should be fine for an "advanced developer".
I might have missed my points here though. 😴
So ultimately, it seems like an argument to keep this check.
Well, my main argument was about "dogfooding". I haven't gotten around yet to having a look at what we actually warn about. I was just working from memory. The rest was just me straying off into the woods. 🐇🕳
Any more feedback on this?
Any more feedback on this?
My opinion is still the same. Looking at the code I also wonder if this might impact the performance.
So I made this mergeable once more. How about it?