abseil-cpp icon indicating copy to clipboard operation
abseil-cpp copied to clipboard

Abseil is not -Wsign-conversion clean

Open jfroy opened this issue 7 years ago • 16 comments

https://github.com/abseil/abseil-cpp/blob/master/absl/types/span.h#L282

jfroy avatar Nov 03 '17 00:11 jfroy

This poses a bigger question of whether Abseil should be -Wsign-conversion clean. I can easily write a static_cast here, but there are still a bunch of other warnings

absl/strings/escaping.cc: In function 'int absl::{anonymous}::hex_digit_to_int(char)': absl/strings/escaping.cc:67:32: warning: conversion to 'unsigned char' from 'char' may change the sign of the result [-Wsign-conversion] assert(absl::ascii_isxdigit(c)); ^ absl/strings/escaping.cc: In function 'bool absl::{anonymous}::CUnescapeInternal(absl::string_view, bool, char*, ptrdiff_t*, std::string*)': absl/strings/escaping.cc:132:32: warning: conversion to 'unsigned int' from 'int' may change the sign of the result [-Wsign-conversion] unsigned int ch = *p - '0'; ^ absl/strings/escaping.cc:133:71: warning: conversion to 'unsigned int' from 'const char' may change the sign of the result [-Wsign-conversion] if (p < last_byte && is_octal_digit(p[1])) ch = ch * 8 + *++p - '0'; ^ absl/strings/escaping.cc:135:30: warning: conversion to 'unsigned int' from 'const char' may change the sign of the result [-Wsign-conversion] ch = ch * 8 + ++p - '0'; // now points at last digit ^ absl/strings/escaping.cc:139:55: warning: conversion to 'std::basic_string::size_type {aka long unsigned int}' from 'long int' may change the sign of the result [-Wsign-conversion] std::string(octal_start, p + 1 - octal_start) + ^ absl/strings/escaping.cc:148:46: warning: conversion to 'size_t {aka long unsigned int}' from 'ptrdiff_t {aka long int}' may change the sign of the result [-Wsign-conversion] memcpy(d, octal_start, octal_size); ^ absl/strings/escaping.cc:160:48: warning: conversion to 'unsigned char' from 'char' may change the sign of the result [-Wsign-conversion] } else if (!absl::ascii_isxdigit(p[1])) { ^ absl/strings/escaping.cc:166:60: warning: conversion to 'unsigned char' from 'char' may change the sign of the result [-Wsign-conversion] while (p < last_byte && absl::ascii_isxdigit(p[1])) ^ absl/strings/escaping.cc:168:51: warning: conversion to 'unsigned int' from 'int' may change the sign of the result [-Wsign-conversion] ch = (ch << 4) + hex_digit_to_int(++p); ^ absl/strings/escaping.cc:171:69: warning: conversion to 'std::basic_string::size_type {aka long unsigned int}' from 'long int' may change the sign of the result [-Wsign-conversion] error = "Value of \" + std::string(hex_start, p + 1 - hex_start) + ^ absl/strings/escaping.cc:180:42: warning: conversion to 'size_t {aka long unsigned int}' from 'ptrdiff_t {aka long int}' may change the sign of the result [-Wsign-conversion] memcpy(d, hex_start, hex_size); ^ absl/strings/escaping.cc:194:53: warning: conversion to 'std::basic_string::size_type {aka long unsigned int}' from 'long int' may change the sign of the result [-Wsign-conversion] std::string(hex_start, p + 1 - hex_start); ^ absl/strings/escaping.cc:200:42: warning: conversion to 'unsigned char' from 'char' may change the sign of the result [-Wsign-conversion] if (absl::ascii_isxdigit(p[1])) { ^ absl/strings/escaping.cc:201:57: warning: conversion to 'unsigned int' from 'int' may change the sign of the result [-Wsign-conversion] rune = (rune << 4) + hex_digit_to_int(++p); // Advance p.

...

and so on for a while. I think this is the reason that we don't use this option internally -- by the time we started compiling with -Werror there were way too many of these to fix.

JonathanDCohen avatar Nov 03 '17 15:11 JonathanDCohen

As a simple experiment to look at the size of the problem, which is large:

$ bazel clean; bazel build absl/... --copt=-Wsign-conversion 2>/tmp/sign-warnings
$ wc /tmp/sign-warnings -l
5466 /tmp/sign-warnings

$ bazel clean; bazel build absl/... 2>/tmp/nosign-warnings
$ wc /tmp/nosign-warnings -l
15 /tmp/nosign-warnings

JonathanDCohen avatar Nov 03 '17 15:11 JonathanDCohen

Like I said internally - I could swear we did some of the sign conversion cleanup before release, so I'm surprised that this is busted. But at the same time, with those counts, it's gonna be a big silly/messy cleanup to resolve it all. I'm sorta torn.

tituswinters avatar Nov 03 '17 15:11 tituswinters

We have a -Weverything -Werror -Wno-... codebase, so we're used to dealing with these sorts of issues. Our blanket hammer is to flag a subtree as system headers. In some cases we'll add push/pop diagnostics. tl;dr; we can work around this for sure and it's not blocking us. Perhaps making the headers clean may be worth considering, if not too big an effort.

I am not an expert, but I assume this conversion to saturate npos is not UB?

jfroy avatar Nov 03 '17 16:11 jfroy

@jfroy - As a library that expects to be at/near the bottom of the dependency stack, I think we ought to be as warning free as possible. So I'd like to say we should just clean it up ... but bleh.

Wraparound on unsigned ints is defined, so the single item that motivated this is "fine". Largely my hesitation is that the vast majority of the hits on this warning are false-positives. (No good answer.)

tituswinters avatar Nov 03 '17 16:11 tituswinters

Perhaps this should be marked as low hanging fruit for the community to help out while you guys can spend time on more impactful matters.

brenoguim avatar Nov 03 '17 16:11 brenoguim

@brenoguim to clarify Titus' statement, we probably don't want to litter Abseil with static_casts everywhere if most of them are to clear up false positives from the compiler. Automatically compiling Abseil TUs with +Wsign-conversion or -Wnosign-conversion (can't remember off the top of my head how to disable a warning) seems wrong too, however.

JonathanDCohen avatar Nov 03 '17 16:11 JonathanDCohen

Oh yes, of course. I just figured there might be ways to rework the code so these conversions do not happen or happen in such way the compiler doesn't complain. But I haven't looked into the scenarios, so I wouldn't know. Also it's not always possible or would require worse code juggling than simply adding the static_cast.

brenoguim avatar Nov 03 '17 16:11 brenoguim

absl/strings/escaping.cc:67:32: warning: conversion to 'unsigned char' from 'char' may change the sign of the result [-Wsign-conversion] assert(absl::ascii_isxdigit(c));

This one already has a static_cast on the following line (escaping.cc:68), so it could be possible to rework the code a bit and reuse that static_cast.

absl/strings/escaping.cc: In function 'bool absl::{anonymous}::CUnescapeInternal(absl::string_view, bool, char*, ptrdiff_t*, std::string*)': absl/strings/escaping.cc:132:32: warning: conversion to 'unsigned int' from 'int' may change the sign of the result [-Wsign-conversion] unsigned int ch = *p - '0'; ^ absl/strings/escaping.cc:133:71: warning: conversion to 'unsigned int' from 'const char' may change the sign of the result [-Wsign-conversion] if (p < last_byte && is_octal_digit(p[1])) ch = ch * 8 + *++p - '0'; ^ absl/strings/escaping.cc:135:30: warning: conversion to 'unsigned int' from 'const char' may change the sign of the result [-Wsign-conversion] ch = ch * 8 + *++p - '0'; // now points at last digit

These other three could use a function to avoid repeating *p - '0'.

But anyway, I'm certain it's not possible to rework all of them, but some can, and may even lead to slightly more readable code.

brenoguim avatar Nov 03 '17 16:11 brenoguim

Some of these will be super noisy to clean up, given that some large code bases use -funsigned-char (cough cough), and depending on that these warnings will trigger one way or the other.

r4nt avatar Nov 06 '17 09:11 r4nt

This can't be the only library to suffer from this. I wonder what other people are doing. Boost seems to have a "numeric_cast" which is a static_cast with bound checking. But I'm not sure if they use internally on their own code.

Despite being a silly issue and potentially with many false positives, it sparked my interest. I don't want to believe it's not possible to write readable and warning free C++.

brenoguim avatar Nov 06 '17 15:11 brenoguim

It is generally impossible to write readable warning free C++ for an arbitrary definition of warning. Usually compilers do not warn on system headers that are included via -isystem. That way, people can switch on whatever warnings they think make sense in their own projects without hurting downstream users from switching on their warnings.

r4nt avatar Nov 06 '17 15:11 r4nt

Yeah, we use the -isystem for the boost library as well. I agree it's not possible to do a completely warning free code. On the other hand, I believe it's possible to do better. Just "giving up" on these warnings could also lead to less readable code and hide bugs.

Anyway, I'll stop guessing and try to come up with a PR that shows my point. :)

brenoguim avatar Nov 07 '17 15:11 brenoguim

@tituswinters I propose that we follow @brenoguim's idea and mark this as "help wanted" and "good first issue" and expand the scope of the issue to just in general being about when Abseil code emits compiler warnings. Most of these are easy to fix and the only decider is whether or not the fix is something we want to accept. Furthermore a lot of these are hard to test for internally where we have tightly controlled compiler flags in our toolchain.

JonathanDCohen avatar Nov 09 '17 16:11 JonathanDCohen

or you can/should just test it on kokoro once the export from Google code base is done ;)

Mizux avatar Mar 28 '18 09:03 Mizux

@Mizux I'm not sure what you're suggesting. We can't have a kokoro test for every possible compiler configuration. Could you clarify?

JonathanDCohen avatar Mar 28 '18 15:03 JonathanDCohen