mbedtls icon indicating copy to clipboard operation
mbedtls copied to clipboard

Warnings from GCC 15 -Wunterminated-string-initialization

Open billatarm opened this issue 10 months ago • 12 comments

Summary

Some of the string literals use the size of the array to specify an array of size N where N is the length of the string and so things like sizeof() will truncate the NULL byte.

Warning message Example:

/builddir/build/BUILD/mbedtls3.6-3.6.2-build/mbedtls-3.6.2/library/ssl_tls13_keys.h:20:44: error: initializer-string for array of ‘unsigned char’ is too long [-Werror=unterminated-string-initialization]
   20 |     MBEDTLS_SSL_TLS1_3_LABEL(c_hs_traffic, "c hs traffic") \

Code Examples to reproduce:

char foo[3]="foo";
printf("%s-%zu\n", foo, sizeof(foo);
foo - 3

This waring can be enabled via: -Wunterminated-string-initialization

I have attached the build log.

build.log

System information

Mbed TLS version (number or commit id): 3.6.2 Operating system and version: F32 with gcc-15 Configuration (if not default, please attach mbedtls_config.h): Compiler and options (if you used a pre-built binary, please indicate how you obtained it): Additional environment information:

Expected behavior

Builds

Actual behavior

Steps to reproduce

Additional information

billatarm avatar Jan 30 '25 18:01 billatarm

I have patches if you want them.

billatarm avatar Jan 30 '25 18:01 billatarm

There is no undefined behavior here. MBEDTLS_SSL_TLS1_3_LABEL deliberately creates an array containing the string literal without a null byte. Arrays of bytes (unsigned char[]) can contain any byte value, including having all bytes be nonzero. Undefined behavior would only take place if such an array was passed to a string function strxxx() from <string.h> or other functions that expect a null-terminated string. The array constructed here is not a null-terminated string and is not passed to functions that expect a null-terminated string.

In this particular case, adding a null byte would not be compliant to the protocol specification.

This waring can be enabled via: -Wunterminated-string-initialization

Note — only with GCC 15, which is not released yet.

GCC 15's new warning -Wunterminated-string-initialization triggers many false positives on our code base. I had a quick look and none of them look like a true positive.

We do aim to compile with no warnings at a reasonable warning level with common compilers. This will include GCC 15 with -Wall -Wextra once GCC 15 is released.

Patches to avoid triggering the warning would be welcome. But they obviously need to be correct! And in most cases, for this warning, in our code base, the warning is a false positive. So we need to find a way to let GCC know that the array is intended to be a byte array and not a null-terminated string.

gilles-peskine-arm avatar Jan 30 '25 18:01 gilles-peskine-arm

There is no undefined behavior here. MBEDTLS_SSL_TLS1_3_LABEL deliberately creates an array containing the string literal without a null byte.

I understand the intention of the code. This wasn't discussed in C89 (undefined bahavior), but the C99 spec did cover it:

EXAMPLE 8 The declaration
char s[] = "abc", t[3] = "abc";
defines ‘‘plain’’ char array objects s and t whose elements are initialized with character string literals.
This declaration is identical to
char s[] = { 'a', 'b', 'c', '\0' },
t[] = { 'a', 'b', 'c' };
The contents of the arrays are modifiable. On the other hand, the declaration ...

We should file a bug to GCC if it's not filed yet.

billatarm avatar Jan 31 '25 14:01 billatarm

I would like to point out though, the original author wasn't so sure either of the behavior on this:

 struct mbedtls_ssl_tls13_labels_struct const mbedtls_ssl_tls13_labels =
 {
    /* This seems to work in C, despite the string literal being one
     * character too long due to the 0-termination. */

billatarm avatar Jan 31 '25 15:01 billatarm

This is well-defined behavior in all versions of the C language, even back to K&R C AFAIR. Using a string literal as an initializer adds a null byte if there is room. The array size can be left empty, in which case the size includes a terminating null byte.

The following examples illustrate array initialization (hopefully I didn't make any copypasta). All are fully defined in all versions of the C language.

    /* The following initialize 4 non-null bytes: */
    char explicit_braces_array[4] = {'a', 'b', 'c', 'd'};
    char implicit_braces_array[] = {'a', 'b', 'c', 'd'};
    char explicit_quotes_array[4] = "abcd";

    /* The following initialize 4 bytes, the last one being null: */
    char explicit_braces_array0[4] = {'a', 'b', 'c', 0};
    char explicit_braces_array_partial[4] = {'a', 'b', 'c'};
    char implicit_braces_array0[] = {'a', 'b', 'c', 0};
    char explicit_quotes_string[4] = "abc";
    char explicit_quotes_string0[4] = "abc\000";
    char implicit_quotes_string[] = "abc";

    /* The following initialize 4 bytes, the last two being null: */
    char explicit_braces_array00[4] = {'a', 'b', 0, 0};
    char explicit_braces_array_partial[4] = {'a', 'b'};
    char implicit_braces_array00[] = {'a', 'b', 0, 0};
    char explicit_quotes_string00[4] = "ab\000\000";
    char explicit_quotes_string_partial[4] = "ab";
    char implicit_quotes_string0[] = "ab\000";

The warning from gcc -Wunterminated-string-initialization doesn't say “your code has undefined behavior”. It says “your code has behavior that may be surprising”. I don't see any reason to file a bug with GCC: the warning makes sense. In this case, we intended the behavior. The only thing I might want to change in GCC is to have a well-defined mechanism to say “I intended this”, but that's a very general problem with warnings in C.

One way to avoid the warning would be to use braced initializers instead of string literals. However, here, it would mean that the strings from the TLS specification wouldn't appear in the code, which is not good. If we choose to do that, we should keep the string in a comment. But that comes with a risk of typos in the comment. (Typos in the code would be caught immediately by interoperability tests.) Fortunately we don't add or modify such code often.

gilles-peskine-arm avatar Jan 31 '25 15:01 gilles-peskine-arm

I even pulled out my KR C book, "If its size is fixed, the number of characters in the string, not counting the terminating null character, must not exceed the size of the array". I am actually surprised by all of these being defined, fun little endeavor and I learned a new part of C.

billatarm avatar Jan 31 '25 15:01 billatarm

The warning from gcc -Wunterminated-string-initialization doesn't say “your code has undefined behavior”. It says “your code has behavior that may be surprising”.

I thought this was undefined behavior, but K+R does cover it. https://github.com/Mbed-TLS/mbedtls/issues/9944#issuecomment-2627642854.

I don't see any reason to file a bug with GCC: the warning makes sense. In this case, we intended the behavior.

Yes, I don't disagree that the behavior was intentional (it was commented as such), I'm just thinking if I set the size field, why warn me? as this is an intentional use.

I am trying to deduce one of the following:

  1. Did Fedora add some flag to the build flags?
  • Doesn't seem so, but I also cannot reproduce directly using rpm --eval '%{optflags}' for C flags on Rawhide.
  1. Did GCC introduce this for GCC15?
  • Not sure, need to sleuth.
  1. Did GCC add this to a set of flags?
  • Doesn't seem that -Wall added it.

If it's 2 or 3, I would file a bug with GCC to remove it from the default set as being overly pedantic. If it's 1 then I need to find out why Fedora added it and see if folks can opt out.

billatarm avatar Jan 31 '25 15:01 billatarm

It's (2) and (3). -Wunterminated-string-initialization is new in GCC 15 and included in -Wextra. I don't see it yet in the release notes, but GCC 14.2.0 doesn't have it.

Our CI enforces a clean build with -Werror -Wall -Wextra with a couple of different versions of GCC and Clang, and we normally aim at a clean build with other versions as well. GCC 15 is not yet in scope because it's still unreleased, but it's reasonable to expect that the warning will remain there in the official release, so it would be good for us to fix it.

I don't know what the best fix is. In test code, using braced initializers or putting a null terminator in the test data would be ok. (Although I slightly dislike having test data that happens to be a null-terminated string where a byte string is expected, because that can hide bugs if the code somehow starts using a string function and thus stops working on arbitrary byte strings.) In the TLS code, it's easier to understand the code if the strings from the protocol definition appear in a text search, and the strings are byte strings and we don't particularly want to waste memory and effort to put a null byte at the end but process only the non-null part. This makes a warning-off pragma look attractive, but I'm not sure if that's possible with the way the macros are structured.

gilles-peskine-arm avatar Jan 31 '25 17:01 gilles-peskine-arm

GCC 15 is now released. I don't see -Wunterminated-string-initialization in the release notes, but it seems that it is indeed in the release.

We'll try to silence the warning in our next release. Note that we have a lot of binary data in byte arrays, so having no null byte at the end is definitely intended. The fix may be #pragma GCC diagnostic warning "-Wno-unterminated-string-initialization", or __attribute__((__nonstring__)).

gilles-peskine-arm avatar Apr 29 '25 09:04 gilles-peskine-arm

GCC 15 is now released. I don't see -Wunterminated-string-initialization in the release notes, but it seems that it is indeed in the release.

We'll try to silence the warning in our next release. Note that we have a lot of binary data in byte arrays, so having no null byte at the end is definitely intended. The fix may be #pragma GCC diagnostic warning "-Wno-unterminated-string-initialization", or __attribute__((__nonstring__)).

non-string would be good if you don't have to target non-supporting compilers, is it just GCC like things (gcc clang)? But I guess you can just macro that and pre-process it out on non-supporting compilers.

Another way in pure C without compiler support is to just use the array initializers, ie:

char foo[3] = { 'b', 'a', 'r' };

over

char foo[3] = "";

Annoying for sure....

williamcroberts avatar Apr 30 '25 16:04 williamcroberts

Try using size one more than the literal length

PrasannaBhavan2005 avatar May 02 '25 14:05 PrasannaBhavan2005

Try using size one more than the literal length

Its very intentional. They want a string literal initializer without the NULL byte and so sizeof just works. In some cases it would be trivial to sizeof(x) - 1, but there are some tables and uses that would need proper updating. Either way would work. In this route, the best bet would leave the size portion of the char array off as it superfluous. Ie char foo[] = "bar"; and then adjust sizeof usages.

billatarm avatar May 03 '25 17:05 billatarm

Summary of the remaining steps to fix this issue:

  1. Merge https://github.com/Mbed-TLS/mbedtls/pull/10216
  2. Merge https://github.com/Mbed-TLS/mbedtls-framework/pull/173 (which depends on the above for a clean CI)
  3. Create 2 PRs to mbedtls development and 3.6 that: a. Update the framework pointer to include the above b. Update component_test_gcc15_drivers_opt() (from tests/scripts/components-compiler.sh) to remove -Wno-error=unterminated-string-initialization and the associated comment.

After steps 1 and 2 the actual problems should be fixed. So, if we can get 1 and 2 by tomorrow (3.6.4 code freeze) we'll get most of the benefits (ie users can build 3.6.4 including tests with GCC 15 -Wall -Wextra cleanly), but we'll still need step 3 (possibly after the 3.6.4 code freeze) before we can consider this issue closed.

Cc @ariwo17 @ronald-cron-arm @minosgalanakis

mpg avatar Jun 19 '25 09:06 mpg

Re-opening, we still need non-regression tests.

I suggest we wait for the release sync PRs to be merged, as they will update the pointers. Then we can do PRs for mbedtls-development and 3.6 updating component_test_gcc15_drivers_opt() (from tests/scripts/components-compiler.sh) to remove -Wno-error=unterminated-string-initialization and the associated comment.

mpg avatar Jun 23 '25 07:06 mpg

I suggest we wait for the release sync PRs to be merged, as they will update the pointers. Then we can do PRs for mbedtls-development and 3.6 updating component_test_gcc15_drivers_opt() (from tests/scripts/components-compiler.sh) to remove -Wno-error=unterminated-string-initialization and the associated comment.

@felixc-arm perhaps something you can pick up this week? Hopefully should be very straightforward.

mpg avatar Jun 23 '25 10:06 mpg

Closed by #10244 and #10245

mpg avatar Jul 01 '25 10:07 mpg