fmt icon indicating copy to clipboard operation
fmt copied to clipboard

GCC 14 optimizer doesn't like std::filesystem::path formatting (Wstringop-overflow)

Open Blzut3 opened this issue 1 year ago • 5 comments

Starting with 11.0.2, when using GCC 14, particularly with link time optimization but also reproducible in some cases without, it seems that formatting std::filesystem::path (i.e. fmt::to_string(std::filesystem::path {...})) produces stringop-overflow warnings. While this can be reproduced by just calling fmt::to_string<std::filesystem::path> in exploring I found that explicitly instantiating two particular templates seems to be enough to trigger it (one or the other is not enough). Using O3 optimizations seems to be required for this to trigger.

https://godbolt.org/z/Gsvoqz71e

I've traced the change causing this to commit f29a7e7970349eec2c76f3c0eda3d5accdd3aa6b. Specifically just the removal of the copy overload there. Reverting that overload removal is one possible workaround.

Despite the functions that are instantiated there, the actual function in question seems to be for_each_codepoint. Increasing the size of buf by up to 8 characters eliminates the warnings. I've also had success in getting rid of them by storing the start and end pointers for the string instead and removing the if (s.size() >= block_size) check to simplify the flow (might have needed an assume somewhere in there too). Ultimately this feels like a false positive to me, and I can report to GCC if agreed.

I've tried various combinations of [[assume()]] without changing the code structure with no avail, so GCC seems particularly stubborn about this one.

In file included from /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:41,
                 from /opt/compiler-explorer/libs/fmt/trunk/include/fmt/std.h:11,
                 from <source>:1:
In function 'constexpr OutputIt fmt::v11::detail::copy(InputIt, InputIt, OutputIt) [with T = char; InputIt = const char*; OutputIt = char*; typename std::enable_if<(! is_back_insert_iterator<OutputIt>::value), int>::type <anonymous> = 0]',
    inlined from 'constexpr void fmt::v11::detail::for_each_codepoint(fmt::v11::string_view, F) [with F = find_escape(const char*, const char*)::<lambda(uint32_t, fmt::v11::string_view)>]' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:678:15,
    inlined from 'fmt::v11::detail::find_escape_result<char> fmt::v11::detail::find_escape(const char*, const char*)' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:1849:21,
    inlined from 'OutputIt fmt::v11::detail::write_escaped_string(OutputIt, fmt::v11::basic_string_view<Char>) [with Char = char; OutputIt = fmt::v11::basic_appender<char>]' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:1941:30:
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/base.h:1220:31: warning: writing 8 bytes into a region of size 7 [-Wstringop-overflow=]
 1220 |   while (begin != end) *out++ = static_cast<T>(*begin++);
      |                        ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h: In function 'OutputIt fmt::v11::detail::write_escaped_string(OutputIt, fmt::v11::basic_string_view<Char>) [with Char = char; OutputIt = fmt::v11::basic_appender<char>]':
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: destination object 'buf' of size 7
  677 |     char buf[2 * block_size - 1] = {};
      |          ^~~
In function 'constexpr OutputIt fmt::v11::detail::copy(InputIt, InputIt, OutputIt) [with T = char; InputIt = const char*; OutputIt = char*; typename std::enable_if<(! is_back_insert_iterator<OutputIt>::value), int>::type <anonymous> = 0]',
    inlined from 'constexpr void fmt::v11::detail::for_each_codepoint(fmt::v11::string_view, F) [with F = find_escape(const char*, const char*)::<lambda(uint32_t, fmt::v11::string_view)>]' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:678:15,
    inlined from 'fmt::v11::detail::find_escape_result<char> fmt::v11::detail::find_escape(const char*, const char*)' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:1849:21,
    inlined from 'OutputIt fmt::v11::detail::write_escaped_string(OutputIt, fmt::v11::basic_string_view<Char>) [with Char = char; OutputIt = fmt::v11::basic_appender<char>]' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:1941:30:
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/base.h:1220:31: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
 1220 |   while (begin != end) *out++ = static_cast<T>(*begin++);
      |                        ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h: In function 'OutputIt fmt::v11::detail::write_escaped_string(OutputIt, fmt::v11::basic_string_view<Char>) [with Char = char; OutputIt = fmt::v11::basic_appender<char>]':
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset 16 into destination object 'buf' of size 7
  677 |     char buf[2 * block_size - 1] = {};
      |          ^~~
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: destination object 'buf' of size 7
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset 16 into destination object 'buf' of size 7
In function 'constexpr OutputIt fmt::v11::detail::copy(InputIt, InputIt, OutputIt) [with T = char; InputIt = const char*; OutputIt = char*; typename std::enable_if<(! is_back_insert_iterator<OutputIt>::value), int>::type <anonymous> = 0]',
    inlined from 'constexpr void fmt::v11::detail::for_each_codepoint(fmt::v11::string_view, F) [with F = find_escape(const char*, const char*)::<lambda(uint32_t, fmt::v11::string_view)>]' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:678:15,
    inlined from 'fmt::v11::detail::find_escape_result<char> fmt::v11::detail::find_escape(const char*, const char*)' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:1849:21,
    inlined from 'OutputIt fmt::v11::detail::write_escaped_string(OutputIt, fmt::v11::basic_string_view<Char>) [with Char = char; OutputIt = fmt::v11::basic_appender<char>]' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:1941:30:
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/base.h:1220:31: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
 1220 |   while (begin != end) *out++ = static_cast<T>(*begin++);
      |                        ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h: In function 'OutputIt fmt::v11::detail::write_escaped_string(OutputIt, fmt::v11::basic_string_view<Char>) [with Char = char; OutputIt = fmt::v11::basic_appender<char>]':
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset 17 into destination object 'buf' of size 7
  677 |     char buf[2 * block_size - 1] = {};
      |          ^~~
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset [1, 7] into destination object 'buf' of size 7
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset 17 into destination object 'buf' of size 7
In function 'constexpr OutputIt fmt::v11::detail::copy(InputIt, InputIt, OutputIt) [with T = char; InputIt = const char*; OutputIt = char*; typename std::enable_if<(! is_back_insert_iterator<OutputIt>::value), int>::type <anonymous> = 0]',
    inlined from 'constexpr void fmt::v11::detail::for_each_codepoint(fmt::v11::string_view, F) [with F = find_escape(const char*, const char*)::<lambda(uint32_t, fmt::v11::string_view)>]' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:678:15,
    inlined from 'fmt::v11::detail::find_escape_result<char> fmt::v11::detail::find_escape(const char*, const char*)' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:1849:21,
    inlined from 'OutputIt fmt::v11::detail::write_escaped_string(OutputIt, fmt::v11::basic_string_view<Char>) [with Char = char; OutputIt = fmt::v11::basic_appender<char>]' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:1941:30:
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/base.h:1220:31: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
 1220 |   while (begin != end) *out++ = static_cast<T>(*begin++);
      |                        ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h: In function 'OutputIt fmt::v11::detail::write_escaped_string(OutputIt, fmt::v11::basic_string_view<Char>) [with Char = char; OutputIt = fmt::v11::basic_appender<char>]':
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset 18 into destination object 'buf' of size 7
  677 |     char buf[2 * block_size - 1] = {};
      |          ^~~
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset [2, 7] into destination object 'buf' of size 7
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset 18 into destination object 'buf' of size 7
In function 'constexpr OutputIt fmt::v11::detail::copy(InputIt, InputIt, OutputIt) [with T = char; InputIt = const char*; OutputIt = char*; typename std::enable_if<(! is_back_insert_iterator<OutputIt>::value), int>::type <anonymous> = 0]',
    inlined from 'constexpr void fmt::v11::detail::for_each_codepoint(fmt::v11::string_view, F) [with F = find_escape(const char*, const char*)::<lambda(uint32_t, fmt::v11::string_view)>]' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:678:15,
    inlined from 'fmt::v11::detail::find_escape_result<char> fmt::v11::detail::find_escape(const char*, const char*)' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:1849:21,
    inlined from 'OutputIt fmt::v11::detail::write_escaped_string(OutputIt, fmt::v11::basic_string_view<Char>) [with Char = char; OutputIt = fmt::v11::basic_appender<char>]' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:1941:30:
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/base.h:1220:31: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
 1220 |   while (begin != end) *out++ = static_cast<T>(*begin++);
      |                        ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h: In function 'OutputIt fmt::v11::detail::write_escaped_string(OutputIt, fmt::v11::basic_string_view<Char>) [with Char = char; OutputIt = fmt::v11::basic_appender<char>]':
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset 19 into destination object 'buf' of size 7
  677 |     char buf[2 * block_size - 1] = {};
      |          ^~~
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset [3, 7] into destination object 'buf' of size 7
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset 19 into destination object 'buf' of size 7
In function 'constexpr OutputIt fmt::v11::detail::copy(InputIt, InputIt, OutputIt) [with T = char; InputIt = const char*; OutputIt = char*; typename std::enable_if<(! is_back_insert_iterator<OutputIt>::value), int>::type <anonymous> = 0]',
    inlined from 'constexpr void fmt::v11::detail::for_each_codepoint(fmt::v11::string_view, F) [with F = find_escape(const char*, const char*)::<lambda(uint32_t, fmt::v11::string_view)>]' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:678:15,
    inlined from 'fmt::v11::detail::find_escape_result<char> fmt::v11::detail::find_escape(const char*, const char*)' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:1849:21,
    inlined from 'OutputIt fmt::v11::detail::write_escaped_string(OutputIt, fmt::v11::basic_string_view<Char>) [with Char = char; OutputIt = fmt::v11::basic_appender<char>]' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:1941:30:
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/base.h:1220:31: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
 1220 |   while (begin != end) *out++ = static_cast<T>(*begin++);
      |                        ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h: In function 'OutputIt fmt::v11::detail::write_escaped_string(OutputIt, fmt::v11::basic_string_view<Char>) [with Char = char; OutputIt = fmt::v11::basic_appender<char>]':
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset 20 into destination object 'buf' of size 7
  677 |     char buf[2 * block_size - 1] = {};
      |          ^~~
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset [4, 7] into destination object 'buf' of size 7
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset 20 into destination object 'buf' of size 7
In function 'constexpr OutputIt fmt::v11::detail::copy(InputIt, InputIt, OutputIt) [with T = char; InputIt = const char*; OutputIt = char*; typename std::enable_if<(! is_back_insert_iterator<OutputIt>::value), int>::type <anonymous> = 0]',
    inlined from 'constexpr void fmt::v11::detail::for_each_codepoint(fmt::v11::string_view, F) [with F = find_escape(const char*, const char*)::<lambda(uint32_t, fmt::v11::string_view)>]' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:678:15,
    inlined from 'fmt::v11::detail::find_escape_result<char> fmt::v11::detail::find_escape(const char*, const char*)' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:1849:21,
    inlined from 'OutputIt fmt::v11::detail::write_escaped_string(OutputIt, fmt::v11::basic_string_view<Char>) [with Char = char; OutputIt = fmt::v11::basic_appender<char>]' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:1941:30:
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/base.h:1220:31: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
 1220 |   while (begin != end) *out++ = static_cast<T>(*begin++);
      |                        ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h: In function 'OutputIt fmt::v11::detail::write_escaped_string(OutputIt, fmt::v11::basic_string_view<Char>) [with Char = char; OutputIt = fmt::v11::basic_appender<char>]':
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset 21 into destination object 'buf' of size 7
  677 |     char buf[2 * block_size - 1] = {};
      |          ^~~
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset [5, 7] into destination object 'buf' of size 7
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset 21 into destination object 'buf' of size 7
In function 'constexpr OutputIt fmt::v11::detail::copy(InputIt, InputIt, OutputIt) [with T = char; InputIt = const char*; OutputIt = char*; typename std::enable_if<(! is_back_insert_iterator<OutputIt>::value), int>::type <anonymous> = 0]',
    inlined from 'constexpr void fmt::v11::detail::for_each_codepoint(fmt::v11::string_view, F) [with F = find_escape(const char*, const char*)::<lambda(uint32_t, fmt::v11::string_view)>]' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:678:15,
    inlined from 'fmt::v11::detail::find_escape_result<char> fmt::v11::detail::find_escape(const char*, const char*)' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:1849:21,
    inlined from 'OutputIt fmt::v11::detail::write_escaped_string(OutputIt, fmt::v11::basic_string_view<Char>) [with Char = char; OutputIt = fmt::v11::basic_appender<char>]' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:1941:30:
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/base.h:1220:31: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
 1220 |   while (begin != end) *out++ = static_cast<T>(*begin++);
      |                        ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h: In function 'OutputIt fmt::v11::detail::write_escaped_string(OutputIt, fmt::v11::basic_string_view<Char>) [with Char = char; OutputIt = fmt::v11::basic_appender<char>]':
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset 22 into destination object 'buf' of size 7
  677 |     char buf[2 * block_size - 1] = {};
      |          ^~~
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset [6, 7] into destination object 'buf' of size 7
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset 22 into destination object 'buf' of size 7

Blzut3 avatar Aug 19 '24 22:08 Blzut3

I'm trying to upgrade my project to fmt 11 and noticed the same gcc warning

macdew avatar Aug 21 '24 08:08 macdew

Looks like a compiler bug

sunmy2019 avatar Aug 23 '24 14:08 sunmy2019

I investigated it.

The difference here is GCC14 does one more inline. You can mimic the GCC14 behavior in GCC13 by adding [[gnu::always_inline]] attribute to write_escaped_cp function.

With always inlining write_escaped_cp function, this problem dates back to GCC11.

sunmy2019 avatar Aug 24 '24 07:08 sunmy2019

So I inlined all those functions manually. Found it keeps warning the following code snippets.

https://godbolt.org/z/Y3E5fh9G6

   if (num_chars_left > 0 && num_chars_left < block_size) {

      char buf[2 * block_size - 1] = {};

      {
        auto _begin = p;
        auto _out = buf;

        while (_begin < p + num_chars_left) *_out++ = *_begin++; // warns here.
      }
Full Code Snippet

#include <fmt/format.h>

namespace fmt {
inline namespace v11 {
namespace detail {

auto write_escaped_string(basic_appender<char> out,
                          const basic_string_view<char> str) {
  *out++ = '"';

  const auto str_data = str.data();
  const auto str_size = str.size();

  auto begin = str_data, end = str_data + str_size;

  do {
    find_escape_result<char> escape;

    auto decode = [&](const char* buf_ptr, const char* ptr) -> const char* {
      auto cp = uint32_t();
      auto error = 0;
      auto end = utf8_decode(buf_ptr, &cp, &error);

      if (needs_escape(error ? invalid_code_point : cp)) {
        auto sv = string_view(ptr, error ? 1 : to_unsigned(end - buf_ptr));
        escape = {sv.begin(), sv.end(), cp};
        return nullptr;
      }
      return error ? buf_ptr + 1 : end;
    };

    auto p = begin;
    const size_t block_size = 4;  // utf8_decode always reads blocks of 4 chars.

    if (p <= end - block_size) {
      for (; p <= end - block_size;) {
        p = decode(p, p);
        if (!p) goto escape_found;
      }
    }

    // invariant:
    // 1. p >= end - block_size + 1
    // 2. end - p < block_size

    if (const auto num_chars_left = end - p;
        num_chars_left > 0 && num_chars_left < block_size) {
      // invariant: num_chars_left < block_size

      char buf[2 * block_size - 1] = {};

      {
        auto _begin = p;
        auto _out = buf;

        while (_begin < p + num_chars_left) *_out++ = *_begin++;
      }

      const char* buf_ptr = buf;
      do {
        auto end2 = decode(buf_ptr, p);
        if (!end2) goto escape_found;
        p += end2 - buf_ptr;
        buf_ptr = end2;
      } while (buf_ptr - buf < num_chars_left);
    }

  escape_found:

    out = copy<char>(begin, escape.begin, out);

    begin = escape.end;

    if (!begin) break;

    auto c = escape.cp;
    switch (escape.cp) {
    case '\n':
      *out++ = '\\';
      c = 'n';
      break;
    case '\r':
      *out++ = '\\';
      c = 'r';
      break;
    case '\t':
      *out++ = '\\';
      c = 't';
      break;
    case '"':
      FMT_FALLTHROUGH;
    case '\'':
      FMT_FALLTHROUGH;
    case '\\':
      *out++ = '\\';
      break;
    default:
      if (escape.cp < 0x100) {
        out = write_codepoint<2, char>(out, 'x', escape.cp);
        continue;
      }
      if (escape.cp < 0x10000) {
        out = write_codepoint<4, char>(out, 'u', escape.cp);
        continue;
      }
      if (escape.cp < 0x110000) {
        out = write_codepoint<8, char>(out, 'U', escape.cp);
        continue;
      }

      for (char escape_char : basic_string_view<char>(
               escape.begin, to_unsigned(escape.end - escape.begin))) {
        out = write_codepoint<2, char>(
            out, 'x', static_cast<uint32_t>(escape_char) & 0xFF);
      }
      continue;
    }
    *out++ = c;

  } while (begin != end);

  *out++ = '"';
  return out;
}

}  // namespace detail
}  // namespace v11
}  // namespace fmt

sunmy2019 avatar Aug 24 '24 08:08 sunmy2019

I carefully reads through the logic and decide it should be a false positive. Maybe we will need a hack to suppress that compiler warning.

sunmy2019 avatar Aug 24 '24 08:08 sunmy2019

It does look like false positive so might be worth reporting to gcc but I applied a workaround in https://github.com/fmtlib/fmt/commit/0379bf3a5d52d8542aec1874677c9df5ff9ba5f9.

vitaut avatar Aug 24 '24 15:08 vitaut

It does look like false positive so might be worth reporting to gcc but I applied a workaround in 0379bf3.

Yeah, but it's a temporary workaround. The same warning happens if write_escaped_cp is always inlined.

sunmy2019 avatar Aug 25 '24 14:08 sunmy2019

The same warning happens if write_escaped_cp is always inlined.

If you have a better solution, a PR would be welcome.

vitaut avatar Aug 28 '24 13:08 vitaut