
Deserializing JSON message with a map of Structs, find() fails on the Struct's fields()

Open gazar opened this issue 2 years ago • 10 comments

What version of protobuf and what language are you using? Version: v26.0-rc2 Language: C++

What operating system (Linux, Windows, ...) and version? Linux Ubuntu 20

What runtime / compiler are you using (e.g., python version or gcc version)? GCC 9

What did you do? With a protobuf message defined like the following:

syntax = "proto3";
import "google/protobuf/struct.proto";

package test;

message Content
{
    map<string, google.protobuf.Struct> components = 1;
}

And a JSON string that looks like the following:

{
    "components": {
      "comp1": {
        "size": 1234,
        "tag": "hello"
      },
      "comp2": {
        "size": 5678,
        "tag": "world"
      }
    }
}
  1. Call JsonStringToMessage.
  2. Find the comp2 component (doesn't matter if via find() or iteration).
  3. Call find() on the google.protobuf.Struct's fields() map to get the tag field's value.

What did you expect to see? An iterator to the tag field.

What did you see instead? std::end(fields), i.e. find() reports the key as missing. UPDATE 2024-02-21 (v26.0-rc2): the issue reproduces inconsistently!

Note that if step 3 is done via iteration of the fields in a for loop, then tag is iterated correctly.

Additional Information

  • Reproduced on v25.0, v24.4 and v22.5 as well.
  • Does not reproduce on v21.12.
  • Does not reproduce on Windows using VS2019 in any of the aforementioned versions.
  • Compiled using C++ 17 standard.

Code Example

Attached is a short C++ test program that illustrates the problem:

#include <iostream>
#include <string>
#include <google/protobuf/util/json_util.h>
#include "test.pb.h"

using namespace google::protobuf::util;

static const std::string content_json = R"(
{
    "components": {
      "comp1": {
        "size": 16216459,
        "tag": "5f8e2a5b91b516f8eda6ef0ca0fd07bf96d3029bdf2ba9281e220a18cd542fc1"
      },
      "comp2": {
        "size": 22816265,
        "tag": "917b2d0d83ef8aa3ebcd169b53a0b6e9c72fee779785fae4f63cdec5694d5e47"
      }
    }
}
)";

int main()
{
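    // Convert JSON to protobuf and stop early if parsing fails.
    JsonParseOptions options;
    test::Content content;
    absl::Status status = JsonStringToMessage(content_json, &content, options);
    if (!status.ok())
    {
        std::cout << "JSON parse failed: " << status.ToString() << std::endl;
        return -1;
    }

    // Get the fields of the 'comp2' component; guard against a failed lookup
    // before dereferencing the iterator.
    const auto component_iter = content.components().find("comp2");
    if (component_iter == content.components().end())
    {
        std::cout << "No comp2 via find()" << std::endl;
        return -1;
    }
    const auto& fields = component_iter->second.fields();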

    std::cout << "Debug string: " << std::endl;
    std::cout << content.DebugString() << std::endl;
    std::cout << "Field count: " << fields.size() << std::endl;
    for (const auto& field : fields)
    {
        std::cout << "Key name: '" << field.first << "'" << std::endl;
    }

    // Find the 'tag' field of the component using the 'find' method.
    const auto tag_iter = fields.find("tag");
    if (tag_iter == std::end(fields))
    {
        std::cout << "No tag via find()" << std::endl;
        return -1;
    }

    std::cout << "Tag debug string: " << tag_iter->second.DebugString() << std::endl;

    return 0;
}

Program output:

Debug string: 
components {
  key: "comp1"
  value {
    fields {
      key: "size"
      value {
        number_value: 16216459
      }
    }
    fields {
      key: "tag"
      value {
        string_value: "5f8e2a5b91b516f8eda6ef0ca0fd07bf96d3029bdf2ba9281e220a18cd542fc1"
      }
    }
  }
}
components {
  key: "comp2"
  value {
    fields {
      key: "size"
      value {
        number_value: 22816265
      }
    }
    fields {
      key: "tag"
      value {
        string_value: "917b2d0d83ef8aa3ebcd169b53a0b6e9c72fee779785fae4f63cdec5694d5e47"
      }
    }
  }
}

Field count: 2
Key name: 'size'
Key name: 'tag'
No tag via find()

gazar avatar Dec 13 '23 13:12 gazar

@esrauchg please take a look

googleberg avatar Feb 17 '24 21:02 googleberg

@gazar Do you mind narrowing down the bug report here? Is 'fields' zero length? Is there an entry for 'tag' but it doesn't have a string value or has an empty string?

I ran your repro code internally and the listed behavior doesn't reproduce: it does find the tag and does not hit the "No tag via find" conditional. You may also find something interesting here if you remove the 'ignore_unknown_fields' option and see whether that gives you a parse failure from e.g. stale gencode or something (though it's hard to see why that would be the case here).

Perhaps you can update your case to print content.DebugString() and see what it says?

esrauchg avatar Feb 20 '24 14:02 esrauchg

@esrauchg thank you for looking into this.

I've updated the example to shorten it and added more prints. I've managed to reproduce the issue again, on v26.0-rc2. However, this time the issue reproduces inconsistently. The following two prints sometimes change order, but the issue may reproduce regardless of their order:

Key name: 'size'
Key name: 'tag'

gazar avatar Feb 21 '24 10:02 gazar

We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please add a comment.

This issue is labeled inactive because the last activity was over 90 days ago.

github-actions[bot] avatar May 22 '24 10:05 github-actions[bot]

Still waiting for someone to have a closer look at the issue.

gazar avatar May 22 '24 12:05 gazar

I tried your repro snippet and don't see the issue reproduce at our internal main tip.

Can you clarify further the environment in which you get the issue? If you're only able to reproduce this under one specific Ubuntu configuration, then I suspect the most likely cause is that the system has some vendored protobuf package installed that is being linked instead of your intended protobuf version (the latter being the one used for the .pb.h generation). C++ Protobuf requires an exact version match between protoc and the linked proto runtime (even 'minor' version releases are not guaranteed to be skew-safe between generated code and runtime).

esrauchg avatar May 22 '24 13:05 esrauchg

@esrauchg thank you for taking the time to look into this further.

  1. The issue reproduces inconsistently - I'm still getting results similar to my comment from Feb 21st.
  2. The issue also reproduces in version v27.0-rc3.
  3. The executable uses the correct protobuf shared libraries:
# ldd -d bin/test
        linux-vdso.so.1 (0x00007fff9d1b9000)
        libprotobuf.so.27.0.3 => /home/gazar/.conan/data/protobuf/27.0-rc3+2/pan/dev/package/4dd49b453a8d01f6f5acb40cd2c983d832a6da45/lib/libprotobuf.so.27.0.3 (0x00007fe22d2d4000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fe22d2a7000)
        libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fe22d0c5000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fe22cf76000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fe22cf5b000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fe22cd67000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fe22d70b000)
  4. I'm using the correct protoc executable to regenerate the C++ files from the .proto file: /home/gazar/.conan/data/protobuf/27.0-rc3+2/pan/dev/package/4dd49b453a8d01f6f5acb40cd2c983d832a6da45/bin/protoc --cpp_out=. test.proto
  5. Note the use of the older GCC 9 compiler on Ubuntu 20.04. I'm using WSL but the issue reproduced on non-WSL test VMs.
  6. I've built both protobuf and the test executable with the _GLIBCXX_USE_CXX11_ABI=0 flag.
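For anyone trying to reproduce this setup, here is a small sketch for confirming which libstdc++ string ABI a given translation unit was actually compiled with, since the _GLIBCXX_USE_CXX11_ABI=0 setting mentioned above only helps if every object file agrees on it. DescribeLibstdcxxAbi is a hypothetical helper name; the only assumption is that libstdc++ defines the _GLIBCXX_USE_CXX11_ABI macro once any standard header has been included.

```cpp
#include <string>

// Hypothetical helper: report the libstdc++ dual-ABI setting that this
// translation unit was compiled with.
std::string DescribeLibstdcxxAbi()
{
#if defined(_GLIBCXX_USE_CXX11_ABI)
    // libstdc++ defines the macro as 0 (old ABI) or 1 (C++11 ABI).
    return "_GLIBCXX_USE_CXX11_ABI=" + std::to_string(_GLIBCXX_USE_CXX11_ABI);
#else
    // Other standard libraries (for example libc++) do not define it.
    return "_GLIBCXX_USE_CXX11_ABI is not defined (probably not libstdc++)";
#endif
}
```

Printing this value from the test executable and from a small program built with the same flags as the protobuf library should give the same result; a difference would mean the two sides disagree on the std::string ABI.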

gazar avatar May 22 '24 16:05 gazar

Can you provide the specific gcc version number that you're seeing the issue on? Thanks!

esrauchg avatar May 22 '24 16:05 esrauchg

$ g++-9 --version
g++-9 (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0

gazar avatar May 22 '24 16:05 gazar