Deserializing JSON message with a map of Structs, find() fails on the Struct's fields()
What version of protobuf and what language are you using? Version: v26.0-rc2 Language: C++
What operating system (Linux, Windows, ...) and version? Linux Ubuntu 20
What runtime / compiler are you using (e.g., python version or gcc version)? GCC 9
What did you do?
With a protobuf message defined like the following:
syntax = "proto3";
import "google/protobuf/struct.proto";
package test;
message Content
{
map<string, google.protobuf.Struct> components = 1;
}
And a JSON string that looks like the following:
{
"components": {
"comp1": {
"size": 1234,
"tag": "hello"
},
"comp2": {
"size": 5678,
"tag": "world"
}
}
}
1. Call JsonStringToMessage.
2. Find the comp2 component (doesn't matter if via find() or iteration).
3. Call find() on the google.protobuf.Struct's fields() map to get the tag field's value.
What did you expect to see?
An iterator to the tag field.
What did you see instead?
std::end().
UPDATE 2024-02-21 (v26.0-rc2): Inconsistently!
Note that if step 3 is done via iteration of the fields in a for loop, then tag is iterated correctly.
Additional Information
- Reproduced on v25.0, v24.4 and v22.5 as well.
- Does not reproduce on v21.12.
- Does not reproduce on Windows using VS2019 in any of the aforementioned versions.
- Compiled using C++ 17 standard.
Code Example
Attached is a short C++ sample that illustrates the problem:
#include <iostream>
#include <string>
#include <google/protobuf/util/json_util.h>
#include "test.pb.h"
using namespace google::protobuf::util;
static const std::string content_json = R"(
{
"components": {
"comp1": {
"size": 16216459,
"tag": "5f8e2a5b91b516f8eda6ef0ca0fd07bf96d3029bdf2ba9281e220a18cd542fc1"
},
"comp2": {
"size": 22816265,
"tag": "917b2d0d83ef8aa3ebcd169b53a0b6e9c72fee779785fae4f63cdec5694d5e47"
}
}
}
)";
int main()
{
    // Convert JSON to protobuf.
    JsonParseOptions options;
    test::Content content;
    absl::Status status = JsonStringToMessage(content_json, &content, options);
    if (!status.ok())
    {
        // Stop early on a parse failure instead of inspecting a partial message.
        std::cout << "JSON parse failed: " << status.ToString() << std::endl;
        return -1;
    }
    // Get the fields of the 'comp2'.
    const auto component_iter = content.components().find("comp2");
    if (component_iter == content.components().end())
    {
        // Guard the dereference below.
        std::cout << "No comp2 via find()" << std::endl;
        return -1;
    }
    const auto& fields = component_iter->second.fields();
    std::cout << "Debug string: " << std::endl;
    std::cout << content.DebugString() << std::endl;
    std::cout << "Field count: " << fields.size() << std::endl;
    for (const auto& field : fields)
    {
        std::cout << "Key name: '" << field.first << "'" << std::endl;
    }
    // Find the 'tag' field of the component using the 'find' method.
    const auto tag_iter = fields.find("tag");
    if (tag_iter == std::end(fields))
    {
        std::cout << "No tag via find()" << std::endl;
        return -1;
    }
    std::cout << "Tag debug string: " << tag_iter->second.DebugString() << std::endl;
    return 0;
}
Program output:
Debug string:
components {
key: "comp1"
value {
fields {
key: "size"
value {
number_value: 16216459
}
}
fields {
key: "tag"
value {
string_value: "5f8e2a5b91b516f8eda6ef0ca0fd07bf96d3029bdf2ba9281e220a18cd542fc1"
}
}
}
}
components {
key: "comp2"
value {
fields {
key: "size"
value {
number_value: 22816265
}
}
fields {
key: "tag"
value {
string_value: "917b2d0d83ef8aa3ebcd169b53a0b6e9c72fee779785fae4f63cdec5694d5e47"
}
}
}
}
Field count: 2
Key name: 'size'
Key name: 'tag'
No tag via find()
@esrauchg please take a look
@gazar Do you mind narrowing down the bug report here? Is 'fields' zero length? Is there an entry for 'tag' but it doesn't have a string value or has an empty string?
I ran your repro code internally, and the listed behavior doesn't reproduce: it does find the tag and does not hit the "No tag via find" conditional. You may also find something interesting here if you remove 'ignore_unknown_fields' and see whether that gives you a parse failure from, e.g., stale gencode (though it's hard to see why that would be the case here).
Perhaps you can update your case, print the content.DebugString() to see what it says?
@esrauchg thank you for looking into this.
I've updated the example to shorten it and added more prints. I've managed to reproduce the issue again, on v26.0-rc2. However, this time the issue reproduces inconsistently. The following two prints sometimes change order, but the issue may reproduce regardless of their order:
Key name: 'size'
Key name: 'tag'
We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please add a comment.
This issue is labeled inactive because the last activity was over 90 days ago.
Still waiting for someone to have a closer look at the issue.
I tried your repro snippet and don't see the issue reproduce in internal main tip.
Can you clarify further the environment in which you get the issue? If you're only able to reproduce this under one specific Ubuntu configuration, then I suspect this is most likely a case of the system having some vendored protobuf package installed that is being linked instead of your intended protobuf version (where the latter is the one used for the .pb.h generation); C++ Protobuf requires an exact version match between protoc and the linked protobuf runtime (even 'minor' version releases are not guaranteed to be skew-safe between generated code and runtime).
@esrauchg thank you for taking the time to look into this further.
- The issue reproduces inconsistently - I'm still getting results similar to my comment from Feb 21st.
- The issue reproduces also in version v27.0-rc3.
- The executable uses the correct protobuf shared libraries:
# ldd -d bin/test
linux-vdso.so.1 (0x00007fff9d1b9000)
libprotobuf.so.27.0.3 => /home/gazar/.conan/data/protobuf/27.0-rc3+2/pan/dev/package/4dd49b453a8d01f6f5acb40cd2c983d832a6da45/lib/libprotobuf.so.27.0.3 (0x00007fe22d2d4000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fe22d2a7000)
libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fe22d0c5000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fe22cf76000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fe22cf5b000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fe22cd67000)
/lib64/ld-linux-x86-64.so.2 (0x00007fe22d70b000)
- I'm using the correct protoc executable to regenerate the C++ files from the .proto file:
/home/gazar/.conan/data/protobuf/27.0-rc3+2/pan/dev/package/4dd49b453a8d01f6f5acb40cd2c983d832a6da45/bin/protoc --cpp_out=. test.proto
- Note the use of the older GCC 9 compiler on Ubuntu 20.04. I'm using WSL but the issue reproduced on non-WSL test VMs.
- I've built both protobuf and the test executable with the _GLIBCXX_USE_CXX11_ABI=0 flag.
Can you provide the specific gcc version number that you're seeing the issue on? Thanks!
$ g++-9 --version
g++-9 (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0