protobuf icon indicating copy to clipboard operation
protobuf copied to clipboard

Serialization & Deserialization not working for unititialized bool

Open psypdt opened this issue 3 years ago • 3 comments

What version of protobuf and what language are you using? Version: 3.19.4.0 Language: C++

What operating system (Linux, Windows, ...) and version? Linux Ubuntu 20.04

What runtime / compiler are you using (e.g., python version or gcc version)

  • g++ 10.2.0
  • I've also tried g++ 9.4.0 giving me the same result

What did you do?

I tried to serialize a protobuf message as a string and then and deserialize back to the original message. All my files compile and link correctly.

I have the following proto file (named fancy_messages.proto):

syntax = "proto3";

message MainMessage {
    uint64 num = 1;
    SubMessage sub_message = 2;
}

message SubMessage {
    bool my_bool = 3;
}

My Cpp file looks as follows:

#include <fancy_messages.pb.h>
#include <iostream>

int main(int argc, char const* argv[])
{
    class Test
    {
      public:
        bool m_isGood;
    };

    MainMessage message;
    message.set_allocated_sub_message(new SubMessage);

    Test* array = new Test[100];

    for (int i = 0; i < 100; i++)
    {
        std::string serialized;

        // Set data and serialize to string
        message.set_num(20);
        message.mutable_sub_message()->set_my_bool(array[i].m_isGood);

        message.SerializeToString(&serialized);

        MainMessage reparsedMessage;

        std::cout << "m_isGood: " << array[i].m_isGood << std::endl;

        // Try to deserialize into new message
        if (!reparsedMessage.ParseFromString(serialized))
        {
            std::cout << "Failed to reparse message" << std::endl;
            break;
        }
    }
    delete[] array;

    return 0;
}

What did you expect to see

I expected to not see the "Failed to reparse message" when reparsing the message, and successfully serialize and reserialize the message.

What did you see instead?

I saw the error message and the deserialization failed to parse the string back into a MainMessage.

I noticed that when I print the value of m_isGood and its below 128 then I don't get the error and the serialization works. It looks like the encoding for bools in protobuf (set to anything above 127) doesn't play nice when set to some uninitialized value, causing the serialization to fail and hence the deserialization. I also noticed that if I have the Test instance on the stack then it works fine (im guessing because m_isGood is intialized).

Anything else we should know about your project / environment

I compile with CMake for release with optimizations. When I compile for debug I dont have the issue. My guess is that the evaluation of the input is optimized away causing the uninitialized value to be written directly instead of being evaluated.

psypdt avatar Oct 14 '22 09:10 psypdt

Triage update: We are unlikely to have capacity to address this soon, but this seems worth fixing and we would welcome community contribution.

zhangskz avatar Oct 19 '22 16:10 zhangskz

I've never really tried before, but I'll give it a shot and see if I can come up with a solution :)

psypdt avatar Oct 19 '22 18:10 psypdt

I have run the code but I did not get the error. Everything same as what mentioned except I have used protobuf version 3.21.9. Is the problem still valid ? @psypdt @zhangskz

enum-class avatar Oct 27 '22 06:10 enum-class

I too encountered the same issue.

And I noticed discrepancies between debug and release binary builds: debug versions always serialize bools as 0x00 or 0x01, while release versions will serialize bool to the value of the bool-byte, for example 0x80/0xc0/0xf2/0xfe..

This is likely because protobuf's protocol for bools is same to varints, where a leading 1 in a varint indicates more data. Serializing a bool as 0x80 thus leads to deserialization failure.

In the release mode. I test some implementations of serialization. and all bool-byte is equal to 0x80(128)

Current implementation, pb version 3.9.x

//wire_format_lite.h
 inline uint8* WireFormatLite::WriteBoolNoTagToArray(bool value, uint8* target) {
  return io::CodedOutputStream::WriteVarint32ToArray(value ? 1 : 0, target);
}

//io/coded_stream.h
inline uint8* CodedOutputStream::WriteVarint32ToArray(uint32 value,
                                                        uint8* target) {
  while (value >= 0x80) {
    *target = static_cast<uint8>(value | 0x80);
    value >>= 7;
    ++target;
  }
  *target = static_cast<uint8>(value);
  return target + 1; 
}

The expression 'value ? 1 : 0' is likely to be optimized by the compiler, and i printf the value in WriteVarint32ToArray, value is 128. but (value >= 0x80) is FALSE.

I try use volatile rewrite WriteBoolNoTagToArray:

inline uint8* WireFormatLite::WriteBoolNoTagToArray(bool value, uint8* target) {
  volatile int bool_value = value ? 1 : 0; 
  return io::CodedOutputStream::WriteVarint32ToArray(bool_value , target);
}

In WriteVarint32ToArray , the value is still equal to 128(0x08),but now "(value >= 0x80) " is TRUE and Throw Exception:

[libprotobuf FATAL google/protobuf/message_lite.cc:87] CHECK failed: (bytes_produced_by_serialization) == (byte_size_before_serialization): Byte size calculation and serialization were inconsistent.  This may indicate a bug in protocol buffers or it may be caused by concurrent modification of xxxx.
terminate called after throwing an instance of 'google::protobuf::FatalException'

I try to rewrite the WriteBoolNoTagToArray and write a new func io::CodedOutputStream::WriteBoolToArray

//wire_format_lite.h
inline uint8* WireFormatLite::WriteBoolNoTagToArray(bool value, uint8* target) {
  return io::CodedOutputStream::WriteBoolToArray(value, target);
}
 
//io/coded_stream.h
inline uint8* CodedOutputStream::WriteBoolToArray(bool value,
                                                      uint8* target) {
  if (value) {
    *target = static_cast<uint8>(1);
  } else {
    *target = static_cast<uint8>(0);
  }
  return target + 1; 
}

It works! And "*target" is 0x01, not 0x80.

Karkrand avatar Mar 08 '24 07:03 Karkrand