
Null character when serializeToJsonString is called on message

Open zajca opened this issue 2 years ago • 2 comments

I found some odd behavior: when I serialize a message to JSON, there is a null character (\u0000) at the end of the output. I can't figure out why this is happening. I send the result to Temporal and deserialize it on a worker, where it fails with Cannot decode json string. I've made a small example script in my repo that just calls mergeFromJsonString on a valid JSON string and then calls serializeToJsonString on the resulting message. After the valid JSON there is an unexpected null character.

If I run mergeFromJsonString again on the result, I get GPBDecodeException: Error occurred during parsing: Cannot decode json string

What version of protobuf and what language are you using?

Protoc 3.20.2, PHP 7.4, google/protobuf 3.21.6

What operating system (Linux, Windows, ...) and version? Linux

What runtime / compiler are you using (e.g., python version or gcc version) PHP 7.4

What did you do? Steps to reproduce the behavior:

I have reproduced this here: https://github.com/keboola/php-storage-driver-teradata/blob/zajca-kbc-2874/test.php This is the only object where I have encountered it: https://github.com/keboola/php-storage-driver-teradata/blob/zajca-kbc-2874/proto/table.proto#L199

git clone https://github.com/keboola/php-storage-driver-teradata.git test
cd test
git checkout zajca-kbc-2874
docker build -t test -f min.Dockerfile .
docker run test php test.php

What did you expect to see

Valid json.

'{"credentials":{"@type":"type.googleapis.com/keboola.storageDriver.credentials.GenericBackendCredentials","host":"xxx","principal":"xxx","secret":"xxx","meta":{"@type":"type.googleapis.com/keboola.storageDriver.credentials.GenericBackendCredentials.TeradataCredentialsMeta","database":"xxx"}},"command":{"@type":"type.googleapis.com/keboola.storageDriver.command.table.TableImportFromFileCommand","formatTypeOptions":{"@type":"type.googleapis.com/keboola.storageDriver.command.table.TableImportFromFileCommand.CsvTypeOptions","columnsNames":["id","name","city","sex"],"delimiter":",","enclosure":"\\"","compression":"GZIP"},"filePath":{"root":"xxx","path":"xxx","fileName":"xxx"},"fileCredentials":{"@type":"type.googleapis.com/keboola.storageDriver.command.table.ImportExportShared.S3Credentials","key":"xxx","secret":"xxx","region":"eu-central-1"},"destination":{"path":["xxx"],"tableName":"xxx"},"importOptions":{"timestampColumn":"_timestamp","numberOfIgnoredLines":1,"dedupType":"INSERT_DUPLICATES"},"meta":{"@type":"type.googleapis.com/keboola.storageDriver.command.table.TableImportFromFileCommand.TeradataTableImportMeta"}}}'

What did you see instead?

JSON with an extra, unexpected character at the end (...}' . "\0" . ''), which makes the JSON invalid:

'{"credentials":{"@type":"type.googleapis.com/keboola.storageDriver.credentials.GenericBackendCredentials","host":"xxx","principal":"xxx","secret":"xxx","meta":{"@type":"type.googleapis.com/keboola.storageDriver.credentials.GenericBackendCredentials.TeradataCredentialsMeta","database":"xxx"}},"command":{"@type":"type.googleapis.com/keboola.storageDriver.command.table.TableImportFromFileCommand","formatTypeOptions":{"@type":"type.googleapis.com/keboola.storageDriver.command.table.TableImportFromFileCommand.CsvTypeOptions","columnsNames":["id","name","city","sex"],"delimiter":",","enclosure":"\\"","compression":"GZIP"},"filePath":{"root":"xxx","path":"xxx","fileName":"xxx"},"fileCredentials":{"@type":"type.googleapis.com/keboola.storageDriver.command.table.ImportExportShared.S3Credentials","key":"xxx","secret":"xxx","region":"eu-central-1"},"destination":{"path":["xxx"],"tableName":"xxx"},"importOptions":{"timestampColumn":"_timestamp","numberOfIgnoredLines":1,"dedupType":"INSERT_DUPLICATES"},"meta":{"@type":"type.googleapis.com/keboola.storageDriver.command.table.TableImportFromFileCommand.TeradataTableImportMeta"}}}' . "\0" . ''

This is what I see in the debugger in my app: (image)

Anything else we should know about your project / environment

I discovered this when deserializing messages in Temporal, but the problem is somewhere in the PHP protobuf serialization, as test.php above demonstrates.

It looks like jsonByteSize is wrong and there is one extra byte in the buffer: (image)
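To confirm the extra byte without a debugger, one option is to hex-dump the tail of the serialized string. This is only a sketch: `tailHex` is a hypothetical helper, not part of google/protobuf, and the `$json` literal stands in for the output of serializeToJsonString.

```php
<?php
// Hypothetical helper (not part of google/protobuf):
// returns the last $n bytes of $s as a hex string.
function tailHex(string $s, int $n = 4): string
{
    return bin2hex(substr($s, -$n));
}

// Stand-in for the serializer output; in a real check this
// would be the return value of $message->serializeToJsonString().
$json = '{"@type":"type.googleapis.com/TEST"}' . "\0";

// A trailing "00" in the dump means a raw NUL byte follows the closing brace.
echo tailHex($json), "\n";

// Direct check for the stray byte.
$hasTrailingNul = substr($json, -1) === "\0";
var_dump($hasTrailingNul);
```

Comparing strlen($json) with the length of the visible JSON text shows the same one-byte difference.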

Sep 15 '22 19:09 zajca

I have done some debugging, and it looks like the problem is somewhere in the fileCredentials field:

        "fileCredentials": {
            "@type": "type.googleapis.com/keboola.storageDriver.command.table.ImportExportShared.S3Credentials"
        }

When I remove this type from the test JSON, the output is OK. I have updated the test file.

Sep 16 '22 08:09 zajca

It looks like the problem is in the serialization of Any.

I have this message:

message TEST {
  string val = 1;
}

I can do:

$m = new TEST();
$json = $m->serializeToJsonString();
$m = new TEST();
$m->mergeFromJsonString($json);

// so far OK

$any = new \Google\Protobuf\Any();
$any->pack($m);

$json = $any->serializeToJsonString();
var_export($json); // -> '{"@type":"type.googleapis.com/TEST"}' . "\0" . ''
$any = new \Google\Protobuf\Any();
$any->mergeFromJsonString($json); // -> fails: Uncaught Google\Protobuf\Internal\GPBDecodeException: Error occurred during parsing: Cannot decode json string

It looks like the problem occurs when serializing an Any whose packed message has a string field that is not set. The same error occurs if I mark the field as optional.
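For anyone stuck on an affected version, a possible stopgap is to strip trailing NUL bytes before passing the string on. This is a sketch of a workaround, not the upstream fix: `stripTrailingNul` is a hypothetical helper, and the `$raw` literal stands in for the output of serializeToJsonString. A raw NUL byte is never valid at the end of a JSON text, so trimming it should not change valid output.

```php
<?php
// Hypothetical workaround helper (not part of google/protobuf):
// removes only trailing NUL bytes and leaves everything else untouched.
function stripTrailingNul(string $json): string
{
    return rtrim($json, "\0");
}

// Stand-in for $any->serializeToJsonString() on an affected version.
$raw = '{"@type":"type.googleapis.com/TEST"}' . "\0";
$clean = stripTrailingNul($raw);

// Sanity check with PHP's own JSON parser before handing the string on.
$decoded = json_decode($clean, true);
if (json_last_error() !== JSON_ERROR_NONE) {
    throw new RuntimeException('Invalid JSON: ' . json_last_error_msg());
}

echo $decoded['@type'], "\n";
```

The cleaned string can then be passed to mergeFromJsonString as in the example above.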

Sep 16 '22 09:09 zajca

Fixed in https://github.com/protocolbuffers/protobuf/pull/10595

Oct 11 '22 09:10 zajca