[#405] Convert object to/from JSON
The field IDs for `Instant` and `Date` were ignored during the conversion, so the generated JSON was an empty string. The fix was a bit more convoluted; I'll explain what I have, and maybe there's a better solution.
- An `Instant` is converted by the `WrappedMessage` to `byte[]` by writing one `int64` field for the seconds and an `int32` for the nanos.
- The Proto to JSON conversion now needs to read two fields. The `message-wrapping.proto` file describes the seconds part as a `oneOf`, which writes the key as `"_value"`. We still need to write two fields, so we end up with `{"_value": 0, "wrappedInstantNanos": 0}` without any typing.
- We could use `WrappedMessage` as the type, but the conversion from JSON to `Instant` would fail, as it creates the `Instant` inside the `WrappedMessage` object. So we need an explicit type for the conversion to create the correct object.
Since we need this conversion from JSON to `Instant` to work, I've created an adapter in the types module. During the Proto to JSON conversion, we require this adapter to be registered so we can write the field names and the type correctly.
We are still left with converting JSON to Proto `byte[]`. We need to verify in the `WrappedMessage` class whether this object corresponds to a known type. Otherwise, the generated `byte[]` will be entirely different from the one produced by `Instant` to Proto `byte[]` serialization.
The `Date` object suffers from the same problem. The differences are:
- We serialize a `Date` as an `int64` taken from the `Date#getTime` method.
- This creates JSON like `{"_type": "int64", "_value": 12345}`, which loses the information that this value is a `Date` object.
- The conversion from JSON to `Date` will fail, as it returns a primitive `long` number instead of the `Date` object.
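To make the round-trip concrete, here's a minimal plain-Java sketch (names are illustrative) of why an explicit rebuild step is needed once the type information is lost:

```java
import java.util.Date;

// Date goes on the wire as a single int64 from Date#getTime, so the JSON
// side only sees {"_type": "int64", "_value": 12345}; getting a Date back
// requires an explicit conversion from that long.
public class DateFields {

   static long toWire(Date date) {
      return date.getTime();
   }

   static Date rebuild(long millis) {
      return new Date(millis);
   }

   public static void main(String[] args) {
      Date original = new Date(12345L);
      Date back = rebuild(toWire(original));
      System.out.println(original.equals(back)); // prints "true"
   }
}
```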
Closes #405.
Directly wrapping objects with the `WrappedMessage` leads to:

```json
// new WrappedMessage(new WrappedMessage(UUID.randomUUID()))
{
  "_type": "org.infinispan.protostream.WrappedMessage",
  "_value": {
    "_type": "org.infinispan.protostream.WrappedMessage",
    "_value": {
      "_type": "org.infinispan.protostream.commons.UUID",
      "mostSigBitsFixed": 15461222357418722979,
      "leastSigBitsFixed": 13070220576097665502
    }
  }
}
```
Using it with collections:

```json
// List.of(new WrappedMessage(UUID.randomUUID()), new WrappedMessage(UUID.randomUUID()))
{
  "_type": "collections.ListOfAdapter",
  "elements": [
    {
      "_type": "org.infinispan.protostream.WrappedMessage",
      "_value": {
        "_type": "org.infinispan.protostream.commons.UUID",
        "mostSigBitsFixed": 1624897274251135499,
        "leastSigBitsFixed": 11937877021784957987
      }
    },
    {
      "_type": "org.infinispan.protostream.WrappedMessage",
      "_value": {
        "_type": "org.infinispan.protostream.commons.UUID",
        "mostSigBitsFixed": 12908692626181016473,
        "leastSigBitsFixed": 9445129632214002810
      }
    }
  ]
}
```
I've also removed the default use of `_value` for fields defined in `message-wrapping.proto`:

```json
// new WrappedMessage(1.23f)
{
  "_type": "org.infinispan.protostream.WrappedMessage",
  "wrappedFloat": 1.23
}
```
I'll revert this to a draft. There are some corner cases with more deeply nested objects, collections, and adapters that don't produce the exact same byte array for the JSON -> Proto conversion.
And looking at the test failures, the `_value` field is expected in several places :/
Some containers were completely ignored by the JSON serialization (`ArrayList`, `HashSet`). I'll work on adding those as well.
Some time later, more than I would like. I've refactored the old code. Instead of having everything (parse, write, prettify, etc.) spread all around, I've created specialized classes for each type we expect in the byte stream: primitives, nested objects, repeated fields, maps, and containers.
This uses the specialized classes with internal delegates, so each class can focus on doing its part and passing the request along. For example, for a list of UUIDs: the root parser passes the request to an object writer for the adapter, which delegates to an array writer, which delegates each UUID to an object writer.
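The delegation chain can be sketched roughly like this (the class names are illustrative, not the PR's actual classes):

```java
import java.util.List;
import java.util.UUID;

// Sketch of the writer delegation: an array writer handles the list
// structure and hands each element to an object writer, so each class
// only does its own part of the JSON.
public class DelegatingWriters {

   interface Writer {
      void write(Object value, StringBuilder out);
   }

   static class ObjectWriter implements Writer {
      public void write(Object value, StringBuilder out) {
         out.append('"').append(value).append('"');
      }
   }

   static class ArrayWriter implements Writer {
      private final Writer element;

      ArrayWriter(Writer element) {
         this.element = element;
      }

      public void write(Object value, StringBuilder out) {
         out.append('[');
         List<?> list = (List<?>) value;
         for (int i = 0; i < list.size(); i++) {
            if (i > 0) out.append(',');
            // Delegate each element to the inner writer.
            element.write(list.get(i), out);
         }
         out.append(']');
      }
   }

   public static void main(String[] args) {
      Writer root = new ArrayWriter(new ObjectWriter());
      StringBuilder out = new StringBuilder();
      root.write(List.of(UUID.fromString("00000000-0000-0000-0000-000000000001")), out);
      System.out.println(out); // prints ["00000000-0000-0000-0000-000000000001"]
   }
}
```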
The writers pass the context along so we can track where we are in the JSON. That leads to the other change: instead of passing a `StringBuilder` around and writing directly, we tokenize the byte stream by pushing tokens from the JSON specification into a list. After reading the stream, we write the tokens in order to produce the final JSON.
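The token approach can be sketched like this (a much-simplified stand-in; the token kinds and the `render` method are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of tokenizing: push JSON tokens into a list while reading the
// byte stream, then render them in order once reading is finished.
public class JsonTokens {

   enum Kind { BEGIN_OBJECT, END_OBJECT, NAME, STRING, NUMBER }

   record Token(Kind kind, String text) {}

   static String render(List<Token> tokens) {
      StringBuilder sb = new StringBuilder();
      boolean first = true;
      for (Token t : tokens) {
         switch (t.kind()) {
            case BEGIN_OBJECT -> { sb.append('{'); first = true; }
            case END_OBJECT -> { sb.append('}'); first = true; }
            case NAME -> {
               if (!first) sb.append(',');
               sb.append('"').append(t.text()).append("\":");
               first = false;
            }
            case STRING -> sb.append('"').append(t.text()).append('"');
            case NUMBER -> sb.append(t.text());
         }
      }
      return sb.toString();
   }

   public static void main(String[] args) {
      List<Token> tokens = new ArrayList<>();
      tokens.add(new Token(Kind.BEGIN_OBJECT, null));
      tokens.add(new Token(Kind.NAME, "_type"));
      tokens.add(new Token(Kind.STRING, "org.infinispan.protostream.WrappedMessage"));
      tokens.add(new Token(Kind.NAME, "wrappedFloat"));
      tokens.add(new Token(Kind.NUMBER, "1.23"));
      tokens.add(new Token(Kind.END_OBJECT, null));
      System.out.println(render(tokens));
      // prints {"_type":"org.infinispan.protostream.WrappedMessage","wrappedFloat":1.23}
   }
}
```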
The parsing of JSON to byte array remains more or less the same as before, plus fixes to handle more types. This required changes to start over from the beginning when parsing a document, and we need to pass along which field number each value belongs to. The downside is that we descend to the bottom of the JSON and build the output back up in nested buffers. This causes some extra allocations in the places where we need to wrap a byte array with `WRAPPED_MESSAGE` and the field number.
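The bottom-up buffering can be sketched like this (the framing here is a deliberately simplified stand-in for the real Protobuf wire format, just to show where the extra allocation per nesting level comes from):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

// Sketch of building bottom-up: the innermost object is serialized into
// its own buffer first, and each outer level allocates a new buffer to
// wrap those bytes as a length-delimited field.
public class NestedBuffer {

   // Toy length-delimited framing: one tag byte, one length byte, payload.
   static byte[] wrap(int fieldTag, byte[] inner) throws IOException {
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      out.write(fieldTag);
      out.write(inner.length);
      out.write(inner);
      return out.toByteArray();
   }

   public static void main(String[] args) throws IOException {
      byte[] innermost = {1, 2, 3};
      // Each wrap() is one extra allocation on the way back up.
      byte[] wrapped = wrap(0x0A, innermost);
      System.out.println(wrapped.length); // prints 5
   }
}
```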
The changes here are pretty much backwards compatible with what we had before. A JSON written in a previous version should be decoded just fine. Observe that we only changed the expected JSON in a single test: a test that uses an ArrayList container. Everything else is the same.
@jabolina would this allow pushing JSON from the client and converting it to/from ProtoStream? In Hot Rod, I mean
@karesti, correct. A sample application doing that with the Hot Rod client:
```java
public class EncodingRemoteCache {

   private static final String CACHE_NAME = "test-cache";

   private static void registerMagazineSchemaInTheServer(RemoteCacheManager cacheManager, GeneratedSchema schema) throws IOException {
      // Retrieve metadata cache
      RemoteCache<String, String> metadataCache =
            cacheManager.getCache(PROTOBUF_METADATA_CACHE_NAME);

      // Define the new schema on the server too
      metadataCache.put(schema.getProtoFileName(), schema.getProtoFile());
   }

   public static ConfigurationBuilder connectionConfig() {
      ConfigurationBuilder builder = new ConfigurationBuilder();
      builder.addServer()
            .host("127.0.0.1").port(11222)
            .security()
            .authentication()
            .username("admin")
            .password("password");

      String configuration = """
            localCache:
              name: "test-cache"
              statistics: "true"
              encoding:
                mediaType: "application/x-protostream"
            """;
      builder.remoteCache(CACHE_NAME).configuration(configuration);
      return builder;
   }

   public static void main(String[] args) throws IOException {
      GeneratedSchema userSchema = new UserModelSerializationContextInitializerImpl();
      ProtoStreamMarshaller marshaller = new ProtoStreamMarshaller();
      marshaller.register(userSchema);

      ConfigurationBuilder cb = connectionConfig();
      cb.marshaller(marshaller);

      RemoteCacheManager rcm = new RemoteCacheManager(cb.build());
      registerMagazineSchemaInTheServer(rcm, userSchema);

      RemoteCache<String, UserModel> userCache = rcm.getCache(CACHE_NAME);
      RemoteCache<String, String> jsonCache = rcm.getCache(CACHE_NAME)
            .withDataFormat(DataFormat.builder()
                  .valueType(MediaType.APPLICATION_JSON)
                  .valueMarshaller(new UTF8StringMarshaller())
                  .build());

      UserModel user = new UserModel("jabolina", "jose", 28, Instant.EPOCH);
      userCache.put("jose", user);

      System.out.println("GET PROTO: " + userCache.get("jose"));
      System.out.println("GET JSON: " + jsonCache.get("jose"));
      rcm.stop();
   }

   @Proto
   public record UserModel(String username, String name, int age, Instant creation) {

      @ProtoSchema(
            syntax = ProtoSyntax.PROTO3,
            dependsOn = {
                  CommonContainerTypes.class,
            },
            includeClasses = UserModel.class,
            schemaFileName = "user.proto",
            schemaFilePath = "proto/",
            schemaPackageName = "user"
      )
      public interface UserModelSerializationContextInitializer extends GeneratedSchema {}
   }
}
```
The output of the two `System.out` calls is:

```
GET PROTO: UserModel[username=jabolina, name=jose, age=28, creation=1970-01-01T00:00:00Z]
GET JSON: {"_type":"user.UserModel","username":"jabolina","name":"jose","age":28,"creation":0}
```
I believe the marshaller here is only necessary for the `put` operation. The server does the work of transforming the model to JSON; in theory, the Hot Rod client doesn't need to know about the Proto descriptor.
Also, the changes in the PR already exist in previous versions of ProtoStream. The current PR is only refactoring that part.
@jabolina I believed it was something else, like really posting JSON (a `JSONObject`, for example) and getting it converted!
Thanks @jabolina