Double quoted keys for custom struct json write
I noticed that for structs with a custom JSON write, objects used as keys are double quoted. For instance, in json_test, custom_struct fails this test:
using custom_struct_map = std::map<custom_struct, bool>;
custom_struct_map obj_map{{custom_struct{"hello"}, true}, {custom_struct{"world"}, false}};
expect(not glz::write_json(obj_map, s));
expect(s == R"({"hello":true,"world":false})");
In json/write.hpp, when we write a key/value pair whose key has a custom write that delegates to the string write, the condition of this if constexpr is false. The key is therefore wrapped in quoted_t, which serializes it into a string that is itself quoted again, resulting in a double-quoted string:
template <opts Opts, class Key, class Value, is_context Ctx>
GLZ_ALWAYS_INLINE void write_pair_content(const Key& key, Value&& value, Ctx& ctx, auto&&... args)
{
   if constexpr (str_t<Key> || char_t<Key> || glaze_enum_t<Key> || Opts.quoted_num) {
      write<JSON>::op<Opts>(key, ctx, args...);
   }
   else {
      write<JSON>::op<opt_false<Opts, &opts::raw_string>>(quoted_t<const Key>{key}, ctx, args...);
   }
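To make the effect concrete, here is a minimal standalone model of the double quoting. It does not use Glaze: write_json_string, custom_write, write_key_quoted and demo are hypothetical stand-ins, and the only assumption is that the else branch behaves like "serialize the key, then write the result as a JSON string".

```cpp
#include <cassert>
#include <string>

// Stand-in for a JSON string writer: wraps the text in quotes and escapes
// embedded quotes and backslashes.
inline std::string write_json_string(const std::string& s) {
    std::string out = "\"";
    for (char c : s) {
        if (c == '"' || c == '\\') out += '\\';
        out += c;
    }
    out += '"';
    return out;
}

// Hypothetical custom type whose custom writer delegates to the string writer.
struct custom_struct {
    std::string str;
};

inline std::string custom_write(const custom_struct& v) {
    return write_json_string(v.str);  // already a quoted JSON string
}

// Models the else branch: the serialized key is written as a string again.
inline std::string write_key_quoted(const custom_struct& key) {
    return write_json_string(custom_write(key));
}

inline void demo() {
    const custom_struct key{"hello"};
    assert(custom_write(key) == R"("hello")");          // fine as a value
    assert(write_key_quoted(key) == R"("\"hello\"")");  // double quoted as a key
}
```

Calling demo() from main exercises both paths; the second assertion reproduces the shape of the failing key.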
In my application, I try to guess the context by scanning the buffer for the previous ", : or [ to decide whether I need to add quotes, but it's not very practical.
I tried to figure out how client code could tell whether it is writing a key or a value. In JSON, keys must always be quoted, but that is not true for values such as bool / int. Maybe one solution could be to add a hint to the context saying whether we are writing a key or a value, so the writer knows whether it needs to add the quotes.
WDYT ?
It seems like you are not wanting the custom_struct to be the key, but rather for the maps to be merged. You can get close to what you want with using custom_struct_map = std::vector<std::pair<custom_struct, bool>>;. One way of merging is to use a vector of pairs. However, this is still double quoting the keys, so I need to look into this issue in more detail.
Boy, this is tricky. I've got a firm grasp on the problem now. You were right to point out the quoted_t logic, which is being applied to the custom serialization.
Normally Glaze tries to identify the type of JSON output based on the C++ type. In this case the C++ type is custom and therefore we can't draw conclusions about what type of JSON output it will generate. This seems to imply that we ought to have a way of tagging the type of the JSON that is being produced by our custom serializer. The tricky thing is that this creates two paths to JSON type handling, the path for the C++ concepts that make the value readable and writable in JSON (such as operator[key] for maps) and the path that indicates what kind of data is serialized separate from the C++ type. It would be a bad idea to look at the serialized data to determine the type (inefficient and complex). So, we need a way of tagging the type. But, custom serialization is intentionally built around C++ concepts.
I think it is best to use the glz::meta approach with glz::custom, because this separates the type metadata from the custom serialization. Within the glz::meta we already have terms like custom_write, but this is only to avoid ambiguous partial specialization. What we want is to be able to tag the glz::meta with something like string_type to indicate that a JSON string will be serialized.
If your custom_type just didn't add the quotes, there wouldn't be a problem, but I can understand how the quotes might be desirable in different contexts.
I think supporting glz::meta with type tagging would be a valid solution, but there is also the question if it should be another specialization, such as glz::indicate_type<custom_struct, glz::string_type>. If we used a specialization other than glz::meta then we could use it in conjunction with the to/from specializations and not have to rework the custom serialization into glz::custom.
I need to give this more thought, but I think a general solution would be broadly useful.
Sorry about that :)
Would it be possible to allow custom JSON write for strings only and adapt the Glaze code in write_pair_content accordingly? This way, we know that Glaze always needs to quote the value, and client code in the to<JSON> override does not need to care about adding quotes (they will already be added, whether it's a key or a value). The code to fix the issue would be neither complicated nor expensive in terms of processing.
However, I agree that it would be a step back from the model you provided, which lets us write custom int / bool / float values. I think the need for custom JSON writing of non-string values is quite rare anyway.
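A rough standalone model of this "strings only" idea, just to show the shape: the custom hook returns unquoted text and the library side adds the quotes once, for keys and values alike. Nothing here is Glaze API; to_text, quote, write_key, write_value and demo are hypothetical names.

```cpp
#include <cassert>
#include <string>

struct custom_struct {
    std::string str;
};

// Hypothetical custom hook: returns the raw text, never any quotes.
inline std::string to_text(const custom_struct& v) { return v.str; }

// Library side: quotes and escapes exactly once.
inline std::string quote(const std::string& s) {
    std::string out = "\"";
    for (char c : s) {
        if (c == '"' || c == '\\') out += '\\';
        out += c;
    }
    out += '"';
    return out;
}

// Keys and values take the same path, so the result is identical in both roles.
template <class T>
std::string write_key(const T& v) { return quote(to_text(v)); }

template <class T>
std::string write_value(const T& v) { return quote(to_text(v)); }

inline void demo() {
    const custom_struct v{"hello"};
    assert(write_key(v) == R"("hello")");
    assert(write_value(v) == R"("hello")");
}
```

The trade-off is the one mentioned above: a hook shaped like this cannot produce unquoted numbers, booleans or null.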
I think an elegant approach might be to define a C++ type with known format association to indicate how this custom type will serialize data. Maybe it should be named something like mimic_type?
template <>
struct glz::indicate_type<custom_type>
{
   using type = std::string; // indicates that the format will appear like a std::string, but with custom read/write
};
I realize this can exist inside of the glz::meta, because glz::meta won't collide with a custom to/from if it only specifies a type behavior.
Hence I'm thinking of using the syntax:
template <>
struct glz::meta<custom_type>
{
   using mimic = std::string;
};
I feel like mimic is better than type here, but I'm still thinking through how this should be named.
I like your solution of a mimic type. But I would rather see a value from an enum class instead, to avoid ambiguity, for instance between string types (std::string_view, std::string). This enum would define the basic JSON types like string, integer, float and boolean. We also need to take care of the special null value, which should not be quoted.
I can see merits of an enum class. I'm thinking about how this applies to other specifications beyond JSON. But, I think these could be special enumerations for different use cases.
I think I agree and I propose:
template <>
struct glz::meta<custom_type>
{
   static constexpr auto mimic = mimic_type::string;
};
Is this issue resolved?
Aspects of this issue have been resolved and pair writing has been improved, but the mimicking proposal outlined here for custom types has not yet been implemented.
Hi! I just started using glaze, and I'm amazed by how easy it is to serialize and deserialize structs with no boilerplate! Thanks a lot!
I'm using http://github.com/gershnik/modern-uuid as my UUID library. And it has simple translators to and from std::string and I thought I can easily imitate your https://github.com/stephenberry/glaze/blob/main/docs/custom-serialization.md#uuid-example example.
I've added the following specialization in the glz namespace:
template <>
struct to<JSON, muuid::uuid> {
    template <auto Opts>
    static void op(const muuid::uuid& uuid, auto&&... args) noexcept {
        std::string str = uuid.to_string();
        serialize<JSON>::op<Opts>(str, args...);
    }
};
And this made my struct that contains the UUID serializable by glaze!
However, I'm experiencing the problem described in this issue. Please see my serialized JSON:
{
  "entries": {
    "\"6b7e1504-5a58-50d2-bded-0041d67ad406\"": {
      "type": "ShaderStage",
      "srcPath": "shaders/unlit.vert",
      "dstPath": "processed/77a8b13e.vert.shaderStage.beve"
    },
    "\"0af88fd5-e77a-5f37-b625-057e49b958ad\"": {
      "type": "ShaderStage",
      "srcPath": "shaders/lit.vert",
      "dstPath": "processed/f4f2a7a8.vert.shaderStage.beve"
    }
  },
  "aliases": {
    "shaders/unlit": "6b7e1504-5a58-50d2-bded-0041d67ad406",
    "shaders/lit": "0af88fd5-e77a-5f37-b625-057e49b958ad"
  }
}
from the following C++ schema:
using AssetUuid = muuid::uuid;
struct AssetEntry {
    DefinitionType type;
    std::filesystem::path srcPath;
    std::filesystem::path dstPath;
    std::optional<std::string> subAssetName;
    std::optional<std::vector<AssetUuid>> dependencies;
};
struct AssetRegistry {
    std::unordered_map<AssetUuid, AssetEntry> entries;
    std::unordered_map<std::string, AssetUuid> aliases;
};
As you can see, when the UUID is used as a value in a map it looks fine, but when it is used as a key it acquires these extra double quotes.
Could you suggest a workaround until a more streamlined solution is implemented? Maybe there is a lower-level way of serializing, lower than serialize<JSON>::op<Opts>(str, args...);, with which I can prevent the extra double quotes?
Thanks!
I dug into the glaze code with a debugger and observed that execution reaches this function in glaze/include/glaze/json/write.hpp:
template <auto Opts, class Key, class Value, is_context Ctx, class B>
GLZ_ALWAYS_INLINE void write_pair_content(const Key& key, Value&& value, Ctx& ctx, B&& b, auto&& ix)
{
   if constexpr (str_t<Key> || char_t<Key> || glaze_enum_t<Key> || Opts.quoted_num) {
      to<JSON, core_t<Key>>::template op<Opts>(key, ctx, b, ix);
   }
   else if constexpr (num_t<Key>) {
      serialize<JSON>::op<opt_true<Opts, &opts::quoted_num>>(key, ctx, b, ix);
   }
   else {
      serialize<JSON>::op<opt_false<Opts, &opts::raw_string>>(quoted_t<const Key>{key}, ctx, b, ix);
   }
With regular strings it goes through the first if branch, so keys come out fine, but with modern-uuid it goes through the else branch and the keys get extra quotes. So I thought that if I could make modern-uuid's uuid class a str_t, it could also enjoy the correct serialization.
In glaze/include/glaze/core/common.hpp I saw that str_t is a concept! (My first time encountering a C++ concept in the wild!)
template <class T>
concept str_t = (!std::same_as<std::nullptr_t, T> && std::constructible_from<std::string_view, std::decay_t<T>>) ||
array_char_t<T>;
With minimal intrusion, and without changing the modern-uuid or glaze codebases, I've introduced the following wrapper class, glaze_uuid, which can be converted to std::string_view:
namespace aur {
struct glaze_uuid : muuid::uuid {
    using muuid::uuid::uuid;
    operator std::string_view() const {
        // The buffer must own the characters: storing the temporary returned by
        // to_string() in a std::string_view would leave the view dangling.
        // The returned view stays valid until the next conversion on this thread.
        thread_local std::string buffer;
        buffer = this->to_string();
        return buffer;
    }
    glaze_uuid(const muuid::uuid& u)
        : muuid::uuid(u) {}
    // shouldn't be needed
    // operator muuid::uuid() const { return *this; }
};
} // namespace aur
namespace std {
template <>
struct hash<aur::glaze_uuid> {
    size_t operator()(const aur::glaze_uuid& uuid) const noexcept { return std::hash<muuid::uuid>()(uuid); }
};
} // namespace std
namespace glz {
template <>
struct to<JSON, aur::glaze_uuid> {
    template <auto Opts>
    static void op(const aur::glaze_uuid& uuid, auto&&... args) noexcept {
        std::string str = uuid.to_string();
        serialize<JSON>::op<Opts>(str, args...);
        // glz::write<glz::raw_json>(Opts(args...), str);
    }
};
} // namespace glz
and used this new glaze_uuid in my registry
struct AssetRegistry {
std::unordered_map<AssetUuid, AssetEntry> entries;
std::unordered_map<std::string, AssetUuid> aliases;
};
now I don't have the extra double quotes in keys! ^_^
I think for the rest of the code-base I can continue using the original muuid::uuid and just when serializing and deserializing I can use glaze_uuid.
I'm leaving this here for people who'll also encounter the same issue, just in case.
However, if there is a better method, please let me know ^_^
@vug, thanks for sharing your workaround! This is a smart solution until we get type mimicking implemented.
I do think it is best to work with C++ concepts to achieve type mimicry with wrappers, which is what you've done here. But, there are probably still use cases where it is cleaner to just tell the compiler to treat some type like a string type for output.
Just encountered the same thing with Qt QString. Having a built-in way to deal with these types of issues would be great. So it is a +1 from me as well.
I ended up using @vug's workaround for this, but I also added the std::size_t size() const and const char* data() const methods to my custom ID type (which I was already using) in order to satisfy beve.