JSON3.jl
Mapping subtypes via object keys
I'm having issues identifying a sane way to manage an input dataset.
One object expects multiple types of subobjects; for this MWE we'll choose two simple ones:
flatcron = """
{
    "cron": {
        "repo": "TestCron"
    }
}
"""
flatpfs = """
{
    "pfs": {
        "repo": "TestPFS",
        "glob": "*"
    }
}
"""
I can parse these with
using JSON3, StructTypes

struct PFS
    repo::String
    glob::String
end
struct Cron
    repo::String
end
# Exactly one of the two fields is expected to be populated.
struct Input
    pfs::Union{Nothing, PFS}
    cron::Union{Nothing, Cron}
end
StructTypes.StructType(::Type{Input}) = StructTypes.Struct()
StructTypes.StructType(::Type{PFS}) = StructTypes.Struct()
StructTypes.StructType(::Type{Cron}) = StructTypes.Struct()
but I know that I'll always have one of these options and never more. So I'd like to move to something like
abstract type InputFlat end
struct PFSFlat <: InputFlat
    repo::String
    glob::String
end
struct CronFlat <: InputFlat
    repo::String
end
StructTypes.StructType(::Type{InputFlat}) = StructTypes.AbstractType()
StructTypes.StructType(::Type{PFSFlat}) = StructTypes.Struct()
StructTypes.StructType(::Type{CronFlat}) = StructTypes.Struct()
StructTypes.subtypes(::Type{InputFlat}) = (pfs=PFSFlat, cron=CronFlat)
This of course fails, as I'm missing the subtypekey for InputFlat, since I have multiple (i.e. pfs and cron)... Is there a way to map these values such that
cron_parse = JSON3.read(flatcron, InputFlat)
cron_parse.repo # TestCron
pfs_parse = JSON3.read(flatpfs, InputFlat)
pfs_parse.glob # *
Hmmm... this is tricky. I've been thinking through various solutions here. The designed use of subtypekey is slightly different, however: there, the value stored under the subtypekey determines which type to parse, whereas in this case the key itself would determine the type.
I'll have to think a bit further, but one idea is that we might allow passing a function to subtypekey that would be called on the keys being parsed and could return a type to be parsed.
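Until something along those lines exists, one possible workaround is to parse generically first and then dispatch on whichever top-level key is present. The following is only a minimal sketch: `read_input` and `KEY_TO_TYPE` are hypothetical names, not part of JSON3.jl, and the inner object is re-serialised before the typed read, which is simple but not the most efficient route.

```julia
using JSON3, StructTypes

abstract type InputFlat end
struct PFSFlat <: InputFlat
    repo::String
    glob::String
end
struct CronFlat <: InputFlat
    repo::String
end
StructTypes.StructType(::Type{PFSFlat}) = StructTypes.Struct()
StructTypes.StructType(::Type{CronFlat}) = StructTypes.Struct()

# Hypothetical lookup table: top-level JSON key => concrete type to parse.
const KEY_TO_TYPE = (pfs = PFSFlat, cron = CronFlat)

# Hypothetical helper: read the JSON untyped, find the first known key,
# then re-read that key's value as the matching concrete type.
function read_input(json::AbstractString)
    obj = JSON3.read(json)
    for (key, T) in pairs(KEY_TO_TYPE)
        haskey(obj, key) && return JSON3.read(JSON3.write(obj[key]), T)
    end
    error("none of the expected keys $(keys(KEY_TO_TYPE)) were found")
end

cron_parse = read_input("""{"cron": {"repo": "TestCron"}}""")
pfs_parse = read_input("""{"pfs": {"repo": "TestPFS", "glob": "*"}}""")
```

This only covers reading; writing a `PFSFlat` back out would not restore the wrapping "pfs" key unless a matching serialisation step is added.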
Thanks.
I've had a bit of a think about it myself, and I guess what I'm asking for isn't a two-way mapping either. Assuming you'd want serialisation and deserialisation to round-trip, something like this is an easy way to break things.
For the moment, I'm just using a functor on the problem struct:
# Calling an Input returns its first non-nothing field (or nothing if all are empty).
function (input::Input)()
    for field in fieldnames(typeof(input))
        data = getfield(input, field)
        isnothing(data) || return data
    end
end
I am also facing a similar issue. I have a JSON object that contains two fields: a type field and a data field, like so:
{
    "type": "20210101:temp",
    "data": {
        "date": 20210101,
        "min": "19.5",
        "max": "23.5"
    }
}
The trouble is that I have to process the type field first to extract the correct type for data. In this example I have to strip the date prefix from the type field to get the correct type, which makes it impossible to use StructTypes.AbstractType in this scenario. It would be awesome if it were possible to pass a function to subtypekey that returns the correct type, and not just a NamedTuple.
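In the meantime, a two-step read can approximate this: parse the document generically, derive the tag from the type field, and only then parse data into the concrete type. This is a rough sketch under assumptions: `TempReading`, `TAG_TO_TYPE`, and `read_tagged` are hypothetical names, and the tag is assumed to be whatever follows the last ':' in the type field.

```julia
using JSON3, StructTypes

# Hypothetical payload type for the "temp" tag; the name is an assumption.
struct TempReading
    date::Int
    min::String
    max::String
end
StructTypes.StructType(::Type{TempReading}) = StructTypes.Struct()

# Hypothetical lookup table: tag (type field without the date prefix) => type.
const TAG_TO_TYPE = Dict("temp" => TempReading)

# Hypothetical helper: read untyped, compute the tag, then re-read `data`
# as the concrete type selected by the tag.
function read_tagged(json::AbstractString)
    obj = JSON3.read(json)
    tag = String(last(split(obj[:type], ':')))  # "20210101:temp" -> "temp"
    T = TAG_TO_TYPE[tag]
    return JSON3.read(JSON3.write(obj[:data]), T)
end

reading = read_tagged("""
{
    "type": "20210101:temp",
    "data": {"date": 20210101, "min": "19.5", "max": "23.5"}
}
""")
```

The cost is a second pass over the data object, which is the price of computing the type outside the parser.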
Rust's Serde library solves this problem in a different way. It allows not only 'tagged' structures (just like we have here with subtypekey), but also 'untagged' ones: https://serde.rs/enum-representations.html.
There is no explicit tag identifying which variant the data contains.
Serde will try to match the data against each variant in order and
the first one that deserializes successfully is the one returned.
I recently worked with this approach in Rust and I find it quite cool, actually.
In Julia, since we have a predefined set of types to choose from (subtypes of an AbstractType, or passed directly as function arguments), I guess we can always do this efficiently on the fly in one pass of the parsing (by "eliminating" types that do not match).
Implementing it would require digging very deep into the existing code, though.
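As a rough illustration of the "elimination" idea outside the parser, candidates can be filtered by comparing the keys present in the JSON object against each type's field names. This sketch is a simplification: it takes two passes rather than one, it requires an exact match between keys and fields, it is not how Serde works internally, and `read_untagged` is a hypothetical helper rather than a JSON3.jl function. It also operates on the inner objects directly, without the wrapping "pfs" or "cron" key.

```julia
using JSON3, StructTypes

struct PFSFlat
    repo::String
    glob::String
end
struct CronFlat
    repo::String
end
StructTypes.StructType(::Type{PFSFlat}) = StructTypes.Struct()
StructTypes.StructType(::Type{CronFlat}) = StructTypes.Struct()

# Hypothetical helper: return the first candidate whose field names exactly
# match the keys of the JSON object, parsed as that type.
function read_untagged(json::AbstractString, candidates)
    present = Set(keys(JSON3.read(json)))
    for T in candidates
        # Eliminate every type whose fields do not match the keys seen.
        Set(fieldnames(T)) == present || continue
        return JSON3.read(json, T)
    end
    error("no candidate type matches keys $(collect(present))")
end

candidates = (PFSFlat, CronFlat)
untagged_cron = read_untagged("""{"repo": "TestCron"}""", candidates)
untagged_pfs = read_untagged("""{"repo": "TestPFS", "glob": "*"}""", candidates)
```

Exact key matching sidesteps the ordering problem of "first variant that deserializes wins", at the cost of rejecting objects with optional or extra fields.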