
Duplicate keys in mapping

Open • sstroemer opened this issue 1 year ago • 1 comment

According to the YAML specification, mapping keys must be unique: "The content of a mapping node is an unordered set of key/value node pairs, with the restriction that each of the keys is unique."

YAML.jl does not enforce this, so a duplicate key silently overwrites the earlier entry:

using YAML

YAML.load("x: 3")         # Dict{Any, Any} with 1 entry: "x" => 3
YAML.load("x: 3\nx: 4")   # Dict{Any, Any} with 1 entry: "x" => 4     (<-- this is wrong)
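
The overwrite is simply what indexed assignment on a plain `Dict` does. Here is a minimal Base-only sketch, assuming the loader stores each parsed pair with `mapping[key] = value` (the workaround below relies on that). The variable names are made up for illustration:

```julia
# Base-only illustration; YAML.jl is not needed to reproduce the effect.
# Assumption: the loader stores each parsed pair with `mapping[key] = value`.
pairs_in_document = ["x" => 3, "x" => 4]   # stand-in for the pairs in "x: 3\nx: 4"

mapping = Dict{Any, Any}()
for (key, value) in pairs_in_document
    mapping[key] = value   # the second assignment replaces the first, no warning
end

mapping   # only the last value stored under "x" is left
```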

A bandaid fix (if anyone stumbles on this) would be:

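# Base.@kwdef also works on Julia versions that do not export @kwdef (before 1.9).
Base.@kwdef struct UniqueKeyDict{T1, T2} <: AbstractDict{T1, T2}
    dict::Dict{T1, T2} = Dict{T1, T2}()
end

# Refuse to overwrite: raise an error instead of silently replacing an existing key.
function Base.setindex!(ukd::UniqueKeyDict, value, key)
    haskey(ukd.dict, key) && error("Key $key already exists in dictionary.")
    ukd.dict[key] = value
    return ukd  # setindex! conventionally returns the collection
end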

YAML.load("x: 3")                                               # Dict{Any, Any} with 1 entry: "x" => 3
YAML.load("x: 3\nx: 4")                                         # Dict{Any, Any} with 1 entry: "x" => 4
YAML.load("x: 3"; dicttype=UniqueKeyDict{Any, Any}).dict        # Dict{Any, Any} with 1 entry: "x" => 3
YAML.load("x: 3\nx: 4"; dicttype=UniqueKeyDict{Any, Any}).dict  # throws an error

Note that the internal dict field can then be used for further processing. Similar workarounds are of course valid for OrderedDict or similar types.
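
If reaching into the `dict` field is inconvenient, the wrapper can forward the small read interface that `AbstractDict` expects, so it can be indexed, iterated and printed directly. This is a rough sketch rather than a tested patch; `StrictDict` is a hypothetical name and is not part of YAML.jl:

```julia
# Hypothetical wrapper type: a Dict that rejects duplicate keys on assignment.
struct StrictDict{K, V} <: AbstractDict{K, V}
    dict::Dict{K, V}
end

# Zero-argument constructor, so the type can be called as `StrictDict{Any, Any}()`.
StrictDict{K, V}() where {K, V} = StrictDict{K, V}(Dict{K, V}())

# Forward the read side to the wrapped Dict; the generic AbstractDict
# fallbacks for getindex, haskey, keys, values and show build on these.
Base.length(d::StrictDict) = length(d.dict)
Base.iterate(d::StrictDict, state...) = iterate(d.dict, state...)
Base.get(d::StrictDict, key, default) = get(d.dict, key, default)

# Write side: throw instead of overwriting an existing key.
function Base.setindex!(d::StrictDict, value, key)
    haskey(d.dict, key) && throw(ArgumentError("duplicate key: $(repr(key))"))
    d.dict[key] = value
    return d
end

d = StrictDict{Any, Any}()
d["x"] = 3   # first assignment is accepted
d["x"]       # reads go through the forwarded `get`
```

Passed as `dicttype=StrictDict{Any, Any}`, this should behave like the wrapper above, with the difference that the result can be used directly instead of through `.dict`.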

sstroemer • May 28 '24 18:05

Using insert! from Dictionaries.jl is how I would do it (though to be clear, I'm not advocating adding that dependency).

kescobo • May 29 '24 13:05