go-ipld-prime
go-ipld-prime copied to clipboard
Get Keys of a map without realizing values
Currently, the access pattern API provided for maps is to create a MapIterator
which iterates through the items of the map.
It would be useful to be able to iterate through the keys of the map, without having to allocate and potentially construct the corresponding value ipld.Node objects for each.
Could the same apply to keys? For example, if I only wanted the values, would I save work by not realizing/building the key nodes?
If so, then I'd suggest to put both ipld.Node
values behind functions, like:
type MapIterator interface {
Next() (KeyValue, error)
[...]
}
type KeyValue interface {
Key() Node
Value() Node
}
As long as we say that the returned KeyValue
is only valid until the next iteration, I don't think this will force any extra allocations, becacuse we can simply reuse a single stack-allocated KeyValue
implementation.
I'm mostly ignoring error handling above - I assume that the Key
and Value
methods would need to return errors, and perhaps Next
could never return an error and simply panic if it's called once Done
returns true.
I'm mostly picking up a couple of ideas from https://pkg.go.dev/github.com/dgraph-io/badger/v2#Iterator.
A KeyIterator()
and ValuesIterator()
would be nice, it would match a common pattern used in collections across the programming landscape
I've wondered if we'd be able to get away with not needing this, but it seems like the consensus is no, we need it :) So, okay.
Agree KeyIterator()
and ValuesIterator()
sound like one valid option.
Agree MapIterator{ Next() KV }
with delayed and separated key and value extraction is another option. (Although syntactically somewhat nasty to use, perhaps; would take multiple lines of user code to fully destructure that, which is notably burdensome.)
I haven't thought on this for long enough to form a preference between those two either way yet.
The other concern to balance is probably binary and generated-source (or boilerplate, if a node is handwritten) size for the approach we take. Assuming we add these features to the central ipld.Node
interface (or one of the other interfaces like ipld.MapIterator
that's also directly connected to ipld.Node
), we force the need to implement this on every Node
implementation everywhere.
To me, the advantage of keeping a single iterator instead of splitting into KeyIterator
and ValueIterator
is that at each iteration one can decide what nodes need to be obtained. For example, if I only need to obtain the value node for a small subset of the keys.
If one needs both keys and values, it's also unclear how one would use KeyIterator
and ValueIterator
at once.
Well to me, a major advantage of special KeyIterator()
and ValueIterator()
is that they become interchangeable with the iterator obtained from a List type. Which should introduce some interesting consumption interface patterns. Having a weird "only realised if you touch it" model means that everything in here is special-case.