DataSets.jl

Iteration of `BlobTree` — `pairs` or `values`? `basename`?

Open c42f opened this issue 2 years ago • 0 comments

`BlobTree` currently iterates over values, for much the same reasons that Dictionaries.jl does (broadcast support, etc.). This may seem inconvenient in cases where the name is important. However, each `Blob` or `BlobTree` element yielded by the iteration also knows its own name via `basename`, so names can be extracted where necessary. To resolve this, it's probably best to translate some examples of data processing code to the `BlobTree` API and see whether `pairs()` or `values()` is the more natural default for iteration.
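To make the trade-off concrete, here is a minimal sketch using toy stand-ins rather than the real DataSets.jl types. `ToyBlob`, `ToyTree`, and their fields are hypothetical names invented for illustration; only the general idea (values-iteration plus a `basename` method on each element) comes from the discussion above.

```julia
# Toy stand-ins, NOT the DataSets.jl types: each value remembers its own name.
struct ToyBlob
    name::String
    data::Vector{UInt8}
end

struct ToyTree
    children::Vector{ToyBlob}
end

# Each value knows its own key, mirroring the role `basename` plays for blobs.
Base.basename(b::ToyBlob) = b.name

# Iterate over values (the Dictionaries.jl-style convention).
Base.iterate(t::ToyTree, state...) = iterate(t.children, state...)
Base.length(t::ToyTree) = length(t.children)
Base.eltype(::Type{ToyTree}) = ToyBlob

# Name => value pairs can still be derived when the name matters.
Base.pairs(t::ToyTree) = (basename(b) => b for b in t.children)

tree = ToyTree([ToyBlob("a.csv", UInt8[1, 2, 3]), ToyBlob("b.csv", UInt8[4])])

# Values iteration: no destructuring needed when only the data matters.
sizes = [length(b.data) for b in tree]

# Names remain reachable from the values themselves.
blob_names = [basename(b) for b in tree]

# Pairs iteration: names are explicit, but every element must be destructured.
sizes_by_name = Dict(name => length(b.data) for (name, b) in pairs(tree))
```

In this toy model both styles carry the same information, so the choice of default is mostly about which loop reads more naturally in real data processing code.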

As a side note, having values know their own keys via `basename` is quite an oddity for a dictionary-like data structure. Is this a problem in itself? Generally, a `Blob` is only a lazy reference to data held outside Julia. An individual `Blob` object presumably won't be an element of two different `BlobTree`s: adding it to a second tree would need to cause a copy in the background. Perhaps this isn't so bad, then.

c42f, Apr 28 '22 07:04