unitxt
Demos sampling per instance is not consistent for shuffled stream
Currently, the demos for each instance are sampled based on the initial seed and the instance's position in the stream, instead of based solely on the content of the instance. This has the benefit that when an instance changes it still gets the same demos, but if you run a subset of the stream, or run the stream in a different order, you may get different demos for every instance.
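To make the alternative concrete, here is a minimal sketch of content-based sampling, assuming instances are plain JSON-serializable dicts. It is not unitxt code: `content_seed`, `sample_demos`, and the `base_seed` parameter are hypothetical names used only for illustration.

```python
import hashlib
import json
import random


def content_seed(instance: dict, base_seed: int = 0) -> int:
    """Derive a seed from the instance content only (hypothetical helper)."""
    # sort_keys makes the serialization independent of dict key order.
    payload = json.dumps(instance, sort_keys=True, default=str)
    digest = hashlib.sha256(f"{base_seed}:{payload}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def sample_demos(instance: dict, demo_pool: list, k: int, base_seed: int = 0) -> list:
    """Sample k demos with an RNG seeded by the instance content, not its position."""
    rng = random.Random(content_seed(instance, base_seed))
    return rng.sample(demo_pool, k)


demo_pool = [{"question": f"q{i}", "answer": f"a{i}"} for i in range(20)]
stream = [{"question": f"test question {i}"} for i in range(5)]

# Same instances, two different orders.
in_order = {inst["question"]: sample_demos(inst, demo_pool, 3) for inst in stream}
reordered = {inst["question"]: sample_demos(inst, demo_pool, 3) for inst in reversed(stream)}
```

With this scheme, shuffling the stream or running a subset of it does not change the demos an instance receives. The trade-off is the one described above: any change to the instance content changes its seed, and therefore its demos.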
If the instance is allowed to change, then what is it? What is the entity that is entitled to always receive the same demos? https://en.wikipedia.org/wiki/Ship_of_Theseus Perhaps we may want to add a unique id to each instance when it begins its journey through the recipe (right after loading)?
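A rough sketch of the id idea, under the assumption that the load order of the source data is itself stable. The field name `instance_id` and the helpers `assign_ids` and `sample_demos_by_id` are hypothetical and not part of unitxt.

```python
import hashlib
import random

ID_FIELD = "instance_id"  # hypothetical field name, not an existing unitxt field


def assign_ids(instances, prefix="test"):
    """Stamp each instance with an id right after loading (assumes stable load order)."""
    for index, instance in enumerate(instances):
        yield {**instance, ID_FIELD: f"{prefix}-{index}"}


def sample_demos_by_id(instance, demo_pool, k, base_seed=0):
    """Seed the sampler from the id only, so later content changes have no effect."""
    key = f"{base_seed}:{instance[ID_FIELD]}".encode("utf-8")
    seed = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    return random.Random(seed).sample(demo_pool, k)


demo_pool = [f"demo {i}" for i in range(20)]
loaded = list(assign_ids([{"text": "a"}, {"text": "b"}, {"text": "c"}]))

before = sample_demos_by_id(loaded[1], demo_pool, 3)
# The instance is modified later in the pipeline but keeps its id.
changed = {**loaded[1], "text": "B, after preprocessing"}
after = sample_demos_by_id(changed, demo_pool, 3)
```

Here the id, rather than the content or the position in the stream, is the entity that receives the demos: the instance can be modified, shuffled, or selected into a subset and still get the same sample, as long as the id is assigned before any of that happens.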