Allow selection of demonstrations in ICL based on similarity to current instance

Open yoavkatz opened this issue 1 year ago • 0 comments

Currently demo examples are selected from the demo pool (which is typically extracted from the train set).

By default, the demos for each instance are selected at random; more complex strategies are also available (e.g. selecting demos with diverse labels).

This is done via a Sampler object (for example: sampler=DiverseLabelsSampler(choices="class", labels="label")).

We have received multiple requests to allow selection of demos based on similarity to the current instance.

This means the Sampler.sample method should also receive the current instance:

    @abstractmethod
    def sample(
        self,
        instances_pool: List[Dict[str, object]],
        current_instance: Dict[str, object],
    ) -> List[Dict[str, object]]:
        pass

Then we can create different samplers, such as EditDistanceSimilaritySampler and SentenceBertSimilaritySampler, that would return the instances most similar to the current instance.
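As a rough illustration of the idea, here is a minimal, standalone sketch of an edit-distance-based sampler. It does not use the actual unitxt classes: the InstanceAwareSampler base class, the sample_size and field parameters, and the "text" field name are all assumptions made for this example, and current_instance is assumed to be a single instance dict rather than a list.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a rolling one-row DP table."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(
                min(
                    previous[j] + 1,  # delete char_a
                    current[j - 1] + 1,  # insert char_b
                    previous[j - 1] + (char_a != char_b),  # substitute
                )
            )
        previous = current
    return previous[-1]


class InstanceAwareSampler(ABC):
    """Hypothetical base class mirroring the signature proposed in this issue."""

    def __init__(self, sample_size: int):
        self.sample_size = sample_size

    @abstractmethod
    def sample(
        self,
        instances_pool: List[Dict[str, object]],
        current_instance: Dict[str, object],
    ) -> List[Dict[str, object]]:
        pass


class EditDistanceSimilaritySampler(InstanceAwareSampler):
    """Returns the pool instances whose text is closest to the current instance."""

    def __init__(self, sample_size: int, field: str = "text"):
        super().__init__(sample_size)
        self.field = field  # assumed name of the field to compare

    def sample(self, instances_pool, current_instance):
        query = str(current_instance[self.field])
        # sorted() is stable, so ties keep their original pool order.
        ranked = sorted(
            instances_pool,
            key=lambda inst: edit_distance(query, str(inst[self.field])),
        )
        return ranked[: self.sample_size]


if __name__ == "__main__":
    pool = [
        {"text": "the cat sat", "label": "animals"},
        {"text": "stock prices fell", "label": "finance"},
        {"text": "the cat sat down", "label": "animals"},
    ]
    sampler = EditDistanceSimilaritySampler(sample_size=2)
    demos = sampler.sample(pool, {"text": "the cat sits"})
    print([demo["text"] for demo in demos])
```

A SentenceBertSimilaritySampler could presumably follow the same shape, replacing the edit-distance key with a similarity score over sentence embeddings. Note that this sketch recomputes distances against the whole pool for every instance, which may be too slow for large demo pools.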

yoavkatz, Jul 18 '24 07:07