[FEATURE] Add `options` within `generation_kwargs` for `InferenceEndpointsLLM`

Open alvarobartt opened this issue 10 months ago • 0 comments

Is your feature request related to a problem? Please describe.

Inference Endpoints use the cache by default unless explicitly told otherwise, so we should add a flag to control that and disable it. Using the cache is discouraged whenever we use num_generations, and it makes no sense there, since all the generations will be identical.

Describe the solution you'd like

Align the existing generation kwargs in distilabel for InferenceEndpointsLLM with the ones offered by the huggingface_hub.InferenceClient for Inference Endpoints.

Additional context

See the {"options": {"use_cache": false}} payload option in their docs at https://huggingface.co/docs/api-inference/detailed_parameters#text-generation-task
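To make the request concrete, here is a minimal sketch of how `options` could be split out of `generation_kwargs` and placed next to `parameters` in the request body. The `build_payloads` helper is hypothetical and is not part of distilabel or huggingface_hub. The sketch also assumes that the cache should be turned off automatically when more than one generation is requested, which is only one possible design.

```python
import json
from typing import Any, Dict, List


def build_payloads(
    prompt: str,
    generation_kwargs: Dict[str, Any],
    num_generations: int = 1,
) -> List[Dict[str, Any]]:
    """Hypothetical helper: build one request body per generation."""
    # Copy so the caller's dict is not mutated.
    parameters = dict(generation_kwargs)
    # `options` carries request-level settings such as `use_cache`;
    # everything else is treated as a text-generation parameter.
    options = dict(parameters.pop("options", {}))
    if num_generations > 1:
        # Assumption: disable the cache unless the caller set it explicitly,
        # otherwise every generation would come back identical.
        options.setdefault("use_cache", False)
    payload: Dict[str, Any] = {"inputs": prompt, "parameters": parameters}
    if options:
        payload["options"] = options
    return [dict(payload) for _ in range(num_generations)]


payloads = build_payloads(
    "Hello",
    {"max_new_tokens": 64, "temperature": 0.7},
    num_generations=2,
)
print(json.dumps(payloads[0], indent=2))
```

A caller who wants the cache anyway could still pass `{"options": {"use_cache": True}}`, and the explicit value would be kept.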

alvarobartt • Apr 24 '24 10:04