
[FEATURE] Do not pass rows that contain `Step.inputs` with `None` values

Open gabrielmbmb opened this issue 1 year ago • 0 comments

Is your feature request related to a problem? Please describe.

Let's say that I have a pipeline like:

TextGeneration() >> ProcessGeneration()

`TextGeneration` will add a `generation` key to the output dictionary, but if the LLM fails, `generation` will contain `None`. `ProcessGeneration.inputs` are `["generation"]`, but as `TextGeneration` can produce `None` values, `ProcessGeneration` will have to handle `None` values. The same applies to every `Step`, which is a bit cumbersome.

Describe the solution you'd like

`_StepWrapper` could check the input columns of the `Step` (`Step.inputs`) and verify that each input dictionary contains non-`None` values for these keys. If a dictionary has a `None` value for any of them, it is not passed to the `Step` and its outputs are imputed instead.
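A minimal sketch of the idea in plain Python, not distilabel's actual implementation. The helper names (`partition_rows`, `impute_outputs`, `run_step`) are hypothetical, and "imputed" is assumed here to mean that the skipped rows get their output columns filled with `None`:

```python
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]


def partition_rows(rows: List[Row], inputs: List[str]) -> Tuple[List[Row], List[Row]]:
    """Split rows into those safe to pass to the step and those to skip."""
    valid, skipped = [], []
    for row in rows:
        # A missing key is treated the same as an explicit None (assumption).
        if any(row.get(key) is None for key in inputs):
            skipped.append(row)
        else:
            valid.append(row)
    return valid, skipped


def impute_outputs(row: Row, outputs: List[str]) -> Row:
    """Return a copy of the row with every output column set to None."""
    return {**row, **{key: None for key in outputs}}


def run_step(
    process: Callable[[List[Row]], List[Row]],
    rows: List[Row],
    inputs: List[str],
    outputs: List[str],
) -> List[Row]:
    """Call `process` only on rows whose inputs are all non-None."""
    valid, skipped = partition_rows(rows, inputs)
    processed = process(valid) if valid else []
    # Note: skipped rows are appended at the end, so order is not preserved.
    return processed + [impute_outputs(row, outputs) for row in skipped]
```

With this in place, a step such as `ProcessGeneration` could assume `row["generation"]` is never `None`, because rows from a failed generation would never reach it.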

gabrielmbmb • Oct 12 '24 18:10