distilabel
[FEATURE] Add `Callable` and `GlobalCallable` steps that take a custom `callable` as an argument
Is your feature request related to a problem? Please describe.
I think column operations in distilabel are not always very intuitive. It might be easier for a user to pass a self-defined callable instead of defining a CustomStep, which feels more cumbersome.
Describe the solution you'd like
from distilabel.steps import Callable

def my_function(sample: dict):
    del sample["key"]
    sample["c"] = sample["a"] + sample["b"]
    return sample

Callable(
    name="callable",
    fn=my_function,
    # Assuming something like this is needed for validation
    inputs=["key", "a", "b"],  # defaults to all
    outputs=["c"],
)
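To make the request more concrete, here is a minimal pure-Python sketch of the behaviour I have in mind. It does not use any distilabel internals: `CallableStepSketch`, its `process` method, and the `KeyError` validation are all hypothetical names and choices made for illustration, not existing API.

```python
from typing import Any, Callable, Dict, List, Optional

Row = Dict[str, Any]


class CallableStepSketch:
    """Hypothetical stand-in for the proposed `Callable` step (not distilabel API)."""

    def __init__(
        self,
        name: str,
        fn: Callable[[Row], Row],
        inputs: Optional[List[str]] = None,  # None means "accept all columns"
        outputs: Optional[List[str]] = None,
    ) -> None:
        self.name = name
        self.fn = fn
        self.inputs = inputs
        self.outputs = outputs or []

    def process(self, batch: List[Row]) -> List[Row]:
        results = []
        for row in batch:
            # Check the declared input columns before calling the user function.
            missing = [col for col in (self.inputs or []) if col not in row]
            if missing:
                raise KeyError(f"{self.name}: missing input columns {missing}")
            # Pass a shallow copy so `fn` can delete or add keys freely.
            out = self.fn(dict(row))
            # Check that the declared output columns were actually produced.
            absent = [col for col in self.outputs if col not in out]
            if absent:
                raise KeyError(f"{self.name}: missing output columns {absent}")
            results.append(out)
        return results


def my_function(sample: Row) -> Row:
    del sample["key"]
    sample["c"] = sample["a"] + sample["b"]
    return sample


step_sketch = CallableStepSketch(
    name="callable", fn=my_function, inputs=["key", "a", "b"], outputs=["c"]
)
rows = step_sketch.process([{"key": "x", "a": 1, "b": 2}])
```

A `GlobalCallable` counterpart would presumably pass the whole dataset to `fn` in one call rather than one row at a time, but that part is left open here.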
Versus the existing options:
from distilabel.steps import step
from distilabel.steps.typing import GeneratorStepOutput

@step(outputs=[...], step_type="generator")
def CustomGeneratorStep(offset: int = 0) -> GeneratorStepOutput:
    yield (
        ...,
        True if offset == 10 else False,
    )

step = CustomGeneratorStep(name="my-step")
Describe alternatives you've considered
Custom Steps and Tasks
Additional context
https://discord.com/channels/879548962464493619/1217729625401196574/1265730218505539686