distilabel
[FEATURE] Add more flexibility to generate structured data from multiple schemas
Is your feature request related to a problem? Please describe.
The implementation of structured outputs in #601 only allows a single structure (a JSON schema or a `BaseModel`, in the JSON case) for the whole dataset. We should relax this restriction.
Describe the solution you'd like
I would like a field in `structured_output` that selects a column of the dataset from which the schema is read, so that each row can supply its own schema. This would give us more flexibility when generating datasets of structured data.
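To make the request concrete, here is a minimal sketch of the per-row lookup in plain Python. It is not distilabel code: the `resolve_schema` helper and the `schema_column` key are hypothetical names used only for illustration, and the `format`/`schema` keys are assumed to mirror the configuration introduced in #601.

```python
from typing import Any, Dict, Optional

Schema = Dict[str, Any]


def resolve_schema(
    row: Dict[str, Any], structured_output: Dict[str, Any]
) -> Optional[Schema]:
    """Pick the schema for one row.

    Uses the row's own schema when the configured column is present,
    and falls back to the dataset-wide schema otherwise.
    """
    # "schema_column" is a hypothetical key, not an existing distilabel option.
    column = structured_output.get("schema_column")
    if column is not None and row.get(column) is not None:
        return row[column]
    # Fallback: the single dataset-wide schema (assumed key name).
    return structured_output.get("schema")


person_schema: Schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}},
    "required": ["name"],
}
city_schema: Schema = {
    "type": "object",
    "properties": {"city": {"type": "string"}, "population": {"type": "integer"}},
    "required": ["city"],
}
default_schema: Schema = {"type": "object"}

# Illustrative configuration: one global schema plus the name of a column
# that may override it on a per-row basis.
structured_output = {
    "format": "json",
    "schema": default_schema,
    "schema_column": "schema",
}

rows = [
    {"instruction": "Describe a person.", "schema": person_schema},
    {"instruction": "Describe a city.", "schema": city_schema},
    {"instruction": "Describe anything."},  # no per-row schema, uses the default
]

resolved = [resolve_schema(row, structured_output) for row in rows]
```

Keeping the dataset-wide schema as a fallback would leave the current single-schema behaviour unchanged for datasets that do not provide the column.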
Describe alternatives you've considered
Additional context
Work after merging #601.