[FEATURE] Add more flexibility to generate structured data from multiple schemas

Open plaguss opened this issue 9 months ago • 0 comments

Is your feature request related to a problem? Please describe.
The structured outputs implementation in #601 allows only a single structure (a JSON schema, or a BaseModel in the JSON case) for the whole dataset. We should relax this restriction.

Describe the solution you'd like
I would like a field in structured_output that names a column of the dataset from which the schema is read, so that each row can carry its own schema. This would give us more flexibility when generating datasets of structured data.
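To make the request concrete, here is a minimal sketch of the per-row lookup in plain Python. It does not use distilabel itself: the `schema_column` key, the `resolve_schema` helper, and the shape of the `structured_output` dict are all assumptions made for illustration, not the existing API.

```python
import json
from typing import Any, Dict, Optional

# Hypothetical key names; neither is confirmed as part of distilabel's API.
SCHEMA_COLUMN_KEY = "schema_column"
DEFAULT_SCHEMA_KEY = "schema"


def resolve_schema(
    row: Dict[str, Any], structured_output: Dict[str, Any]
) -> Optional[Dict[str, Any]]:
    """Return the schema for one row.

    Prefer the schema stored in the column named by `schema_column`;
    fall back to the single dataset-wide schema otherwise.
    """
    column = structured_output.get(SCHEMA_COLUMN_KEY)
    if column is not None and row.get(column) is not None:
        raw = row[column]
        # Accept either a JSON string or an already-parsed dict.
        return json.loads(raw) if isinstance(raw, str) else raw
    return structured_output.get(DEFAULT_SCHEMA_KEY)


person_schema = {"type": "object", "properties": {"name": {"type": "string"}}}
city_schema = {"type": "object", "properties": {"population": {"type": "integer"}}}

# Each row may carry its own schema, as a dict or as a JSON string.
rows = [
    {"instruction": "Describe a person.", "schema": person_schema},
    {"instruction": "Describe a city.", "schema": json.dumps(city_schema)},
    {"instruction": "Describe anything."},  # no per-row schema
]

# Assumed configuration shape: a column name plus a dataset-wide fallback.
structured_output = {
    "format": "json",
    "schema_column": "schema",
    "schema": {"type": "object"},
}

resolved = [resolve_schema(row, structured_output) for row in rows]
```

Keeping the dataset-wide schema as a fallback would preserve the current single-schema behaviour for rows that do not provide their own.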

Additional context
Work on this after merging #601.

plaguss, May 10 '24 13:05