Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.

Results: 168 distilabel issues

## Description In `distilabel` v1.0.0 we included the task `TextGeneration`, and at the last moment we decided to include support for an already formatted chat-like object, i.e. a list...

deprecation

**Is your feature request related to a problem? Please describe.** The implementation of the structured outputs in #601 only allows for a single structure (JSON schema, `BaseModel` in case of...

enhancement

## Description Provide a `distilabel` Docker image in which a pipeline can be executed. This is useful for executing `distilabel` pipelines on cloud providers with serverless solutions. It...

enhancement

Just wanted to know how to use the Mistral API; I'm a newbie.

**Is your feature request related to a problem? Please describe.** Providing runtime parameters using option `--param` of `distilabel run` can be cumbersome. **Describe the solution you'd like** Add a `--runtime-parameters-path`...

enhancement
good first issue

Closes #890 ![image](https://github.com/user-attachments/assets/30220987-6e3a-4599-8b49-95ef8c82869f)

documentation

**Describe the bug** I want to use the UltraFeedback task in a pipeline, but I already have the dataset, so the pipeline only includes loading the dataset and then passing it...

## Description This PR implements cache at step level. Previously, we computed a signature for a pipeline, and when this signature changed, we recomputed everything. Now the idea is to...

improvement
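
The difference between a pipeline-level signature and step-level signatures can be illustrated with a minimal sketch. This is not `distilabel`'s implementation: the helper names `step_signature` and `steps_to_rerun`, the step names, and the parameters are all hypothetical, and the sketch assumes a step's signature depends on its own parameters plus the signatures of its upstream steps.

```python
import hashlib
import json


def step_signature(name: str, params: dict, upstream: list[str]) -> str:
    """Hash a step's name, its parameters, and its upstream signatures (hypothetical helper)."""
    payload = json.dumps(
        {"name": name, "params": params, "upstream": sorted(upstream)},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def steps_to_rerun(old: dict[str, str], new: dict[str, str]) -> set[str]:
    """Return the steps whose signature is absent from, or differs in, the old cache."""
    return {step for step, sig in new.items() if old.get(step) != sig}


def signatures(gen_params: dict) -> dict[str, str]:
    """Build signatures for an illustrative two-step pipeline: load -> generate."""
    load = step_signature("load", {"path": "data.jsonl"}, [])
    gen = step_signature("generate", gen_params, [load])
    return {"load": load, "generate": gen}


before = signatures({"temperature": 0.7})
after = signatures({"temperature": 0.9})

# Only the step whose parameters changed needs recomputing;
# the unchanged "load" step can be served from cache.
print(steps_to_rerun(before, after))  # {'generate'}
```

With a single pipeline-wide signature, the same parameter change would invalidate both steps.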

## Description ⚠️ Work in progress This PR improves the `FormatTextGenerationSFT` task to allow preparing fine-tuning datasets with function calling.

documentation
enhancement