
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.

Results 168 distilabel issues

When I run the following code on my M1 MacBook Pro (macOS 14.4.1 (23E224)) ```python from distilabel.pipeline import Pipeline from distilabel.llms.llamacpp import LlamaCppLLM from distilabel.steps import LoadDataFromDicts from distilabel.steps.tasks...

dependencies

## Description This PR adds the following steps in order to format the batches into what the main fine-tuning frameworks / libraries (i.e. `axolotl` and `alignment-handbook`) expect for both DPO...

enhancement

Create a signal handler that captures `SIGINT` (ctrl + c) and stops the pipeline gracefully.

enhancement
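
The `SIGINT` request above can be illustrated with a minimal sketch using only the Python standard library. This is not distilabel's implementation: `stop_event`, `handle_sigint`, and `run_batches` are hypothetical names, and the batch loop stands in for whatever the pipeline actually runs. The idea is that the handler only records the stop request, and the loop checks for it between batches so in-flight work can finish.

```python
import signal
import threading

# Hypothetical flag shared between the signal handler and the processing loop.
stop_event = threading.Event()


def handle_sigint(signum, frame):
    """Record the stop request instead of letting KeyboardInterrupt propagate."""
    stop_event.set()


def run_batches(batches, process):
    """Process batches in order, stopping cleanly once a stop was requested."""
    results = []
    for batch in batches:
        if stop_event.is_set():
            # Stop between batches so no batch is abandoned halfway through.
            break
        results.append(process(batch))
    return results


if __name__ == "__main__":
    # Signal handlers can only be installed from the main thread.
    previous_handler = signal.signal(signal.SIGINT, handle_sigint)
    try:
        run_batches([[1, 2], [3, 4]], process=sum)
    finally:
        # Restore whatever handler was installed before.
        signal.signal(signal.SIGINT, previous_handler)
```

A real pipeline with multiple processes would also need to propagate the stop request to its workers, which this single-process sketch does not attempt.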

## Description This PR adds a new field `distilabel_meta` to store general outputs related to distilabel. Currently we will have `distilabel_id` with a UUID, and in case of `Tasks` that...

enhancement
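
As a rough illustration of the `distilabel_meta` field described above, the sketch below attaches a UUID under `distilabel_id` to a row. It assumes rows are plain dictionaries and that the metadata is stored as a nested dictionary; both are assumptions, and `add_distilabel_meta` is a hypothetical helper, not part of the distilabel API.

```python
import uuid


def add_distilabel_meta(row):
    """Return a copy of `row` with a `distilabel_meta` dict holding a UUID."""
    # Copy any existing metadata so the input row is not mutated.
    meta = dict(row.get("distilabel_meta", {}))
    # Keep an existing id if present, so repeated calls are idempotent.
    meta.setdefault("distilabel_id", str(uuid.uuid4()))
    return {**row, "distilabel_meta": meta}
```

Calling the helper twice on the same row keeps the first identifier, which makes it safe to apply at more than one point in a pipeline.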

## Description As mentioned by @alvarobartt and Ellamind team, it would be nice to have a sequential model for executing the pipeline, in which no multiprocessing & batching is used....

enhancement

**Is your feature request related to a problem? Please describe.** Currently, the caching system only works for full batches and doesn't seem to work across code changes. I think it...

**Is your feature request related to a problem? Please describe.** I'm testing a preference pipeline with Llama3 and the parsed outputs are weird (long lists of ratings and rationales when...

enhancement

## Description Some steps, especially the `Task` subclasses that run a local `LLM`, tend to consume a lot of memory, which should be released once the step is completed, as otherwise...

improvement

## Description Currently `distilabel` can only be executed locally, either by running a Python script with the pipeline or by using the CLI. For future integrations with Argilla UI or deploying `distilabel`...

enhancement

## Description Currently, the pipelines that can be executed directly from the CLI or loaded with `Pipeline.from_yaml` are those that include only steps from the `distilabel` package. The idea is...

enhancement