distilabel
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
## Which page or section is this issue related to? Something like https://docs.haystack.deepset.ai/docs/pipelines might be better. ## What are you documenting, or what change are you making in the documentation?
**Is your feature request related to a problem? Please describe.** A clear and concise description of what the problem is. Ex. I'm always frustrated when [...] Sometimes, pipelines don't fully...
**Is your feature request related to a problem? Please describe.** Add support for [serverless Nvidia NIM API](https://huggingface.co/blog/inference-dgx-cloud). **Describe the solution you'd like** As suggested, it will require the following: >...
**Is your feature request related to a problem? Please describe.** If we have an error in one of the records when running an argilla step, (i.e. `PreferenceToArgilla`), nothing will be...
Integrate `codecarbon` so that we can get an estimate of how many emissions the creation of a dataset has generated.
**Describe the bug** When a loader step is created using `make_generator_step` and something fails, we cannot control it right now. A case that's happened is the code trying to load...
Wrote the following according to the example at https://distilabel.argilla.io/latest/sections/how_to_guides/advanced/serving_an_llm_for_reuse/#serving-llms-using-vllm:
```
from distilabel.llms import OpenAILLM
from distilabel.pipeline import Pipeline
from distilabel.steps import LoadDataFromDicts
from distilabel.steps.tasks import TextGeneration, UltraFeedback

with Pipeline(name="serving-llm") as...
```
**Is your feature request related to a problem? Please describe.** To use 8 GPUs and two nodes for `vLLM` right now we need to define `tensor_parallel_size` and `pipeline_parallel_size` in class...