distilabel
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
**Describe the bug** When you have a pipeline like the following:

```python
with Pipeline(name="pipe") as pipe:
    ...
    chat_generations = [
        ChatGeneration(
            llm=InferenceEndpointsLLM(model_id=model_id),
            input_mappings={"messages": "conversation"},
            input_batch_size=20,
        )
        for model_id in (...
```
**Is your feature request related to a problem? Please describe.** We want to include some basic telemetry to understand the usage of distilabel and shoot that to the hub. **Describe...
**Is your feature request related to a problem? Please describe.** If we have a pipeline `a >> b >> c` and we create another `a >> b >> c >>...
I saw a PR where there was support for VertexAIEndpointLLM. But I can't find it in the latest version. Could you please add support for VertexAIEndpointLLM (which can enable users...
**Is your feature request related to a problem? Please describe.** A clear and concise description of what the problem is. Ex. I'm always frustrated when [...] **Describe the solution you'd...
**Is your feature request related to a problem? Please describe.** Exploring pipeline through dict/yaml can be difficult. **Describe the solution you'd like** I would perhaps like to see something like...
### [WIP] Adds 01ai Client to distilabel
01ai recently released its API, which serves `Yi-large`, a great model for distilabel purposes.
### Open Tasks
- ~~[...
**Is your feature request related to a problem? Please describe.** I think column operations for distilabel are not always very intuitive. It might be easier for a user to pass...
## Which page or section is this issue related to?
A lot of them.
## What are you documenting, or what change are you making in the documentation?
Not all...
**Is your feature request related to a problem? Please describe.** I want to generate question answering data. **Describe the solution you'd like** The idea is to adapt https://distilabel.argilla.io/1.2.4/components-gallery/tasks/generatetextclassificationdata/ to token...