agent-lightning
How can I use Agent Lightning for fine-tuning an agent’s system prompt using OpenAI models (gpt-4o)? Can VERL be used for this?
I am trying to fine-tune an existing agent using Agent Lightning, specifically its system prompt (agent behavior).
My requirements:
- I must use OpenAI models only, such as gpt-4o base.
- I want to use VERL to optimize or update the agent’s prompt/behavior.
- I prefer a minimal, validated, single-file example (or as simple as possible).
My Questions:
- Does Agent Lightning support fine-tuning or behavioral optimization using VERL when the underlying LLM is an OpenAI model?
- Is VERL compatible with Agent Lightning for updating prompts or performing reward-based optimization on an OpenAI-powered agent?
- Has anyone used VERL for prompt optimization of the LLM, and can you provide a validated minimal example demonstrating how to integrate VERL + Agent Lightning + OpenAI (gpt-4o) for fine-tuning an agent's system prompt?
- Fine-tuning / behavior optimization of an OpenAI model: yes. With VERL: no.
- VERL is an RL framework for local models. You can't use VERL for prompt tuning, to the best of my knowledge.
- It's not possible.
Related references:
- Azure OpenAI Fine-tuning (with SFT) with Agent-lightning: https://github.com/microsoft/agent-lightning/tree/main/examples/azure
- Introduction to VERL: https://github.com/microsoft/agent-lightning/tree/main/examples/azure
- Prompt tuning examples (naturally work with GPT models): https://github.com/microsoft/agent-lightning/tree/main/examples/apo
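
To illustrate why prompt tuning fits API-only models while VERL does not: prompt tuning only needs text in, text out, and a reward, whereas VERL has to update the weights of a model you host yourself. The snippet below is a minimal sketch of that idea, not Agent Lightning's API and not the algorithm used in the `examples/apo` directory. Every name in it (`fake_agent`, `reward`, `score_prompt`, `select_best_prompt`, the candidate prompts and tasks) is hypothetical, and `fake_agent` is a local stand-in for a call to a hosted model such as gpt-4o.

```python
from typing import Callable, List, Tuple

Agent = Callable[[str, str], str]


def fake_agent(system_prompt: str, task: str) -> str:
    """Hypothetical stand-in for a hosted model call; no network access."""
    # Pretend the model answers tersely only when the prompt asks for it.
    if "concise" in system_prompt.lower():
        return task.upper()
    return task.upper() + " ... plus some extra rambling"


def reward(task: str, output: str) -> float:
    """Hypothetical reward: 1.0 if the output is exactly the expected answer."""
    return 1.0 if output == task.upper() else 0.0


def score_prompt(prompt: str, tasks: List[str], agent: Agent) -> float:
    """Average reward of one system prompt over all tasks."""
    return sum(reward(t, agent(prompt, t)) for t in tasks) / len(tasks)


def select_best_prompt(
    candidates: List[str], tasks: List[str], agent: Agent
) -> Tuple[str, float]:
    """Return the candidate prompt with the highest average reward."""
    scored = [(p, score_prompt(p, tasks, agent)) for p in candidates]
    return max(scored, key=lambda pair: pair[1])


# Illustrative data only.
CANDIDATES = [
    "You are a helpful assistant.",
    "You are a helpful assistant. Be concise.",
]
TASKS = ["hello", "world"]

best_prompt, best_score = select_best_prompt(CANDIDATES, TASKS, fake_agent)
print(best_prompt, best_score)
```

The model is treated as a black box throughout: only the system prompt changes, so nothing here requires access to weights or gradients. A real setup would replace `fake_agent` with an actual model call and generate new candidate prompts from the rollout feedback instead of using a fixed list.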