OpenAdapt
Design: Fine-Tuning
Feature request
We would like to implement fine-tuning.
This task involves considering the tradeoffs between various approaches to improving action completions and outcome evaluation via fine-tuning.
More generally, this also involves:
- Creating a training set
- Fine-tuning on that training set
- Comparing the results
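The comparison step could be as simple as scoring the base and fine-tuned models on the same held-out examples. A minimal sketch, assuming a model is any callable from prompt to completion and using an exact-match metric (`completion_accuracy` and the stand-in models are hypothetical, not OpenAdapt code):

```python
from typing import Callable, Iterable

def completion_accuracy(
    model: Callable[[str], str],
    examples: Iterable[tuple[str, str]],
) -> float:
    """Fraction of held-out (prompt, expected_completion) pairs the model gets exactly right."""
    examples = list(examples)
    correct = sum(1 for prompt, expected in examples if model(prompt) == expected)
    return correct / len(examples) if examples else 0.0

# Compare a base model against its fine-tuned counterpart on the same held-out set.
held_out = [("click button", "CLICK"), ("type hello", "TYPE")]
base = lambda prompt: "CLICK"                      # stand-in for the base model
tuned = lambda prompt: prompt.split()[0].upper()   # stand-in for the fine-tuned model
print(completion_accuracy(base, held_out))   # 0.5
print(completion_accuracy(tuned, held_out))  # 1.0
```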
Motivation
https://arxiv.org/abs/2406.03679
Autonomous agents that control computer interfaces to accomplish human tasks are emerging. Leveraging LLMs to power such agents has been of special interest, but unless fine-tuned on human-collected task demonstrations, performance is still relatively low.
Related
https://github.com/MLDSAI/OpenAdapt/issues/70 https://github.com/MLDSAI/OpenAdapt/issues/72 https://github.com/OpenAdaptAI/OpenAdapt/issues/415 https://github.com/OpenAdaptAI/OpenAdapt/issues/748
Bounty
A paid bounty is available. Please suggest a price range 🙏
Currently iterating on this issue through #327 by identifying failure cases while testing various event sequences. To that end, current action items include:
- Researching fine-tuning of LLMs in general
- Writing a fine-tuning pipeline for GPT-4 for Events
- Generalizing the pipeline to arbitrary LLMs, with the only model-specific parts being the API calls (HuggingFace, etc.)
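One way to generalize the pipeline is to keep data preparation shared and isolate the model-specific API calls behind a single abstract method. A minimal sketch, assuming hypothetical class names (not OpenAdapt's actual API):

```python
from abc import ABC, abstractmethod

class FineTuner(ABC):
    """Model-agnostic fine-tuning pipeline: shared data prep, model-specific submission."""

    def run(self, events: list[dict]) -> str:
        dataset = self.prepare(events)
        return self.submit(dataset)

    def prepare(self, events: list[dict]) -> list[dict]:
        # Shared step: serialize event dicts into (prompt, completion) pairs.
        return [
            {"prompt": str(e["window"]), "completion": str(e["action"])}
            for e in events
        ]

    @abstractmethod
    def submit(self, dataset: list[dict]) -> str:
        """Model-specific API call (OpenAI, HuggingFace, ...)."""

class DummyFineTuner(FineTuner):
    """Stand-in backend, so the shared pipeline can be exercised without any API."""
    def submit(self, dataset: list[dict]) -> str:
        return f"job with {len(dataset)} examples"

print(DummyFineTuner().run([{"window": {"title": "App"}, "action": {"name": "click"}}]))
# → job with 1 examples
```

Swapping in a real backend then only means implementing `submit` for that provider.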
https://medium.com/@jeremyarancio/fine-tune-an-llm-on-your-personal-data-create-a-the-lord-of-the-rings-storyteller-6826dd614fa9
Useful article; it goes over training as well as techniques like quantization and LoRA. Pretty educational for getting an idea of what fine-tuning an LLM looks like.
Some immediate action items may include:
- Working more closely with Mind2Web's codebase once they release the fine-tuning code. I suggest this because the training in the article above seems like a black box to me, i.e. it's not clear how/where the LLM is shown the right answer to a given input when generating a completion.
- Dataset of Window and Action Events. We can distill from our recordings and pool the results to create a dataset of these event dicts for training, validation, and testing.
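Distilling the recordings could look something like the sketch below: shuffle the pooled event dicts, split them into train/val/test, and write each split as JSONL (one JSON object per line, the format most fine-tuning APIs expect). Function names and split ratios are illustrative assumptions:

```python
import json
import random

def make_splits(event_dicts, train=0.8, val=0.1, seed=42):
    """Shuffle event dicts deterministically and split into train/val/test portions."""
    events = list(event_dicts)
    random.Random(seed).shuffle(events)
    n_train = int(len(events) * train)
    n_val = int(len(events) * val)
    return (
        events[:n_train],
        events[n_train:n_train + n_val],
        events[n_train + n_val:],
    )

def write_jsonl(path, events):
    """One JSON object per line."""
    with open(path, "w") as f:
        for e in events:
            f.write(json.dumps(e) + "\n")

events = [{"window": {"title": f"w{i}"}, "action": {"name": "click"}} for i in range(10)]
train_set, val_set, test_set = make_splits(events)
print(len(train_set), len(val_set), len(test_set))  # 8 1 1
```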
I think we want something like:

```
python -m openadapt.finetune --recording_id <recording_id> --model <model_name>
```
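The entry point for that invocation could be a thin `argparse` wrapper. A sketch, assuming a hypothetical `openadapt.finetune` module with just these two flags:

```python
import argparse

def parse_args(argv=None):
    """CLI surface for a hypothetical `openadapt.finetune` module."""
    parser = argparse.ArgumentParser(prog="openadapt.finetune")
    parser.add_argument("--recording_id", type=int, required=True,
                        help="ID of the recording to distill into a training set")
    parser.add_argument("--model", default="davinci",
                        help="base model to fine-tune")
    return parser.parse_args(argv)

args = parse_args(["--recording_id", "7", "--model", "davinci"])
print(args.recording_id, args.model)  # 7 davinci
```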
https://platform.openai.com/docs/guides/fine-tuning If you scroll down a little, you can see that neither GPT-4 nor GPT-3.5-turbo is available for fine-tuning at the moment 😞 We could use the davinci base model, although I'm now curious which model Mind2Web does its fine-tuning on 🤔
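Fine-tuning a completions-style base model like davinci takes prompt/completion pairs in JSONL. A sketch of formatting one event dict into that shape; the separator and stop-sequence conventions follow OpenAI's legacy fine-tuning guidance (fixed separator at the end of each prompt, completion starting with a space and ending with a stop sequence), and the helper name is hypothetical:

```python
import json

SEPARATOR = "\n\n###\n\n"  # fixed boundary marker appended to each prompt
STOP = " END"              # fixed stop sequence appended to each completion

def to_finetune_record(event: dict) -> str:
    """Format one event dict as a legacy prompt/completion fine-tuning record (one JSONL line)."""
    return json.dumps({
        "prompt": json.dumps(event["window"]) + SEPARATOR,
        "completion": " " + json.dumps(event["action"]) + STOP,
    })

record = to_finetune_record(
    {"window": {"title": "Calculator"}, "action": {"name": "click", "x": 10, "y": 20}}
)
print(record)
```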
@bi-loop any interest? 🙏