openai-python icon indicating copy to clipboard operation
openai-python copied to clipboard

OpenAI CLI Tools for Chat Fine-Tuning

Open henriqueln7 opened this issue 10 months ago • 6 comments

Describe the feature or improvement you're requesting

Hello everyone,

When using legacy fine-tuning, I find the OpenAI CLI extremely helpful due to its numerous tools. For instance, the Prepare Data Helper and the Create Fine-Tuning are particularly useful.

However, these tools only apply to legacy models, which consist of JSON with prompt and completion keys.

I propose the addition of operations to the existing CLI that can perform the same functions for the new chat fine-tuning.

My Proposal

  • For the sake of backwards compatibility, we could create a new subcommand called chat_fine_tunes.
    • This subcommand would inherit all operations that fine_tunes can perform, such as assisting with data preparation, etc. We can simply replicate the existing operations with minor modifications to suit the new format.

Additional context

I am open to working on this feature if it is approved.

henriqueln7 avatar Sep 21 '23 21:09 henriqueln7

hello

mina6765 avatar Sep 22 '23 17:09 mina6765

Hi @henriqueln7 , do you remain interested in working on this? What interface would you propose?

rattrayalex avatar Nov 10 '23 03:11 rattrayalex

Hey, @rattrayalex. I indeed remain interested in working on this :)

I propose the creation of a new subcommand called chat_fine_tunes. It would function as follows:


# This subcommand would assist with `.json, .jsonl` files. The formats `.csv, .txt, .tsv, .xlsx` seem incompatible with this new format (I am open to suggestions here).
# The new subcommand will perform the same operations that already exist:
# - Checking for potential improvements (removing duplicates, verifying the presence of system messages)
# - Generating a `file_prepared.jsonl` file suitable for fine-tuning
openai tools chat_fine_tunes.prepare_data -f <LOCAL_FILE>

# Create a fine_tune job
openai api chat_fine_tunes.create -t <TRAIN_FILE_ID_OR_PATH> -m <BASE_MODEL>

# List existing fine-tunings
openai api chat_fine_tunes.list

# Retrieve the status of a fine-tuning job. The output includes
# the job status (which can be pending, running, succeeded, or failed),
# among other details.
openai api chat_fine_tunes.get -i <YOUR_FINE_TUNE_JOB_ID>

# Cancel a fine-tuning job
openai api chat_fine_tunes.cancel -i <YOUR_FINE_TUNE_JOB_ID>

Questions

When I initially proposed this change, version 1.0 of the CLI had not been introduced. I noticed that all openai api fine_tunes commands were removed (although they are still mentioned in the documentation). Are there plans to also phase out the existing support for data preparation in the legacy manner? If that's the case, maybe it would be better for me to adapt the existing command rather than creating a new one.

henriqueln7 avatar Nov 11 '23 19:11 henriqueln7

Thanks @henriqueln7 ! We'd be open to PR's for this. @jhallard can help with questions.

rattrayalex avatar Nov 15 '23 00:11 rattrayalex

Hi, I see this issue has been pending for a while. I have developed a solution and would like to contribute by submitting a PR. Would that be alright with everyone involved @rattrayalex?

aanaseer avatar Mar 08 '24 16:03 aanaseer

Please do! PRs are always welcome.

rattrayalex avatar Mar 09 '24 03:03 rattrayalex