LMOps
[tuna] Libraries are conflicting and/or very aged
So disappointed with what is released here. These are just non-working pieces. Funny that in train.py, for example, you have: from custom import CustomTrainer, but custom actually has only TunaTrainer. Also, where in the code gpt_eval is called is never described in the README. Environment and library installation is another joke!
I'm sure none of the authors will read these comments. Such a waste of the 3 days I spent here.
@batawfic Could you be more specific about which project you were trying to fix?
I just searched TunaTrainer and found this folder https://github.com/microsoft/LMOps/tree/main/tuna .
@XingxingZhang and @haorannlp can help with this issue.
Got it, I will look into this today.
For train.py, I've removed the from custom import CustomTrainer line, as it does not affect the training process. I forgot to clean this script up in the first commit; sorry for the confusion.

train.py is used for supervised fine-tuning (SFT) and is borrowed from https://github.com/AetherCortex/Llama-X; please refer to the Llama-X repo for a more comprehensive explanation/discussion. train_tuna.py is used for learning from the rankings. gpt_eval.py is used for querying GPT-4 models to generate contextual ranking data; this script is only for illustration purposes and is not called anywhere in this repo. We've provided the GPT-4 ranking data in the ./gpt_data folder.

For the Python environment installation, could you be more specific about what problems/errors you've encountered, so that I can guide you through the installation process? Alternatively, you can search the Llama-X repo for similar issues if we are not able to respond promptly.

Thanks.
@haorannlp Thanks for getting back to me; I honestly wasn't expecting that. Here is a summary of some of the issues:

raw_dataset = load_dataset("json", data_files=data_args.data_path, split="train") <-- the data is a list of JSON objects, not JSON keyed by "train". To fix that I had to modify the code, install datasets==2.10 and pyarrow==15, and instruct the code to read jsonl, not json.
I also had to upgrade to deepspeed==0.13.
After all this, when running and reading the data, it hangs forever. I'm unclear what the issue is.
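If the data file really is a single JSON array rather than JSON Lines, one way around the mismatch is to convert it before loading. This is a minimal standard-library sketch (the file paths are placeholders, not names from this repo):

```python
import json
from pathlib import Path

def json_array_to_jsonl(src: str, dst: str) -> int:
    """Convert a file holding one JSON array of records into JSON Lines.

    Returns the number of records written.
    """
    records = json.loads(Path(src).read_text(encoding="utf-8"))
    with open(dst, "w", encoding="utf-8") as f:
        for rec in records:
            # One compact JSON object per line, as jsonl loaders expect.
            f.write(json.dumps(rec, ensure_ascii=False) + "\n")
    return len(records)
```

After conversion, load_dataset("json", data_files="train.jsonl", split="train") should read the records directly; for local files, split="train" just names the default split rather than requiring a "train" key inside the JSON.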
Also, gpt_eval never runs; the line below in gpt_eval fails with OSError: source code not available:

if __name__ == "__main__": fire.Fire(GroupEval)
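That OSError appears to come from python-fire reflecting on its target: fire inspects the wrapped component (for help text and tracing) via the inspect module, and the lookup fails when the code has no backing .py file on disk, which can happen in notebook-style runners such as Databricks cells. A quick standard-library check for this condition (the fire internals described here are my reading of the error, not something documented in this repo):

```python
import inspect

def source_available(obj) -> bool:
    """Return True if Python can locate the source code backing obj.

    If this returns False for the class passed to fire.Fire, the
    'OSError: source code not available' failure mode is likely.
    """
    try:
        inspect.getsource(obj)
        return True
    except (OSError, TypeError):
        # OSError: source not retrievable (e.g. exec'd code without a file);
        # TypeError: built-ins and other objects with no Python source.
        return False
```

If the check fails for GroupEval in your environment, running gpt_eval.py as a plain script from disk (python gpt_eval.py) instead of through a notebook cell typically restores source lookup.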
Honestly, I gave up on getting the examples working. I was using Databricks with 1 A100 GPU on 1 node, and Python 3.10. Please note that the requirements for Llama-X:

# CUDA 11.6
conda install pytorch==1.12.0 torchvision==0.13.0 torchaudio==0.12.0 cudatoolkit=11.6 -c pytorch -c conda-forge

are very old, and I was unclear whether updating them would break anything.