
sharing experimental results like a chatbot

deep-diver opened this issue 2 years ago · 11 comments

7B

demo link: https://notebooksd.jarvislabs.ai/tz0hyvPyQMO0qXbeTpLYv2Wu5-j9AfFm_dM9sQN5fqFGI0lT90sAIHgT-Gi0jLcX/

[screenshot: 7B demo conversation]

deep-diver avatar Mar 17 '23 05:03 deep-diver

Compared with ChatGPT, the results are still much worse. How can we narrow the gap with ChatGPT?

cxj01 avatar Mar 17 '23 08:03 cxj01

A bigger model, and more data of much better quality than we have now.

deep-diver avatar Mar 17 '23 08:03 deep-diver

Compared with ChatGPT, the results are still much worse. How can we narrow the gap with ChatGPT?

From the given examples, I wouldn't say it's much worse. It's just more direct and doesn't waste tokens, which is a good thing. With more complicated queries, it shits the bed and hallucinates, but then again, it's a 7B model. It's already very impressive.

HideLord avatar Mar 17 '23 08:03 HideLord

For example, I want alpaca to be able to answer questions about a novel. How do I get alpaca to learn about the novel? Since the fine-tuning data I've seen so far is in prompt format, can I only use data in prompt format for fine-tuning?

cxj01 avatar Mar 17 '23 08:03 cxj01
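
Regarding the novel question above, one hedged approach is to turn passages or Q&A pairs about the book into the same instruction/input/output records that the fine-tuning script consumes. The field names below follow the Stanford Alpaca schema; the sample questions, context, and output file name are made up for illustration:

import json

# Hypothetical Q&A pairs about a novel; in practice these would be written
# (or generated) from the book's actual text.
novel_qa = [
    {
        "question": "Who is the narrator of the story?",
        "context": "Chapter 1 excerpt goes here...",
        "answer": "The story is narrated by the protagonist's younger sister.",
    },
]

# Convert to the instruction/input/output schema used by the Alpaca dataset.
records = [
    {
        "instruction": f"Answer the question about the novel: {qa['question']}",
        "input": qa["context"],
        "output": qa["answer"],
    }
    for qa in novel_qa
]

with open("novel_instructions.json", "w") as f:
    json.dump(records, f, ensure_ascii=False, indent=2)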

More examples, this time with 13B:

[screenshots 1-3: 13B example conversations]

deep-diver avatar Mar 17 '23 14:03 deep-diver

Nice! I'm fine-tuning 13B myself, but my loss plateaued quite early. [training-loss screenshot] Did you experience the same behavior?

HideLord avatar Mar 17 '23 15:03 HideLord

Yeah, kind of.

I will focus more on exploring how different combinations of hyper-parameters at generation time affect quality and speed!

deep-diver avatar Mar 17 '23 15:03 deep-diver
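
A minimal sketch of the kind of generation-time hyper-parameter exploration described above, assuming a recent transformers release with LLaMA support; the model path, prompt, and parameter grid are placeholders, not values from this thread:

import time
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer, GenerationConfig

base = "decapoda-research/llama-7b-hf"  # placeholder base-model path
tokenizer = LlamaTokenizer.from_pretrained(base)
model = LlamaForCausalLM.from_pretrained(
    base, torch_dtype=torch.float16, device_map="auto"
)

# Plain prompt for illustration; the repo wraps user input in its own template.
prompt = "Explain LoRA in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Try a few sampling configurations and compare output quality vs. latency.
for temperature, top_p in [(0.2, 0.75), (0.7, 0.9), (1.0, 0.95)]:
    config = GenerationConfig(
        do_sample=True, temperature=temperature, top_p=top_p, max_new_tokens=128
    )
    start = time.time()
    output = model.generate(**inputs, generation_config=config)
    elapsed = time.time() - start
    text = tokenizer.decode(output[0], skip_special_tokens=True)
    print(f"temperature={temperature}, top_p={top_p}, {elapsed:.1f}s\n{text}\n")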

Compared with ChatGPT, the results are still much worse. How can we narrow the gap with ChatGPT?

From the given examples, I wouldn't say it's much worse. It's just more direct and doesn't waste tokens, which is a good thing. With more complicated queries, it shits the bed and hallucinates, but then again, it's a 7B model. It's already very impressive.

@HideLord @deep-diver There were a TON of hallucinations in the original Stanford dataset. I cleaned up hundreds of issues. Try re-training on the new cleaned dataset. If you get a chance, please post a 13B fine-tuned model. Some of us have slow GPUs.

gururise avatar Mar 17 '23 22:03 gururise
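
A minimal sketch of pointing the data-loading step at a cleaned dataset before re-training, assuming the cleaned file keeps the same instruction/input/output JSON schema; the file name below is a placeholder:

from datasets import load_dataset

# Placeholder file name; substitute the cleaned Alpaca JSON you want to train on.
data = load_dataset("json", data_files="alpaca_data_cleaned.json")

# Quick sanity checks before kicking off a multi-hour LoRA run.
print(data["train"].column_names)   # expect ['instruction', 'input', 'output']
print(len(data["train"]))           # number of cleaned examples
print(data["train"][0])             # spot-check one record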

Thanks @gururise

I am retraining with 13B and 30B at the same time. I will share if I find something useful.

deep-diver avatar Mar 18 '23 08:03 deep-diver

Unfortunately, the training crashed before it finished, but here are the logs: [training-log screenshot] The data points were generated with a trimmed prompt template:

def generate_prompt(data_point):
    # Trimmed template: shorter than the original Alpaca prompt,
    # but keeps the instruction/input/output structure.
    instruction_field = data_point["instruction"]
    input_field = data_point["input"]
    output_field = data_point["output"]
    if input_field:
        # Records with a non-empty input get it appended after the instruction.
        return f"<<Instruction>>:\n{instruction_field}\n{input_field}\n<<Output>>:\n{output_field}"
    else:
        return f"<<Instruction>>:\n{instruction_field}\n<<Output>>:\n{output_field}"

Here is the final checkpoint if somebody is interested: https://huggingface.co/hidelord/llama-13b-lora/tree/main

HideLord avatar Mar 18 '23 10:03 HideLord
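
A hedged sketch of how the shared adapter above could be loaded for inference, assuming the peft library and access to LLaMA-13B base weights; the base-model path is a placeholder:

import torch
from peft import PeftModel
from transformers import LlamaForCausalLM, LlamaTokenizer

base = "decapoda-research/llama-13b-hf"  # placeholder; any LLaMA-13B HF checkpoint
tokenizer = LlamaTokenizer.from_pretrained(base)
model = LlamaForCausalLM.from_pretrained(
    base, torch_dtype=torch.float16, device_map="auto"
)

# Attach the LoRA weights posted above on top of the frozen base model.
model = PeftModel.from_pretrained(model, "hidelord/llama-13b-lora")
model.eval()

# Prompt in the same trimmed template the adapter was trained with.
prompt = "<<Instruction>>:\nWhat is LoRA?\n<<Output>>:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))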

"continue" works in 13B. That was something didn't work with 7B model in my case.

currently finetuning 13B model with Korean instruction datasets to see how well it works with different language, and 30B model with the original dataset.

[screenshot]

deep-diver avatar Mar 18 '23 17:03 deep-diver