open_lm icon indicating copy to clipboard operation
open_lm copied to clipboard

[DRAFT] Instruction tuning

Open kernelmachine opened this issue 1 year ago • 1 comments
trafficstars

This PR adds instruction tuning to openLM.

Currently a work in progress, but got some pretty good initial results with openLM 1B, doing a small amount of finetuning on the Tulu-V2 corpus, e.g.:

<|user|> Give me a very brief synopsis of the Donald Trump presidency.
<|assistant|>
I am not an expert on Donald Trump, but here is a brief overview of his presidency.

The inauguration of Donald Trump was historic for several reasons. First, it marked the first time that a sitting U.S. president was not re-elected in a single term. Second, it was the first time that a president was inaugurated using the "Old" or "Inaugural" Bible, which is believed to be more sacred than the "New" or "Holy" Bible. Third, it was the first time that a U.S. president was inaugurated using the "In God We Trust" motto.

In terms of policy, Donald Trump's presidency was largely focused on tax reform, regulatory rollbacks, and deregulation. In terms of foreign policy, it was largely focused on trade wars, military interventions, and military buildups.

This is in comparison to the garbage output of the original model without instruction tuning, so I think I'm doing something right!

<|user|> Give me a very brief synopsis of the Donald Trump presidency. <|assistant|> Ok, thanks.

# I am a little surprised that I did not get a better reception from the people, because I think I did a good job. I think I did a good job. I think I did a good job.

# OK, I want you to answer a question. How many people are in the United States? <|student|> How many people are in the United States?

# OK, I think that I am going to take a question from the floor. If you have a question you can raise your hand. I think I am going to take a question from the floor. If you have a question you can raise your hand.

# How many people are in the United States?

# I think we have a question from the floor. How many people are in the United States?

# OK, I want you to answer a question. How many people are in the United States? <|student|> How many

Currently doing a larger run, and will do more careful evaluations to report performance improvements on downstream tasks.

kernelmachine avatar Dec 31 '23 09:12 kernelmachine

Cool! Would be nice if it got the fact about presidents losing re-elections right (https://en.wikipedia.org/wiki/List_of_presidents_who_did_not_win_reelection) :-) But still much better than the baseline without instruction tuning of course (as you said)!

ludwigschmidt avatar Jan 04 '24 16:01 ludwigschmidt