
How did you augment the data?

Open daje0601 opened this issue 2 years ago • 0 comments

Hello, I am working through the Alpaca pipeline step by step.

I followed the current regen.jsonl and got the result below.

[
    {
        "instruction": "Retrieve the biggest peak in the world.",
        "input": "",
        "output": "The highest peak in the world is Mount Everest, which has a summit elevation of 8,848 meters (29,032 feet).",
        "most_similar_instructions": {
            "find the toxic word or phrase in the sentence.": 0.375,
            "Identify the bias or stereotype in the given prompt.": 0.375,
            "Replace all the human names in the paragraph with <anonymized>.": 0.3529411764705882,
            "Replace the placeholders in the given text with appropriate named entities.": 0.33333333333333326,
            "Identify the pos tag of the word in the given sentence.": 0.33333333333333326,
            "Find the misspelling in the sentence, and give me the correct spelling.": 0.3157894736842105,
            "Return the SSN number for the person.": 0.2857142857142857,
            "Select the oldest person from the list.": 0.2857142857142857,
            "Give me the definition of the word.": 0.2857142857142857,
            "Extract all the country names in the paragraph, and list them separated by commas.": 0.2857142857142857
        },
        "avg_similarity_score": 0.12068036281982937
    },
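For readers wondering what the numbers under `most_similar_instructions` mean: they look consistent with a ROUGE-L F-measure between the new instruction and each existing one. This is an assumption based on the values shown, not something confirmed in this thread. The sketch below is a minimal, self-contained reimplementation, not the repository's code; the helper names (`tokenize`, `lcs_length`, `rouge_l_f1`) and the simple lowercase alphanumeric tokenizer are hypothetical choices made for illustration.

```python
import re


def tokenize(text):
    # Assumed tokenizer: lowercase, keep runs of letters and digits only.
    return re.findall(r"[a-z0-9]+", text.lower())


def lcs_length(a, b):
    # Length of the longest common subsequence of two token lists,
    # computed row by row with dynamic programming.
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    # ROUGE-L F-measure: harmonic mean of LCS-based precision and recall.
    ref, cand = tokenize(reference), tokenize(candidate)
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


new_instruction = "Retrieve the biggest peak in the world."
existing = "find the toxic word or phrase in the sentence."
print(rouge_l_f1(new_instruction, existing))
```

Under these assumptions the first pair shares the subsequence "the", "in", "the" (3 tokens out of 7 and 9), which gives approximately 0.375, matching the first score listed above.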

Did you create the output data from the generated instructions? If so, how did you create the input data? Did you enter all 52k examples by hand?

daje0601, Mar 24 '23 05:03