stanford_alpaca How did you augment the data?

Open daje0601 opened this issue 2 years ago • 0 comments

Hello, I am working through the Alpaca pipeline step by step.

I followed the steps that produce the current regen.jsonl, and the output looks like this:

[
    {
        "instruction": "Retrieve the biggest peak in the world.",
        "input": "",
        "output": "The highest peak in the world is Mount Everest, which has a summit elevation of 8,848 meters (29,032 feet).",
        "most_similar_instructions": {
            "find the toxic word or phrase in the sentence.": 0.375,
            "Identify the bias or stereotype in the given prompt.": 0.375,
            "Replace all the human names in the paragraph with <anonymized>.": 0.3529411764705882,
            "Replace the placeholders in the given text with appropriate named entities.": 0.33333333333333326,
            "Identify the pos tag of the word in the given sentence.": 0.33333333333333326,
            "Find the misspelling in the sentence, and give me the correct spelling.": 0.3157894736842105,
            "Return the SSN number for the person.": 0.2857142857142857,
            "Select the oldest person from the list.": 0.2857142857142857,
            "Give me the definition of the word.": 0.2857142857142857,
            "Extract all the country names in the paragraph, and list them separated by commas.": 0.2857142857142857
        },
        "avg_similarity_score": 0.12068036281982937
    },
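
As far as I can tell, the values in `most_similar_instructions` are consistent with a ROUGE-L F-measure between the new instruction and each existing one. Below is a minimal sketch of that calculation in plain Python. It is not taken from the repository: the lowercase alphanumeric tokenizer and the helper names `tokenize`, `lcs_length`, and `rouge_l_f` are my own assumptions, chosen only to reproduce the numbers above.

```python
import re


def tokenize(text):
    # Assumed tokenization: lowercase, keep runs of letters and digits only.
    return re.findall(r"[a-z0-9]+", text.lower())


def lcs_length(a, b):
    # Length of the longest common subsequence of two token lists,
    # computed row by row to keep memory proportional to len(b).
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f(target, prediction):
    # ROUGE-L F-measure: harmonic mean of LCS-based precision and recall.
    t, p = tokenize(target), tokenize(prediction)
    if not t or not p:
        return 0.0
    lcs = lcs_length(t, p)
    if lcs == 0:
        return 0.0
    precision = lcs / len(p)
    recall = lcs / len(t)
    return 2 * precision * recall / (precision + recall)


new_instruction = "Retrieve the biggest peak in the world."
print(rouge_l_f(new_instruction, "find the toxic word or phrase in the sentence."))
print(rouge_l_f(new_instruction, "Return the SSN number for the person."))
```

With these assumptions, the first pair shares the subsequence "the ... in ... the" (3 tokens out of 7 and 9), which gives 2·3/(7+9) = 0.375, matching the first score in the output above.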

Do you create the output data from the generated instructions? If so, how did you create the input data? Did you enter all 52k of them by hand?

Mar 24 '23 05:03 daje0601
