
Is the pre-tokenized dataset openchat_v3.2_super.train.parquet tokenized for Llama2 or Mistral?

Open • alphanlp opened this issue 1 year ago • 1 comment

Is the pre-tokenized dataset openchat_v3.2_super.train.parquet tokenized for Llama2 or Mistral?

alphanlp avatar Jan 27 '24 14:01 alphanlp

It is Llama2, not Mistral. However, the data is a merge of sharegpt_clean.json and sharegpt_gpt4.json. Who knows?

alphanlp avatar Jan 27 '24 15:01 alphanlp