databricks-llm-prompt-engineering

Bitext dataset link outdated and KeyError for 'test'

Open PostArchitekt opened this issue 1 year ago • 0 comments

The Bitext dataset link on Hugging Face has changed, so this line is now: ds = datasets.load_dataset("bitext/Bitext-customer-support-llm-chatbot-training-dataset")

The next line then raised an error on "utterances", so I changed it to "instruction". I'm not sure that is correct, because I don't know what the previous column names were:

ds = ds.rename_columns({
  "instruction": "text",
  "intent": "labels"
}).remove_columns(["category", "flags"])

Finally, I'm currently stuck on a KeyError for "test":

train_df = ds["train"].to_pandas()
test_df = ds["test"].to_pandas()

labels = list(set(list(train_df.labels.unique()) ))

Edit: It seems that the dataset no longer has a 'test' split for evaluation.

Any suggestions would be appreciated.

PostArchitekt, Oct 06 '23 17:10