AgentBench

dbbench-std: Task Output Seems Correct But MD5 Mismatches

Open wchen-github opened this issue 1 year ago • 1 comments

I looked into one particular DbBench task. GPT-4 seems to have given the right answer, but the MD5 doesn't match.

Steps to reproduce the behavior:

  1. Run a task with line #106 of dbbench/standard.jsonl: {"description": "The film titled 'New Movie' will be added to the Filmography table with the lead actor role and a note of '-' for the year 2019.", "label": ["INSERT INTO Filmography (Year, Title, Role, Notes) VALUES ('2019', 'New Movie', 'Lead Actor', '-')"], "create": {"database": "fetaqa", "init": "fetaqa_init.sql"}, "table": {"table_name": "Filmography", "table_info": {"columns": [{"name": "Year", "type": "INT"}, {"name": "Title", "type": "TEXT"}, {"name": "Role", "type": "TEXT"}, {"name": "Notes", "type": "TEXT"}], "rows": [["1985", "Back to the Future", "Jennifer Parker", "-"], ["2008", "Still Waters Burn", "Laura Harper", "-"], ["2011", "Alien Armageddon", "Eileen Daly", "-"], ["2013", "You Are Not Alone", "Cristina's Mom", "Short film"], ["2013", "Max", "Mom", "Short film"], ["2014", "Starship: Rising", "Captain Savage", "-"], ["2015", "EP/Executive Protection", "Pam Travis", "-"], ["2015", "Back in Time", "Herself", "Back to the Future documentary"], ["2015", "Back to the 2015 Future", "Jennifer Parker", "Short film"], ["2017", "Vitals", "Margaret Parks", "-"], ["2018", "Groove Street", "Julie", "-"], ["1999", "The Matrix", "Trinity", "-"], ["2005", "Batman Begins", "Rachel Dawes", "-"], ["2010", "Inception", "Mal", "-"], ["2012", "The Avengers", "Black Widow/Natasha Romanoff", "-"], ["2014", "Interstellar", "Brand", "-"], ["2016", "La La Land", "Mia Dolan", "-"], ["2017", "Wonder Woman", "Wonder Woman/Diana Prince", "-"], ["2019", "Avengers: Endgame", "Black Widow/Natasha Romanoff", "-"], ["2021", "The Suicide Squad", "Harley Quinn", "-"], ["2022", "Black Panther: Wakanda Forever", "Okoye", "-"]]}}, "evaluation": "", "example": "", "type": ["INSERT"], "heads": ["Year", "Title", "Role", "Notes"], "add_description": "The name of this table is Filmography, and the headers of this table are Year,Title,Role,Notes.", "source": "fetaqa", "answer_md5": "[('ae2213ddbcb907c43fd757035b363328',)]"}

  2. Get the output SQL command and MD5 from the output/runs.jsonl file:

image

  3. Print out the modified table in dbbench.interaction.execute:

image

  4. Get the MD5 from the dataset and compare it with the one in the output:

image
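The comparison in the last step can be sketched in a few lines. The `answer_md5` field in the task record looks like the string form of a list holding one single-element tuple, so it has to be parsed before the digest can be compared. This is only a sketch: `expected_digest` and `digests_match` are hypothetical helper names, and since the layout of output/runs.jsonl is not shown here, the observed digest is passed in as a plain string.

```python
import ast

# Trimmed copy of the task record above; only the field used here is kept.
record = {"answer_md5": "[('ae2213ddbcb907c43fd757035b363328',)]"}


def expected_digest(record):
    """Extract the bare hex digest from the answer_md5 field (hypothetical helper)."""
    # The field is assumed to be a Python-literal string: a list with one 1-tuple.
    rows = ast.literal_eval(record["answer_md5"])
    return rows[0][0]


def digests_match(record, observed):
    """Compare the dataset digest with a digest copied from the run output."""
    # Normalise whitespace and letter case so only the hex value is compared.
    return expected_digest(record) == observed.strip().lower()


print(expected_digest(record))
```

Comparing the bare hex values rules out a mismatch caused only by the surrounding brackets and quotes in the dataset field.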

  • OS: Ubuntu 22.04
  • Python: 3.9

This is only one example I collected. There are many errors of a similar kind. Can you help me identify the issues I am facing, please?
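One way to narrow this down is to hash the table independently after applying the label SQL and after applying the model's SQL, then see whether the two results differ. The sketch below does this with an in-memory SQLite database and a shortened copy of the Filmography rows. Everything in it is an assumption for illustration: `table_md5` and `build_table` are hypothetical helpers, the hashing scheme is not the one AgentBench uses, the benchmark's actual database engine may behave differently from SQLite, and `model_sql` is a made-up statement, not the output from my run.

```python
import hashlib
import sqlite3


def table_md5(conn, table):
    """Hash all rows of a table (illustrative scheme, NOT AgentBench's)."""
    rows = conn.execute(f'SELECT * FROM "{table}"').fetchall()
    # Sort the stringified rows so physical row order does not affect the hash.
    lines = sorted(",".join(str(cell) for cell in row) for row in rows)
    return hashlib.md5("\n".join(lines).encode("utf-8")).hexdigest()


def build_table(insert_sql):
    """Create a shortened Filmography table and apply one INSERT statement."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Filmography (Year INT, Title TEXT, Role TEXT, Notes TEXT)"
    )
    conn.executemany(
        "INSERT INTO Filmography VALUES (?, ?, ?, ?)",
        [
            ("1985", "Back to the Future", "Jennifer Parker", "-"),
            ("2018", "Groove Street", "Julie", "-"),
        ],
    )
    conn.execute(insert_sql)
    return conn


# The label statement from the task record.
label_sql = (
    "INSERT INTO Filmography (Year, Title, Role, Notes) "
    "VALUES ('2019', 'New Movie', 'Lead Actor', '-')"
)
# Hypothetical model output: looks equivalent, but the role differs in case.
model_sql = (
    "INSERT INTO Filmography (Year, Title, Role, Notes) "
    "VALUES (2019, 'New Movie', 'lead actor', '-')"
)

expected = table_md5(build_table(label_sql), "Filmography")
actual = table_md5(build_table(model_sql), "Filmography")
print(expected == actual)  # False: a single cell differs in letter case
```

Under a whole-table hash like this, any difference in a single cell (letter case, stray whitespace, an extra row) changes the digest, so an answer that looks right can still fail the comparison.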

wchen-github Jan 24 '24 17:01