Evaluation returns 1 when it should return 0

Open Hazoom opened this issue 4 years ago • 1 comment

Hi,

I was running the evaluation script on my predicted SQL queries on the Spider dataset, and I've noticed that for some examples, the evaluation script returns an Exact-Match score of 1 instead of 0.

For example:

Pred: select students.first_name from students where students.permanent_address_id != students.permanent_address_id
Gold: select first_name from students where current_address_id != permanent_address_id

In this example, the where clause of the gold query uses the current_address_id column in the left expression, while the predicted query uses permanent_address_id. This should give an EM score of 0 for the where clause, and therefore an overall EM score of 0, but your script returns 1.

Another example:

Pred: select count(*) from flights where flights.destairport = 'terminal'
Gold: select count(*) from flights where sourceairport = "apg"

Here, the problem is the same, but with the columns destairport and sourceairport.

I looked into the code, and my guess is that it relates to the foreign key mapping that is performed right at the beginning of the evaluation of each sample. See lines 621-627 here: https://github.com/taoyds/test-suite-sql-eval/blob/master/evaluation.py#L621
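A minimal sketch of how such a mapping could produce the first result. It assumes that columns linked by foreign keys are grouped together and each group is replaced by a single canonical column before comparison. The function names and the schema below are hypothetical and written for illustration; this is not the code from evaluation.py.

```python
# Hypothetical, simplified sketch; not the code from evaluation.py.
def build_fk_map(foreign_keys):
    """Map every column in a foreign-key group to one canonical column."""
    groups = []  # each group is a set of columns linked by foreign keys
    for a, b in foreign_keys:
        for group in groups:
            if a in group or b in group:
                group.update((a, b))
                break
        else:
            # Neither column belongs to an existing group: start a new one.
            groups.append({a, b})
    fk_map = {}
    for group in groups:
        canonical = min(group)  # pick a deterministic representative
        for col in group:
            fk_map[col] = canonical
    return fk_map


def normalize(cond, fk_map):
    """Replace both sides of a (left, op, right) condition by canonical columns."""
    left, op, right = cond
    return (fk_map.get(left, left), op, fk_map.get(right, right))


# Assumed schema: both address columns in students reference addresses.address_id.
foreign_keys = [
    ("students.current_address_id", "addresses.address_id"),
    ("students.permanent_address_id", "addresses.address_id"),
]
fk_map = build_fk_map(foreign_keys)

gold_where = ("students.current_address_id", "!=", "students.permanent_address_id")
pred_where = ("students.permanent_address_id", "!=", "students.permanent_address_id")

print(gold_where == pred_where)                                        # False
print(normalize(gold_where, fk_map) == normalize(pred_where, fk_map))  # True
```

Under these assumptions, the two where clauses differ before the mapping and become identical after it, which would explain an EM score of 1.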

Would love to hear your thoughts on that. @taoyds

Thanks, Moshe

May 24 '21 15:05 Hazoom

If you change DISABLE_VALUE = True to DISABLE_VALUE = False in evaluation.py, you should see that the Exact-Match score is 0. I think this is because enabling value evaluation makes the script compare the actual values instead of only the syntax.
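A rough sketch of the effect on the second example, assuming the flag works by replacing literal values with a placeholder before conditions are compared. The helper name and the column name are hypothetical; this is not the code from evaluation.py.

```python
# Hypothetical sketch; names are illustrative, not taken from evaluation.py.
def normalize_condition(cond, disable_value):
    """Return a (column, op, value) condition, masking the value if requested."""
    column, op, value = cond
    return (column, op, None if disable_value else value)


# Assume both columns were already mapped to the same canonical column.
gold = ("airports.airportcode", "=", "apg")
pred = ("airports.airportcode", "=", "terminal")

# Values masked: the conditions look identical.
print(normalize_condition(gold, True) == normalize_condition(pred, True))    # True
# Values kept: the differing literals make the conditions unequal.
print(normalize_condition(gold, False) == normalize_condition(pred, False))  # False
```

Note that in this sketch the mismatch is caught only because the literal values differ; the two columns are still treated as the same column.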

Jun 18 '21 08:06 ReinierKoops