`primary_keys` in the Spider preprocessing output is wrong

Open rshin opened this issue 5 years ago • 0 comments

Example:

```
In [6]: train_enc = json.loads(next(open('data/spider-20190205/nl2code-0401,output_from=false,emb=glove-42B,min_freq=50/enc/train.jsonl')))
In [7]: train_enc
Out[7]:
{'column_to_table': {'0': None,
  '1': 0,
  '10': 1,
  '11': 2,
  '12': 2,
  '13': 2,
  '2': 0,
  '3': 0,
  '4': 0,
  '5': 0,
  '6': 0,
  '7': 1,
  '8': 1,
  '9': 1},
 'columns': [['<type: text>', '*'],
  ['<type: number>', 'department', 'id'],
  ['<type: text>', 'name'],
  ['<type: text>', 'creation'],
  ['<type: number>', 'ranking'],
  ['<type: number>', 'budget', 'in', 'billions'],
  ['<type: number>', 'num', 'employees'],
  ['<type: number>', 'head', 'id'],
  ['<type: text>', 'name'],
  ['<type: text>', 'born', 'state'],
  ['<type: number>', 'age'],
  ['<type: number>', 'department', 'id'],
  ['<type: number>', 'head', 'id'],
  ['<type: text>', 'temporary', 'acting']],
 'db_id': 'department_management',
 'foreign_keys': {'11': 1, '12': 7},
 'foreign_keys_tables': {'2': [0, 1]},
 'primary_keys': [11, 11, 11],
 'question': ['how',
  'many',
  'heads',
  'of',
  'the',
  'departments',
  'are',
  'older',
  'than',
  '56',
  '?'],
 'table_bounds': [1, 7, 11, 14],
 'table_to_columns': {'0': [1, 2, 3, 4, 5, 6],
  '1': [7, 8, 9, 10],
  '2': [11, 12, 13]},
 'tables': [['department'], ['head'], ['management']]}
```

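`'primary_keys': [11, 11, 11]` is obviously incorrect: the same column (11, which belongs to table 2, `management`) is listed three times, and tables 0 and 1 get no primary key at all. Given that the foreign keys point at columns 1 and 7, I would expect something closer to one distinct key per table.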
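A quick sanity check along these lines might catch this kind of output. This is only a sketch: `find_primary_key_problems` is a hypothetical helper, not part of seq2struct, and it assumes the `primary_keys`, `column_to_table` and `tables` fields look as in the dump above. Note that a table without a primary key is not necessarily an error in general, so the second check is a hint rather than proof of a bug.

```python
from collections import Counter


def find_primary_key_problems(enc):
    """Return a list of human-readable problems with enc['primary_keys'].

    Hypothetical helper for illustration; field names follow the dump above.
    """
    problems = []
    pks = enc['primary_keys']

    # A column index should not be listed as a primary key more than once.
    for col, count in Counter(pks).items():
        if count > 1:
            problems.append(f'column {col} listed {count} times')

    # Report tables that have no primary key among the listed columns.
    # column_to_table uses string keys because it was loaded from JSON.
    tables_with_pk = {enc['column_to_table'][str(col)] for col in pks}
    for table_id in range(len(enc['tables'])):
        if table_id not in tables_with_pk:
            problems.append(f'table {table_id} has no primary key')

    return problems


# Only the fields needed for the check, copied from the example above.
example = {
    'column_to_table': {
        '0': None, '1': 0, '2': 0, '3': 0, '4': 0, '5': 0, '6': 0,
        '7': 1, '8': 1, '9': 1, '10': 1, '11': 2, '12': 2, '13': 2,
    },
    'primary_keys': [11, 11, 11],
    'tables': [['department'], ['head'], ['management']],
}

for problem in find_primary_key_problems(example):
    print(problem)
```

On the example this reports the repeated column 11 and the two tables left without a key.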

rshin, Sep 24 '19 23:09