catena
catena copied to clipboard
Support column specifiers and resolve ambiguity
When more than one table is involved in a statement, column names can be ambiguous:
SELECT * FROM answers LEFT JOIN questions ON qid = id;
The above fails when both answers as well as questions have an id column. The columns need to be specified, like so:
SELECT * FROM answers LEFT JOIN questions ON qid = questions.id;
Better even would be to forbid any non-specified columns; when an ID column is added to either table later on, the query (which may be a template) could start doing something unintended.
The validator should work as follows:
- For each statement that defines a scope (currently only SELECT), determine which tables are referenced and their aliases
- For each column reference, determine whether (1) it is unambiguous (only one table in scope) or (2) check whether it has a specifier.
- For each fully specified column, check whether the referenced column is actually available.
For SQL canonical form we should opt for writing out all table names when there is more than one table in scope.
- [ ] Change validator to check for ambiguity
- [ ] Add support for table specifiers in the parser
- [ ] Add specifiers in the canonical serialization when there is potential for ambiguity