dbal icon indicating copy to clipboard operation
dbal copied to clipboard

The support for quoting identifiers is fundamentally broken

Open morozov opened this issue 4 years ago • 6 comments

The DBAL takes the responsibility for automatic quoting of identifiers but it doesn't work in many cases:

  1. The list of keywords is never complete and up to date.
  2. Apart from the keywords, the logic of detection whether the needed identifier needs to be quoted is platform-specific but there's no platform API for that.
  3. It's impossible to quote a keyword or another identifier that needs quoting manually.
  4. The DBAL-level Table class is responsible for object name "normalization" which will lead to bugs in edge cases (e.g. cannot represent columns Foo and foo on the same table). See #4689.

The details are to follow.

morozov avatar Oct 18 '20 19:10 morozov

Object names are normalized assuming that all target platforms are case-insensitive about object names:

https://github.com/doctrine/dbal/blob/0e398ec80cc877acb7fdde653503638b78f30b8c/src/Schema/Table.php#L984-L991

This is not true for Oracle, IBM DB2 and SQL Server (depending on the configuration).

$connection = DriverManager::getConnection([
    'driver'   => 'oci8',
    'host'     => 'localhost',
    'user'     => 'system',
    'password' => 'oracle',
    'dbname'   => 'XE',
]);

$connection->executeStatement('CREATE TABLE test ("id" NUMBER, ID NUMBER)');
$sm = $connection->createSchemaManager();
$sm->listTableDetails('test');
// Doctrine\DBAL\Schema\SchemaException: The column 'id' on table 'test' already exists.

morozov avatar Sep 02 '21 03:09 morozov

The schema implementation assumes that the object names need to be quoted only if they are keywords or in some other platform-specific cases. For instance:

$connection = DriverManager::getConnection([
    'driver'   => 'oci8',
    'host'     => 'localhost',
    'user'     => 'system',
    'password' => 'oracle',
    'dbname'   => 'XE',
]);

$connection->executeStatement('CREATE TABLE test ("X Y" NUMBER)');
$sm = $connection->createSchemaManager();
$table = $sm->listTableDetails('test');
$online = clone $table;
$id = $table->getColumn('x y');
$id->setType(Type::getType('string'));
$diff = $sm->createComparator()->diffTable($online, $table);
$sm->alterTable($diff);
// ALTER TABLE test MODIFY (X Y VARCHAR2(255) DEFAULT NULL)
// ORA-00907: missing right parenthesis

An identifier obtained by introspecting the schema should be always quoted since it was introspected in its original case.

morozov avatar Sep 02 '21 04:09 morozov

But it's not only about proprietary platforms. A similar issue is reproducible on any platform, e.g. MySQL:

$connection = DriverManager::getConnection([
    'driver' => 'mysqli',
    'host'   => '127.0.0.1',
    'user'   => 'root',
    'dbname' => 'doctrine',
]);

$connection->executeStatement('CREATE TABLE `x.y` (id INT)');
$sm = $connection->createSchemaManager();
$table = $sm->listTableDetails('x.y');
$online = clone $table;
$id = $table->getColumn('id');
$id->setType(Type::getType('string'));
$diff = $sm->createComparator()->diffTable($online, $table);
$sm->alterTable($diff);
// ALTER TABLE x.y CHANGE id id VARCHAR(255) DEFAULT NULL
// Unknown database 'x'

The same logic as above applies: an identifier obtained by introspecting the schema should be always quoted.

morozov avatar Sep 02 '21 04:09 morozov

As an API consumer, I know that I need to quote identifiers in some cases and I do so:

// MySQL connection
$table = new Table('`x.y`');
var_dump($table->isQuoted());
// bool(true)
$table->addColumn('id', 'string');
$sm = $connection->createSchemaManager();
$sm->createTable($table);
// CREATE TABLE `x`.`y` (id VARCHAR(255) NOT NULL)
// Unknown database 'x'

The schema implementation parses the identifier even if it's quoted.

morozov avatar Sep 02 '21 04:09 morozov

Another example of manually quoting an identifier. Most suitable for PostgreSQL which supports schemas but fails on any platform:

// PostgreSQL connection
$table = new Table('x."y.z"');
$table->addColumn('id', 'string');
var_dump($table->isQuoted());
// false
$sm = $connection->createSchemaManager();
$sm->createTable($table);
// CREATE TABLE x."y (id VARCHAR(255) NOT NULL)
// SQLSTATE[42601]: Syntax error: 7 ERROR:  unterminated quoted identifier

The schema implementation fails to determine that the identifier is quoted and parses it incorrectly.

morozov avatar Sep 02 '21 04:09 morozov

Even if the object name is not qualified but is quoted, the schema implementation may fail to parse its name:

// trying to pass just the name
$table = new Table('`');
var_dump($table->getName());
// string(0) ""

// trying to pass just the name properly escaped
$table = new Table('````');
var_dump($table->getName());
// string(0) ""

Instead of parsing the name which requires an SQL parser, schema assets should accept the name as a string and a flag defining whether it's quoted.

morozov avatar Sep 02 '21 04:09 morozov