dbal icon indicating copy to clipboard operation
dbal copied to clipboard

Fix implicit/generated name collisions

Open mvorisek opened this issue 3 years ago • 8 comments

Q A
Type bug
Fixed issues #5562

Summary

When a name is generated and trimmed to a specific length, the hash must be calculated once before trim.

Also, generated FK name must use table and column names of both sides, otherwise the generated name might be the same for different foreign keys.

mvorisek avatar Jul 11 '22 13:07 mvorisek

Please provide the steps to reproduce the problem first. Fixing an edge case in a patch release thus requiring a migration may do more harm than good.

morozov avatar Jul 12 '22 14:07 morozov

It is a bug, but changing target to 3.4.x as BC break due different generated names is introduced.

mvorisek avatar Jul 24 '22 22:07 mvorisek

PR is done incl. tests, is there any feedback left and should I squash it into one commit?

mvorisek avatar Jul 26 '22 21:07 mvorisek

Can this PR be reviewed please?

mvorisek avatar Jul 31 '22 19:07 mvorisek

Can this PR be reviewed please?

Please document the problem first, not what needs to be changed to address it.

morozov avatar Aug 03 '22 01:08 morozov

Please document the problem first, not what needs to be changed to address it.

Documented in https://github.com/doctrine/dbal/issues/5562

mvorisek avatar Aug 03 '22 02:08 mvorisek

How to reproduce

please see tests in https://github.com/doctrine/dbal/pull/5490

Thank you but that doesn't really help. I'm not going to review this.

morozov avatar Aug 03 '22 06:08 morozov

How to reproduce

please see tests in #5490

Thank you but that doesn't really help. I'm not going to review this.

the steps to reproduce are in https://github.com/doctrine/dbal/pull/5490/files#diff-a9e6eb632904e1a48e50c32b48430e910ca9cec1702c8c6d1c1221401ef65509R864

What would help you, how can I improve the issue description? Is there anything I need to clarify better?

mvorisek avatar Aug 03 '22 08:08 mvorisek

As I see from your test, there is an issue with the table that has 50+ columns and four indices that consist of 50+ columns each with slight variations. When do you need such a schema?

morozov avatar Aug 12 '22 19:08 morozov

As I see from your test, there is an issue with the table that has 50+ columns and four indices that consist of 50+ columns each with slight variations. When do you need such a schema?

I have choosen 50 to assert the test will work even with much higher max. identifier length. The current issue is present with 4+ columns since dechex(crc32($column)) is 8 characters long and 4 * 8 > 30 (max. identifier length for Oracle 11g).

mvorisek avatar Aug 12 '22 19:08 mvorisek

So the collision happens if there are two indexes with the five first matching columns. What's the point in having such indexes? They are virtually identical. Their efficiency is also questionable given that they have to be very long/deep (up to the fifth column).

morozov avatar Aug 12 '22 20:08 morozov

FK with local table, local column, foreign table, foreign column names are enought to reproduce (/w Oracle 11g as then the max. identifier length is 30).

The hashing process must hash the names at the end before cropping, this is what the test tests.

mvorisek avatar Aug 12 '22 20:08 mvorisek

The support for Oracle 12c and older was deprecated in https://github.com/doctrine/dbal/pull/5109 (3.3.0). I don't believe we should be introducing potentially breaking changes to support a deprecated dependency.

morozov avatar Aug 12 '22 21:08 morozov

The support for Oracle 12c and older was deprecated in #5109 (3.3.0). I don't believe we should be introducing potentially breaking changes to support a deprecated dependency.

For newer Oracle versions (and other DBs) the https://github.com/doctrine/dbal/pull/5490/files#diff-a7e8b22dc607a6c91123e8c8200250644ef2ce5244387d21916e74806495ac7eR20 change is not needed. Should I revert it from this PR into 3.4.x?

In 4.0.x, the max. identifier length of 63 is fine, as the older Oracle versions is no longer supported.

The hash fix should be landed, as any arbitrary limitation is undesired and should be fixed. Such fix should justify potentially breaking change of changed hash/generator function.

mvorisek avatar Aug 12 '22 21:08 mvorisek

What is the scenario in which the issue is reproducible on a database with the index name limit of 63 characters?

morozov avatar Aug 12 '22 21:08 morozov

With 8+ names as the hash/generator function input. So for index /w 8+ columns, for FK with 3+ columns (2x table name, 3x2 local/foreign column).

mvorisek avatar Aug 12 '22 22:08 mvorisek

As I said earlier, in my understanding, an index with such a large number of columns is practically useless so I wouldn't consider it a reasonably valid use case.

As for the foreign keys, as far as I can tell, their name is generated only from the table name and local columns: https://github.com/doctrine/dbal/blob/b5b9caf6e1db112ef48d9c46e25859af73433f7e/src/Schema/Table.php#L409-L413

Where does that ×2 factor come from?

morozov avatar Aug 12 '22 22:08 morozov

see 2nd bullet of the https://github.com/doctrine/dbal/issues/5562 issue, this PR fixes all hash/generator issues

mvorisek avatar Aug 12 '22 22:08 mvorisek

FK constraint is a 4-tuple consisting of a local/foreign table/column names

Where does this come from?

currently, only the local names are part of the generated name

That's my understanding as well.

If you want these changes in the code considered, please provide a specific scenario in which the existing code doesn't work, not the reasoning about how it should or shouldn't work.

The existing scenario with 50 columns isn't valid.

morozov avatar Aug 12 '22 22:08 morozov

Closing due to the lack of feedback.

morozov avatar Sep 18 '22 19:09 morozov