Return numeric-string instead of just string to represent row count
| Q | A |
|---|---|
| Type | improvement |
| Fixed issues | Follows up https://github.com/doctrine/dbal/pull/5269 |
Summary
The returned string representing row count is always a number so it makes sense to constrain it a little bit more.
Looks good in general. I believe, the same idea applies to the QueryBuilder and the Driver\Connection interface and its implementations as well:
- https://github.com/doctrine/dbal/blob/7ac309579ac0a14e525144229c8feebf79e8d1e7/src/Query/QueryBuilder.php#L332
- https://github.com/doctrine/dbal/blob/7ac309579ac0a14e525144229c8feebf79e8d1e7/src/Query/QueryBuilder.php#L318
- https://github.com/doctrine/dbal/blob/e4df1a5834fb78ea2e58c4803b5b8934911561e6/src/Driver/Connection.php#L47
Thanks for input, will adapt the PR
On Thu, Mar 10, 2022, 17:24 Sergei Morozov @.***> wrote:
Looks good in general. I believe, the same idea applies to the QueryBuilder and the Driver\Connection interface and its implementations as well:
https://github.com/doctrine/dbal/blob/7ac309579ac0a14e525144229c8feebf79e8d1e7/src/Query/QueryBuilder.php#L332 2. https://github.com/doctrine/dbal/blob/7ac309579ac0a14e525144229c8feebf79e8d1e7/src/Query/QueryBuilder.php#L318 3. https://github.com/doctrine/dbal/blob/e4df1a5834fb78ea2e58c4803b5b8934911561e6/src/Driver/Connection.php#L47
— Reply to this email directly, view it on GitHub https://github.com/doctrine/dbal/pull/5317#issuecomment-1064249810, or unsubscribe https://github.com/notifications/unsubscribe-auth/AACQAJJIVKZVLC4RJXAW4EDU7IOZZANCNFSM5QL2EVVQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.
You are receiving this because you authored the thread.Message ID: @.***>
@morozov I believe changing
- public function exec(string $sql): int;
+ public function exec(string $sql): int|string;
results in hard BC break since I'll have to remove the type hint (union types not supported here yet)
- public function exec(string $sql): int;
+ public function exec(string $sql)
Should I proceed?
No, but it should be possible to add a @return or @psalm-return annotation that would override the native return type.
Ok so I keep :int but add @return int|string?
On Fri, Mar 11, 2022, 00:43 Sergei Morozov @.***> wrote:
No, but it should be possible to add a @return or @psalm-return annotation that would override the native return type.
— Reply to this email directly, view it on GitHub https://github.com/doctrine/dbal/pull/5317#issuecomment-1064627434, or unsubscribe https://github.com/notifications/unsubscribe-auth/AACQAJN46WWYAHTSZ47TU23U7KCJVANCNFSM5QL2EVVQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.
You are receiving this because you authored the thread.Message ID: @.***>
Ok so I keep :int but add @return int|string?
Yes, keep :int and add whatever (@return or @psalm-return) that will make static analyzers understand that it's in fact int|numeric-string.
Off-topic: note that it's not just int but in fact 0|positive-int (see https://github.com/phpstan/phpstan-src/pull/692 for example). While it's not currently necessary, it's not clear how to express a positive-negative-integer-string.
@morozov what about PHPDoc tag @return with type int|string is not subtype of native type int. phpstan errors?
[…] what about
PHPDoc tag @return with type int|string is not subtype of native type int.phpstan errors?
Would it help to remove the conflicting @return annotations and only leave the @psalm-return ones which don't seem to cause any issues?
Would it help to remove the conflicting
@returnannotations and only leave the@psalm-returnones which don't seem to cause any issues?
IMO this would not be good. If the PHP return type is : int then PHP will do type coercion so in the example of executeUpdate if the inner executeStatement returns a numeric-string because it overflowed PHP_INT_MAX, then it doesn't matter what your phpdoc tags say, it will try to return an int and it will fail (see https://3v4l.org/YaedP)
The only correct way here that I see is to remove the int return type from exec&executeUpdate IMO, and then use @return int|string everywhere for dumb parsers plus ideally @psalm-return int<0, max>|numeric-string for static analyzers.
Removing the type hint should not be a huge deal, classes extending it can still specify it without breaking LSP (at least in the 3.3 branch as this was allowed in PHP 7.2) so it's not a BC break I would say.
Note that the DBAL 3.x codebase doesn't use declare(strict_types=1) in most of its codebase. So right now the int return type also serves as an implicit cast of the strings potentially returned by the underlying drivers. I don't know if any driver returns the number of affected rows lower than PHP_INT_MAX as a string, but if they do, it might be a bigger BC break. I.e. the consuming code may expect an int but will now get a string regardless of the actual value.
Another thought is that it was never reported that a number of affected rows larger than PHP_INT_MAX couldn't be properly reported. Even on a 32-bit platform, it would have to be over 2 billion rows, which is quite a lot.
It is much more likely that changing the signature will affect the API consumers who don't experience this issue than it will fix the issue for those who experience it.
I am fine with introducing this minor BC break in a minor release but we have to be conscious about the real side effects it can lead to.
Note that the DBAL 3.x codebase doesn't use
declare(strict_types=1)in most of its codebase. So right now theintreturn type also serves as an implicit cast of the strings potentially returned by the underlying drivers.
If you look at my example (https://3v4l.org/YaedP) again, it also does not use strict types, and while it also surprised me, it seems like PHP fails the string->int cast if the numeric string is beyond PHP_MAX_INT. So technically you already have the issue with the current code.
But as you say, in practice even on 32bit doing an operation on 2billion rows is pretty damn rare, and 32bit systems themselves are getting rarer, so IMO it's fine to use a plain int return type here, and coerce whatever numeric string we get back to an int. I find this completely reasonable, but I do wonder why not apply the same logic to the other statements then?
If adding return types would be considered a BC break, we can still do return (int) $whatever; to explicitly cast.
I think this would be way less surprising for everyone, and lets you use a much leaner @return int<0, max> return type.
The bigint -> string type thing is a bit of a pain already, but I can understand the rationale there (tho I wish we could forget about 32bit), and having tables with ids going above 2billion is definitely not that far fetched. But again for affected rows a 32bit integer really sounds like it is good enough.
But again for affected rows a 32bit integer really sounds like it is good enough.
Agreed, plus the only 32bit system I had to recently work with is a raspberry pi 😁
Note that the DBAL 3.x codebase doesn't use
declare(strict_types=1)in most of its codebase.
Actually, all the classes implementing Driver\Result::rowCount(): int have declare(strict_types=1) in their files, which means that all drivers return an int for the values not greater than PHP_INT_MAX.
The only correct way here that I see is to remove the int return type from exec&executeUpdate IMO, and then use @return int|string everywhere for dumb parsers plus ideally @psalm-return int<0, max>|numeric-string for static analyzers.
Agree.
Removing the type hint should not be a huge deal, classes extending it can still specify it without breaking LSP (at least in the 3.3 branch as this was allowed in PHP 7.2) so it's not a BC break I would say.
Agree. It won't be a BC break.
Actually, all the classes implementing Driver\Result::rowCount(): int have declare(strict_types=1) in their files, which means that all drivers return an int for the values not greater than PHP_INT_MAX.
I am a bit confused, if that is the case already then why not keep the int return types on Connection and simply drop the string types?
If Driver\Result::rowCount(): int have declare(strict_types=1) already, it means that for int<0,PHP_INT_MAX> they will return an int, and if the php drivers return a string for values above then rowCount will throw a TypeError.
So having an int|string return type on Connection makes no sense as it will never return string.
As per https://github.com/doctrine/dbal/blob/83f779beaea1893c0bece093ab2104c6d15a7f26/src/Connection.php#L1157-L1160 the other way executeStatement can return is via DriverConnection::exec but that also has an int return type, so it is currently impossible for executeStatement to return a string.
I opened https://github.com/doctrine/dbal/pull/5334 to show more clearly what I suggested doing above in https://github.com/doctrine/dbal/pull/5317#issuecomment-1077349918
IMO this is the best outcome as it completely removes the string type, keeping things simpler for the API consumers, and it is anyway reflecting the current reality of the code better.
I am a bit confused, if that is the case already then why not keep the int return types on Connection and simply drop the string types?
Because the underlying driver can return a string.
Yes there is one driver (MySQLi) which can return a string as far as I have seen after looking at all drivers docs this morning. And it will do so only in the case that you have a query affecting over 2billion rows (super rare) AND only on 32bit platforms (super rare). I don't find it justified to add a string return type just for that one edge case, adding burden to every API consumer, but that's obviously just my perspective.
Also note that in the current state, this PR does not fix it as https://github.com/doctrine/dbal/blob/3.3.x/src/Driver/Mysqli/Result.php#L164-L171 and other Result classes still have the int return type, which will turn any overflowing int into a TypeError. So you will right now never get a string back, and the type def is too broad. You also never got anyone reporting such a TypeError I suppose, because as I mentioned above this is a very low probability edge case.
I would appeal to pragmatism here as forcing everyone to deal with int|string when reality is 99.9999999% of users will ever see an int, seems absurd.
P.S: Also another minor aspect I noticed this morning is that strictly speaking some drivers (SQLSrv and Mysqli as per https://github.com/doctrine/dbal/pull/5334/files) do return -1 in some error conditions, which does not fit the int<0, max> type. I don't think this can happen because if it errors most likely Doctrine won't create a Result, but for safety I added a max(0, $...) in that PR, ensuring the returned values match the documented types.
I realize that this is such an insignificant and at the same time such a complex problem that I'm not willing to spend time on solving it.
It is insignificant because the edge case is indeed rare. It is complex because different underlying drivers are handling it differently. I.e. SQL Server seems to be unable to return more than PHP_INT_MAX rows (not sure about handling the updates). Oracle seems to be returning uint as int which might lead to a negative number at the end.
I don't believe we have the integration tests that validate the semantics of these values in all cases (e.g. select, insert, update where all matching rows indeed were updated or some of them already had the value set by the update).
With that said, whoever has the commit permissions for this repo has my full blessing in doing whatever they deem reasonable.
I would appeal to pragmatism here as forcing everyone to deal with int|string when reality is 99.9999999% of users will ever see an int, seems absurd.
This appeal should be targeted at the maintainers of the underlying drivers first of all. BTW @kamil-tekiela contributed some changes that allowed to simplify the DBAL code for mysqli recently.
OK, then if you rather not solve anything, I would suggest leaving the php types as they are now, broken or not that they may be in edge cases, and switch to a simple @return int instead of int|string. That would not require any code change. It also reflects reality because drivers will return an int or throw a TypeError right now (as of 3.3.x code state I mean) in pathologic cases.
Removing the int<0, max> part makes it simpler due to drivers returning -1 and the fact that these driver methods are not properly typed in psalm/phpstan yet either..
If you agree I am happy to prepare a new PR (or @simPod updates this one here?). It'd be risk free in terms of code, but improves the type situation for API consumers which I believe was @simPod's intent as well.
There hasn't been any activity on this pull request in the past 90 days, so it has been marked as stale and it will be closed automatically if no further activity occurs in the next 7 days. If you want to continue working on it, please leave a comment.
This pull request was closed due to inactivity.
@doctrine what do you think about @Seldaek' comment
Yeah this should be reopened IMO. @morozov if you can let us know what you think about https://github.com/doctrine/dbal/pull/5317#issuecomment-1079180493 I'll happily send a PR.
I'll be fine with whatever changes that somebody else from the maintainers approves. After skimming over the thread, I don't want to work on this change.
Closing due to the lack of progress.
What is your feedback on https://github.com/doctrine/dbal/pull/5334 then?
I don't have any feedback on in, it's closed.