MySqlConnector icon indicating copy to clipboard operation
MySqlConnector copied to clipboard

Unexpected breaking change regarding `MySqlDataReader.RecordsAffected` and stored procedures

Open lauxjpn opened this issue 3 years ago • 14 comments
trafficstars

Stored procedures that should have returned -1 now return 0 in MySqlConnector 2.1.6. Version 2.1.2 is the last version that still has the expected behavior.

This issue is failing 6 of our SqlExecutorMySqlTest tests:

SqlExecutorTestBase.Executes_stored_procedure
SqlExecutorTestBase.Executes_stored_procedure_async
SqlExecutorTestBase.Executes_stored_procedure_with_generated_parameter
SqlExecutorTestBase.Executes_stored_procedure_with_generated_parameter_async
SqlExecutorTestBase.Executes_stored_procedure_with_parameter
SqlExecutorTestBase.Executes_stored_procedure_with_parameter_async

The query we execute is a simple: CALL `Ten Most Expensive Products`() in SqlExecutorTestBase.Executes_stored_procedure.

Looks like MySqlDataReader.NextResultAsync() in MySqlCommand.ExecuteNonQueryAsync() ultimately changes RecordsAffected from -1 to 0.

lauxjpn avatar Feb 13 '22 23:02 lauxjpn

This is likely related to the fix for https://github.com/mysql-net/MySqlConnector/issues/1096 (possibly modified by https://github.com/mysql-net/MySqlConnector/issues/1122).

bgrainger avatar Feb 14 '22 01:02 bgrainger

https://dev.mysql.com/doc/internals/en/multi-resultset.html

The trailing OK_Packet is the response to the CALL statement and contains the affected-rows count of the last statement. In our case we inserted 2 rows, but only the affected_rows of the last INSERT statement is returned as part of the OK_Packet. If the last statement is a SELECT, the affected-rows count is 0.

It's that last piece that is tripping us up. If the sproc ended with an INSERT, its affected-rows count would be 1 (or greater) and that would be reported correctly. I'm struggling to think of a good way to distinguish a sproc that ends in UPDATE (which might report 0 rows) from a sproc that ends in SELECT and will always report 0.

bgrainger avatar Feb 14 '22 19:02 bgrainger

Here's a concrete example that demonstrates the issue.

SQL:

drop table if exists test;
create table test(data int);
insert into test(data) values(1);
drop procedure if exists test_proc;
create procedure test_proc()
begin
  select data from test;
end;

C#:

command.CommandText = @"select data from test;
update test set data = 2 where data = 3;";
var retval1 = command.ExecuteNonQuery(); // expect 0 (no rows updated)

command.CommandText = "call test_proc();";
var retval2 = command.ExecuteNonQuery(); // expect -1 (for SELECT)

Here are the returned packets for each query, which are almost exactly the same in both cases:

Packet Contents Remarks
1 Field count = 1 number of fields being returned
2 data metadata table, field names, data types, etc.
3 1 first row of SELECT
4 EOF has "more results" flag set
5 OK 0 rows affected; for compound SQL statement, also contains human-readable "Rows matched" message

For the compound SQL statement, the final "OK" packet is the result of executing the second statement, and contains the number of affected rows (in this case, 0). For the CALL statement, the final "OK" packet is just part of the MySQL text protocol for executing a sproc.

@lauxjpn I can't see a way (when parsing the protocol on the wire) to tell these scenarios apart and return 0 in one case but -1 in the other.

(We could put in a hack such as "ignore OK with 0 rows affected immediately after a result set: it could be the results of calling a sproc, and anyone who cares deeply about the results of ExecuteNonQuery could be warned in the documentation not to execute SELECT ...; UPDATE ... as a single command if they need to know the number of affected rows". However, that could really be an unexpected breaking change for some users.)

bgrainger avatar Feb 15 '22 05:02 bgrainger

The situation's actually a little worse than that. Consider these two different sproc bodies:

create procedure test_proc1()
begin
  select data from test;
end;
create procedure test_proc2()
begin
  select data from test;
  update test set data = 2 where data = 3;
end;

Assuming the UPDATE actually updates zero rows, executing the CALL test_proc() command will cause the exact same packets to be returned on the wire in both cases. There's no way to distinguish a sproc that ends in a SELECT from a sproc that ends in an UPDATE or DELETE that affects 0 rows.

The answer might have to be that due to limitations in the MySQL protocol, it's impossible for ExecuteNonQuery to return -1 when executing a sproc that ends with a SELECT.

@lauxjpn If that were the final answer, would this break anything in Pomelo, or would you just have to override those base class test cases to expect a different result for MySQL? That is, does any code depend on ExecuteNonQuery returning -1 here?

bgrainger avatar Feb 15 '22 06:02 bgrainger

If that were the final answer, would this break anything in Pomelo, or would you just have to override those base class test cases to expect a different result for MySQL? That is, does any code depend on ExecuteNonQuery returning -1 here?

I think that would generally not be a problem. The docs don't really specify anything in regards to sprocs in particular:

The number of rows changed, inserted, or deleted. -1 for SELECT statements; 0 if no rows were affected or the statement failed.

The only issue I see here is, that this is currently a breaking change without a major version number increment. If #1096 depends on this behavior, I think the previous (erroneous in regards to #1096) behavior should be the default until the next major release (the fix for #1096 could still explicitly be enabled with a flag or something).

lauxjpn avatar Feb 18 '22 13:02 lauxjpn

Every bug fix is a breaking change for people who were depending on the buggy behaviour. 😀 (See also https://xkcd.com/1172/.)

(Just looking at some of the most recent changes to MySqlConnector, I can easily imagine people who were relying on the insecure TLS downgrade behaviour in order to connect, code that assumed MySqlParameter.Size = 0, or that an exception had a particular substring in its .Message. That doesn't mean those bugs shouldn't have been fixed in a patch release.)

I don't believe this particular case merits a return (by default) to the buggy calculation, with the fix not coming until v3.0.

bgrainger avatar Feb 18 '22 13:02 bgrainger

Every bug fix is a breaking change for people who were depending on the buggy behaviour. 😀

That is true. But the primary issue I see here are not the users that are depending on the previously buggy behavior in regards to UPDATE commands, but the ones that have been using the non-buggy behavior of the sproc SELECT commands. As far as I understand, the fix for the UPDATE issue breaks the working sproc SELECT behavior.

I think that we shouldn't intentionally break existing functionality of working features with a minor or service release, but strive for SemVer2 compliance as much as possible. I think that it is fine to let users opt-in to breaking changes until the next major release.

The exception is of course, if releasing the fix as the default behavior is critical and therefore way more important than breaking existing and working code by users that were unaffected by the bug.

lauxjpn avatar Feb 20 '22 01:02 lauxjpn

but the ones that have been using the non-buggy behavior of the sproc SELECT commands

Right, I don't want to break anyone who has been using the correct return value of ExecuteNonQuery for stored procedures since v2.1.3.

No one should have been depending on ExecuteNonQuery returning -1 for a sproc that ended in a SELECT (including the EF tests) because:

  1. Wouldn't real-world code have called ExecuteReader anyway (to read the result set)?
  2. That was never documented as the correct behaviour.

Am I misunderstanding something?

bgrainger avatar Feb 20 '22 02:02 bgrainger

  1. Wouldn't real-world code have called ExecuteReader anyway (to read the result set)?

Probably yes, unless the code does not care about the final SELECT result (like the tests).


  1. That was never documented as the correct behaviour.

I did a quick check on how other databases seem to handle the sproc SELECT case in regards to the RecordsAffected return value in the same way as for ordinary SELECT statements. So you might want to check with the ADO.NET team in regards to that, since my personal interpretation of the docs might not be correct.

Anyway, for our case here this is unfortunately not relevant since the previous behavior just happened to return -1. Whatever the behavior was, independent of whether it lines up with the docs or not, we should not change the behavior for Pomelo inside of the same major release cycle (unless it is a bug) to keep SemVer2 compliance.


Right, I don't want to break anyone who has been using the correct return value of ExecuteNonQuery for stored procedures since v2.1.3.

While this is unfortunate, I think that unless something is critical, feature seniority should have priority in conflict situations like this. (Opting-in to the new behavior seems like a reasonable compromise to me thought.)


I think the standard and SemVer2 compliant solution to this issue here whould be to just branch off a 2.x version of MySqlConnector (or something similar like a service branch) so that users would have to make the concious decision to upgrade to a new major version. Pomelo 6.x would keep referencing the 2.x branch until Pomelo 7.x.

This is probably a good idea anyway, so that the current stable version of Pomelo is not holding back MySqlConnector's development and servicing versions of Pomelo can reference servicing versions of MySqlConnector to prevent unintended side effects.

What do you think?

lauxjpn avatar Feb 20 '22 02:02 lauxjpn

Firstly, I should probably state that I don't believe in strict SemVer. I think it's much more helpful to follow "loose" SemVer and increment the major version only when the change is expected to impact a lot of people. A 2.x to 3.x upgrade should happen when many users will be impacted and could need to update their code. A minor or patch version could still contain a breaking change (for users who depended on the old/bad/incorrect behaviour that was fixed) if it's expected that it would be few, if any users. (This isn't full "romantic versioning, but it's perhaps a little closer to that.)

Given that any change in observable behaviour is a breaking change, following strict SemVer would mean revving the major version very frequently, which I think is actually a disservice to most users, who might be concerned about a v2.x to v10.x upgrade, even though the eight "breaking changes" might be in incredibly niche areas. See also https://hynek.me/articles/semver-will-not-save-you/.

I understand that the Pomelo tests depended on -1, and that this change breaks the tests. However, I firmly believe that the previous behaviour was a bug that prevented the correct number of affected rows from being returned to users who did need that information. The fixed code now correctly reports the value of the OK payload that MySQL Server sends, which should be the expected behaviour by anyone using a MySQL library. (That's why I'm not yet persuaded by the idea of an option to control this: users shouldn't have to opt into bugfixes.)

I'm open to rolling back the change if you can show me how regular users of Pomelo (not just the tests) would be affected by this change. As you already seemed to agree, most real-world code (that uses sprocs) would be reading the full result set (if the sproc was a SELECT) and not using the DbDataReader.RecordsAffected property. (If they even had access to the MySqlDataReader; isn't that implementation detail hidden from most users?)

we should not change the behavior for Pomelo inside of the same major release cycle (unless it is a bug)

IMO it was a bug (that is now fixed); again, in what ways would the observable behaviour of Pomelo change, that would affect users?

I think the standard and SemVer2 compliant solution to this issue here whould be to just branch off a 2.x version of MySqlConnector (or something similar like a service branch) so that users would have to make the concious decision to upgrade to a new major version.

This is possible, but I can't see myself maintaining a branch to preserve buggy behaviour (which is I think what you're asking for). I could release a 2.1.8 that re-introduces the bug (and a 2.2.0 that fixes it again), but then Pomelo users would be stuck on 2.1.8 and likely to have no improvements from the 2.2.x series until Pomelo 7.x updates to it. That doesn't seem like a good thing for any consumers.

I want to do the right thing here for the users of both our libraries. If you absolutely can't update MySqlConnector in Pomelo 6.x until this behaviour changes, I could add an option (that Pomelo could set at startup?). But is this really a change that would affect any Pomelo end users?

bgrainger avatar Feb 20 '22 22:02 bgrainger

It is probably best to make sure first, that we are actually talking about the the same issue here, because we might not. So let's start from the root of the issue and move further out from there one step at a time:

From the point of view of this #1136 issue here that I opened, ~~a regression bug (~~ unexpected behavior change ~~)~~ was introduced after 2.1.2. For a specific SQL constellation in the code of an sproc, the MySqlDataReader.RecordsAffected property returned -1 until 2.1.2, but does not anymore after 2.1.2.

Do you agree with that assessment or not? (If you do, we can move one step outward, if you don't, we need to move one step inward.)

lauxjpn avatar Feb 20 '22 23:02 lauxjpn

For a specific SQL constellation in the code of an sproc, the MySqlDataReader.RecordsAffected property returned -1 until 2.1.2, but does not anymore after 2.1.2.

Yes, I agree.

(It's not a "regression bug" (which I would define as a change from correct to incorrect behaviour), it's a bug fix so that RecordsAffected matches the behaviour of MySQL Server, which as I've commented above is the only possible and reasonable thing to do. 😀)

Now, I have one for you...

Pomelo doesn't read the RecordsAffected property: https://github.com/PomeloFoundation/Pomelo.EntityFrameworkCore.MySql/search?q=RecordsAffected

Therefore, this change has no impact on the library or its users. Do you agree with that assessment?

bgrainger avatar Feb 21 '22 02:02 bgrainger

It's not a "regression bug" (which I would define as a change from correct to incorrect behaviour) [...]

That is fair.

Pomelo doesn't read the RecordsAffected property: https://github.com/PomeloFoundation/Pomelo.EntityFrameworkCore.MySql/search?q=RecordsAffected

That is correct.

Therefore, this change has no impact on the library or its users. Do you agree with that assessment?

That is not entirely correct. While there is no negative impact regarding Pomelo itself, there is in regards to users of Pomelo in the same way as there is for users of MySqlConnector.

Basically, whoever is checking RecordsAffected in their code against -1 will now get different behavior (and there is no warning or exception that leads to failure, so it will be hard to track down).


Okay, so we agree on the basic issue.

I am assuming we both agree that the reason for the unexpected change in behavior is the fix for #1096 (and possibly #1122) and that the fix for #1096 and fixing #1136 (by restoring the previous behavior) are mutually exclusive.

Now, for Pomelo 6.x, the next three important questions are:

  • Is #1096 an issue that has been existing in previous versions of MySqlConnector as well, or is it a recent one (basically, is it a regression bug or not)?
  • Is MySqlConnector significantly broken without the fix for #1096?
  • Is #1096 a deterministic or non-deterministic issue?

This will let us determine the priority between #1096 and #1136 in regards to SemVer2 from the view of Pomelo.

(For Pomelo 7.x, changing the behavior of RecordsAffected is absolutely fine, so no issue there.)

lauxjpn avatar Feb 21 '22 03:02 lauxjpn

#1096 has likely been a bug for a long time. (Which may indicate that "whoever is checking RecordsAffected in their code" is a very small group of people, otherwise they would have reported it already.)

MySqlConnector is not significantly broken without the fix for #1096, but there's no reason to believe that reverting the fix benefits many people (if any).

AFAIK #1096 is a deterministic issue.

In my estimation, the benefits of keeping the change still outweigh the potential cost of there being a breaking change for someone who's checking RecordsAffected and expecting it to be -1 for a sproc. (This does not seem like a common scenario to me.) It also doesn't meet my personal bar for being considered a significant breaking change that requires a major version bump; i.e., I'm not planning to re-release 2.1.3 as 3.0.0.

I do hope that Pomelo 6.x can use MySqlConnector >= 2.1.3 and that if there is a major breaking change in MySqlConnector (i.e., v3.0.0) we can align that release with Pomelo 7.x.

bgrainger avatar Mar 05 '22 18:03 bgrainger

I think we didn't really reach a resolution here, other than Pomelo 6.x will use v2.1.2 (before this change) and Pomelo 7.x will use 2.2.0 (which includes it).

bgrainger avatar Nov 12 '22 18:11 bgrainger