FrameworkBenchmarks
Loosen Update verification
Since the update uses random values, perhaps lower the threshold a little (e.g. to 95%).
For example, the last TFB run failed Rails for updating too few rows (1 short, i.e. 20479/20480 ≈ 99.995%): https://tfb-status.techempower.com/unzip/results.2020-07-12-19-02-35-234.zip/results/20200708175354/rails/update/verification.txt
--------------------------------------------------------------------------------
VERIFYING UPDATE
--------------------------------------------------------------------------------
FAIL for http://10.0.0.1:8080/update?queries=20
Only 20479 executed queries in the database out of roughly 20480 expected.
See https://github.com/TechEmpower/FrameworkBenchmarks/wiki/Project-Information-Framework-Tests-Overview#specific-test-requirements
PASS for http://10.0.0.1:8080/update?queries=20
Rows read: 10240/10240
See https://github.com/TechEmpower/FrameworkBenchmarks/wiki/Project-Information-Framework-Tests-Overview#specific-test-requirements
PASS for http://10.0.0.1:8080/update?queries=20
Rows updated: 10237/10240
See https://github.com/TechEmpower/FrameworkBenchmarks/wiki/Project-Information-Framework-Tests-Overview#specific-test-requirements
Hi @benaadams,
We're currently reviewing the verification steps for all database tests.
For the record, this looks like a failure to execute all the required queries:
FAIL: Only 20479 executed queries in the database out of roughly 20480 expected.
But the updating of rows does allow for a margin of error:
PASS: Rows updated: 10237/10240
This may be a bug in the MySQL implementation.
I have checked the code and we do have a margin of 1.5% for the solo-update check (the last PASS), but it is not applied to the first check, where both SELECT and UPDATE statements are counted.
What this verification output suggests to me is that either an update or a select was not executed (or not counted by MySQL). We can confirm that at least all 10240 SELECT statements were executed, as confirmed by the second PASS, and that check should not have the margin applied to it, but currently it does.
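To make the asymmetry concrete, here is a minimal sketch (hypothetical, not the actual toolset code) of how the three checks above behave, using the numbers from the Rails verification output quoted earlier:

def within_margin(actual, expected, margin):
    # True if `actual` falls no more than `margin` below `expected`.
    return actual >= expected * (1.0 - margin)

# First check: combined SELECT + UPDATE statement count, no margin applied.
print(within_margin(20479, 20480, margin=0.0))    # False -> the FAIL above
# Second check: rows read; arguably should be exact, but gets the margin today.
print(within_margin(10240, 10240, margin=0.015))  # True  -> PASS
# Third check: rows updated, with the 1.5% margin applied.
print(within_margin(10237, 10240, margin=0.015))  # True  -> PASS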
Ignoring for the moment the third PASS's counts, what I don't understand is how the first FAIL has the numbers it does. I tested updating the same row with the same value, and saw that Show global status where Variable_name = 'Com_update'; does indeed increment, so it isn't that a double-update occurred.
Why, then, would we give a margin at all?
So far, for each discrepancy found in the number of queries, there has been a solution on the framework side. With Rails, it seems that there is no update performed if the object properties are unchanged. By making sure to generate a randomNumber different from the one on the loaded instance, I have no more problems.
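For illustration, here is a minimal sketch (in Python; the names are mine, not taken from the Rails implementation) of the regenerate-until-different workaround described above:

import random

def fresh_random_number(current):
    # Draw until the new value differs from the one already on the row,
    # so the ORM cannot treat the UPDATE as a no-op and skip it.
    new = random.randint(1, 10000)
    while new == current:
        new = random.randint(1, 10000)
    return new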
@jcheron would you like to make a PR?
It causes the composite scores to fail whenever Rails fails (as Rails is one of the frameworks included in the composite scores).
There is also a bug in the toolset.
Currently, the MySQL database verifier uses the following query to test that the correct number of SELECT and UPDATE calls were made:
Show global status where Variable_name in ('Com_select','Com_update');
However, this seems to run into the issue that we are seeing here. If you issue the following two queries:
update world set randomnumber=15 where id = 55;
update world set randomnumber=15 where id = 55;
then the result of the Show global... query will show an increment of 1. It seems that MySQL has some optimization where it checks whether the write would result in a no-op and drops that statement.
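A quick way to reproduce that observation outside the toolset; this is a hedged sketch using the pymysql driver, with placeholder connection details:

import pymysql

conn = pymysql.connect(host="10.0.0.1", user="benchmarkdbuser",
                       password="benchmarkdbpass", database="hello_world")

def com_update(cur):
    # Read the global counter of executed UPDATE statements.
    cur.execute("SHOW GLOBAL STATUS WHERE Variable_name = 'Com_update'")
    return int(cur.fetchone()[1])

with conn.cursor() as cur:
    before = com_update(cur)
    cur.execute("update world set randomnumber=15 where id = 55")
    cur.execute("update world set randomnumber=15 where id = 55")
    conn.commit()
    # The comment above reports an increment of 1 here rather than 2.
    print(com_update(cur) - before)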
I changed the query to the following to test the same update scenario:
SELECT variable_name, variable_value from PERFORMANCE_SCHEMA.SESSION_STATUS where Variable_name = 'Innodb_rows_read';
and it seems that it works around that issue nicely. While the second update is still dropped as a no-op, it does a read instead of a write, so the number of queries is still correct and we can check equality without the margin.
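For completeness, a small helper in the same sketch style that reads the counter via the changed query; `cur` is assumed to be a cursor on the connection that issued the updates:

def innodb_rows_read(cur):
    # Same query the verifier was changed to use; an UPDATE dropped as a
    # no-op write still performs the row read, so this counter keeps
    # advancing by the expected amount per query.
    cur.execute("SELECT variable_name, variable_value "
                "FROM PERFORMANCE_SCHEMA.SESSION_STATUS "
                "WHERE variable_name = 'Innodb_rows_read'")
    return int(cur.fetchone()[1])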
EDIT:
> With Rails, it seems that there is no update performed if the object properties are unchanged.
Actually, I looked at the diff you linked, and this does appear to be working around the MySQL issue I describe here. It does not seem to be a Rails-specific issue; rather, luck of the draw. One could work around it in code, but we don't impose that rule for any of the tests, and what's more, Postgres and MongoDB don't have this additional work imposed upon them either.
That's strange. In the Rails code, there is no more reason for the same query to be executed twice than in other frameworks (which don't have this problem).
On the other hand, I have already made this modification (choosing a different randomNumber) in the past, for Act, Symfony, aspcore...
I am not saying that the Rails implementation is wrong, but it is doing more work: it's doing N more loops and checking equality that many times before doing the work we want to measure. So it's not zero effort, whereas the implementation for Postgres does not require it: if you write the same value, Postgres either drops it or does the write, but it increments the count either way, so you don't have to do the extra work.
That's why my PR changes how we count the queries to use a mechanism more like MySQL's: if you actually executed the update statement on your DB driver and it ended up as a no-op as an optimization, great... it's still counted now. You did the work we asked you to do and should be benchmarked.
Aaaaaaaaaaaand as soon as I hit the 'comment' button, @nbrady-techempower tells me that my PR fixed nothing. So, I'm back to the drawing board.
The Rails failure was as much about PostgreSQL as it was about MySQL...
It's true that this random-number requirement generates a little extra effort. Nothing prevents adding to the specifications that the modified randomNumber must be different from the original one; that way all frameworks are on an equal footing.
On the other hand, the current specifications do state that a certain number of selects and updates must be performed, and in this case Rails does not respect this clause.
> On the other hand, the current specifications do state that a certain number of selects and updates must be performed, and in this case Rails does not respect this clause.
I'm missing something here - are you suggesting that the ORM Rails is using is programmatically avoiding queries to the database that it believes to be a no-op? Like "oh, you're trying to set the randomnumber to 15; it is 15... I'm going to do nothing."
If that's the case, then yeah, Rails will need your PR merged.
> I'm missing something here - are you suggesting that the ORM Rails is using is programmatically avoiding queries to the database that it believes to be a no-op? Like "oh, you're trying to set the randomnumber to 15; it is 15... I'm going to do nothing."
Yes, I believe it is. Hibernate and Doctrine do the same thing.
Careful, when you start making the frameworks talk like they're human ;-) maybe the day has been too long.
Well, that's both disappointing and alarming. There are a few things at play, it seems.
First, I think @nbrady-techempower and I need to figure out the "queries made to the database" count issue. I think my PR fixes it, but we need Travis to do the legwork.
Second, it sounds like we need another verification for this specific edge case. Something like: count the number of queries, select any single row, update its value to the same value, count the number of queries again, and verify that the count incremented the expected number of times. This would catch the ORMs that are optimizing out no-op writes (see the sketch after this list).
Third, implement the sort of fix in your #5882 across the other implementations that have this ORM optimization.
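A rough sketch of what that second verification might look like, assuming a harness with direct access to both the framework endpoint and the database (the URL, credentials, and counter choice are placeholders; as noted just below, forcing the "same value" case in practice is the hard part):

import pymysql
import requests

conn = pymysql.connect(host="tfb-database", user="benchmarkdbuser",
                       password="benchmarkdbpass", database="hello_world")

def com_update(cur):
    cur.execute("SHOW GLOBAL STATUS WHERE Variable_name = 'Com_update'")
    return int(cur.fetchone()[1])

with conn.cursor() as cur:
    before = com_update(cur)
    # Exercise the framework's update endpoint once.
    requests.get("http://tfb-server:8080/update?queries=20")
    after = com_update(cur)
    # An ORM that silently skips writes it considers no-ops would fall
    # short of the 20 expected UPDATE statements here.
    print(after - before)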
I think that all ORMs that perform this optimization have had failures and have corrected their code. The current query check flags this problem almost systematically (which is surprising, since the assignment of the new randomNumber is random, though only a computational pseudo-random one).
Agreed... having given this more than 0.1sec of thought now, you're right. I'm not sure how I would verify this in practice.
In the same spirit, there is also the case of selecting an instance already loaded in the same HTTP request (since the choice of rows to load from the database is random). Some ORMs do not load the same object twice and had to adapt their code.
It would be possible to check these two cases, as you suggested earlier in this thread:
- Loading the same instance twice in one HTTP request
- Updating an instance whose properties have not been modified
but in that case, each framework would have to implement methods to do so (methods then invoked by the toolset), and I am not sure what good it would do.
[edit] It would be work for everyone, for very little gain. The current verification does not report these cases explicitly, but it does not let them pass either, since it generates failures for missing queries. Considering how marginal these cases are, that's not bad. [/edit]
We (the linq2db ORM) generate the following query for the update:
UPDATE
`world` `w`
INNER JOIN (
SELECT 2887 AS `id`, 4610 AS `randomnumber` FROM DUAL
UNION ALL
SELECT 1935, 1451 FROM DUAL
UNION ALL
SELECT 6052, 1018 FROM DUAL
UNION ALL
SELECT 55, 6675 FROM DUAL
UNION ALL
SELECT 8702, 7253 FROM DUAL
UNION ALL
SELECT 3268, 6379 FROM DUAL
UNION ALL
SELECT 202, 6208 FROM DUAL
UNION ALL
SELECT 1996, 3261 FROM DUAL
UNION ALL
SELECT 9145, 5959 FROM DUAL
UNION ALL
SELECT 7619, 7591 FROM DUAL
UNION ALL
SELECT 648, 1051 FROM DUAL
UNION ALL
SELECT 3579, 9208 FROM DUAL
UNION ALL
SELECT 3443, 564 FROM DUAL
UNION ALL
SELECT 2212, 549 FROM DUAL
UNION ALL
SELECT 6592, 1958 FROM DUAL
UNION ALL
SELECT 7168, 2634 FROM DUAL
UNION ALL
SELECT 7084, 777 FROM DUAL
UNION ALL
SELECT 5644, 6004 FROM DUAL
UNION ALL
SELECT 8504, 7750 FROM DUAL
UNION ALL
SELECT 555, 8902 FROM DUAL) `r` ON `w`.`id` = `r`.`id`
SET
`w`.`randomnumber` = `r`.`randomnumber`
which results in a failed test:
FAIL for http://tfb-server:8080/mvc/updates/linq2db?queries=20
Only 10395 executed queries in the database out of roughly 20480 expected.
See https://github.com/TechEmpower/FrameworkBenchmarks/wiki/Project-Information-Framework-Tests-Overview#specific-test-requirements
WARN for http://tfb-server:8080/mvc/updates/linq2db?queries=20
5120013 rows read in the database instead of 10240 expected. This number is excessively high.
See https://github.com/TechEmpower/FrameworkBenchmarks/wiki/Project-Information-Framework-Tests-Overview#specific-test-requirements
PASS for http://tfb-server:8080/mvc/updates/linq2db?queries=20
Rows updated: 10225/10240
See https://github.com/TechEmpower/FrameworkBenchmarks/wiki/Project-Information-Framework-Tests-Overview#specific-test-requirements
{'Date': 'Sat, 28 May 2022 11:38:26 GMT', 'Transfer-Encoding': 'chunked', 'Content-Type': 'application/json; charset=utf-8', 'Server': 'Kestrel'}
[{"id":2667,"randomNumber":1665}]
One issue I can identify is that Com_update is used for stats gathering, but this query increments the Com_update_multi variable instead (by 1 per query). That is probably the reason for the FAIL, as the update queries are not counted.
Regarding the 5-million-rows WARN, it is not clear to me where it comes from, as I don't observe anything near that in the session status.
SELECT variable_name, variable_value from PERFORMANCE_SCHEMA.SESSION_STATUS where Variable_name = 'Innodb_rows_read';
SELECT r.variable_value-u.variable_value FROM
(SELECT variable_value FROM PERFORMANCE_SCHEMA.SESSION_STATUS where Variable_name like 'Innodb_rows_read') r,
(SELECT variable_value FROM PERFORMANCE_SCHEMA.SESSION_STATUS where Variable_name like 'Innodb_rows_updated') u
Both show the counter advancing by +40 records.