owid-grapher icon indicating copy to clipboard operation
owid-grapher copied to clipboard

refactor(db): use mysql2 as a db driver

Open marcelgerber opened this issue 9 months ago • 5 comments

Let's use mysql2 as a DB driver within knex.

Advantages:

  • It's automatically caching prepared statements, which should make some things a bit faster
  • It supports passing JSON into queries (without the need for JSON.stringify), and will automatically convert any JSON columns into a JS object (without the need for JSON.parse)
    • The latter I have disabled currently, because we have many code paths that expect the result as a JSON string. We can enable it in the future.

TODOs

  • [ ] Test this some more
    • Especially the "other" interactions with the DB, e.g. typeorm migrations and wpDb (?)
  • [ ] Run a full bake on master, then on this branch, and check the diff

marcelgerber avatar May 02 '24 10:05 marcelgerber

Quick links (staging server):

Site Admin Wizard

Login: ssh owid@staging-site-db-mysql2

  • Site-screenshots: https://github.com/owid/site-screenshots/compare/db-mysql2
  • SVG tester: https://github.com/owid/owid-grapher-svgs/compare/db-mysql2
SVG tester:

Number of differences (default views): 0 Number of differences (all views): 0

Edited: 2024-05-16 14:54:22 UTC Execution time: 1.37 seconds

owidbot avatar May 02 '24 10:05 owidbot

Nice! I think in theory there could be cases where perf is a bit worse because the json parsing and serializing cost is paid a bit more often. I'm almost sure that this is not going to be noticable, just want to mention it as something to keep in the back of your mind.

It might be good to do a bake with master and then with the same db content another bake from this branch and make sure that they are identical - just to make sure that we don't run into any discrepancies between the two libraries.

The automatic json decoding looks nice and it could simplify our code quite a bit if we wouldn't have to have the raw/enriched distinction in so many places.

Let me know if you want me to test this as well or if you want to chat about anything related to this next week!

danyx23 avatar May 02 '24 14:05 danyx23

Just wanted to mention here that currently, we don't do any unnecessary JSON parses or serializations - since we have the typeCast present, which overrides the default behaviour of the driver.

The one other difference between these drivers I came across is:

One known incompatibility is that DECIMAL values are returned as strings whereas in Node MySQL they are returned as numbers. This includes the result of SUM() and AVG() functions when applied to INTEGER arguments. This is done deliberately to avoid loss of precision.

However, I didn't find any DECIMAL columns, nor any SUM() or AVG() calls, in our code.

marcelgerber avatar May 02 '24 14:05 marcelgerber

I think we could merge this after some more testing - the main question is how and when should we decide if we want to enable the automatic json conversion :). Maybe just note it somewhere for a potential cooldown project?

Should we test and ship this in the coming cooldown?

danyx23 avatar May 14 '24 09:05 danyx23

Yes, taking a look at this again in the cooldown sounds great!

marcelgerber avatar May 17 '24 08:05 marcelgerber

This PR has had no activity within the last two weeks. It is considered stale and will be closed in 3 days if no further activity is detected.

github-actions[bot] avatar Jun 01 '24 07:06 github-actions[bot]

I have now run a full bake on both master and this branch and compared them using Beyond Compare, and indeed the outputs of both bakes are the same.

marcelgerber avatar Jun 17 '24 16:06 marcelgerber

@danyx23 Do you want to test this some, too? I have run some more tests, and all's good from my side:

  • Bake on master and this branch doesn't have any diff, or errors
  • TypeORM migrations run and revert fine
  • All JSON fields received in knex & TypeORM arrive as JSON strings

marcelgerber avatar Jun 17 '24 16:06 marcelgerber