Move some common database import subroutines to `Script::Utils`

Open mwiencek opened this issue 11 months ago • 0 comments

Problem

  • We have some identical or very similar-looking subroutines duplicated between admin/MBImport.pl and admin/replication/ImportReplicationChanges: empty (to tell if a table is empty) and ImportTable (to import a table from a file). The two implementations of the latter function differ slightly.

  • For the incremental dumps (MusicBrainz::Script::Role::IncrementalDump), I'd like to reuse the ImportTable function to import a dbmirror2 packet into some temporary tables (to parse it).

    We currently parse the old packets by hand. I'm surprised it works at all (it probably doesn't in some cases), because I don't believe it fully or properly implements parsing of PostgreSQL's COPY text format. dbmirror2 packets contain JSON, which may contain JSON escape sequences underneath COPY's text format escape sequences, and the mixing of these can be tricky to parse. It's much, much easier to let PostgreSQL do the parsing!
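To illustrate the layering problem described above, here is a minimal Python sketch (not code from this PR, and not how the server parses packets). The helper `copy_unescape` is hypothetical and covers only the single-character backslash escapes of COPY's text format; octal and hex escapes and the `\N` NULL marker are deliberately left out. It shows that the COPY layer has to be removed before the JSON layer, and that skipping it can silently produce a wrong value rather than an error.

```python
import json

# Single-character escapes of COPY's text format. Octal (\123) and hex (\x41)
# escapes and the \N NULL marker are NOT handled in this sketch.
_COPY_ESCAPES = {"b": "\b", "f": "\f", "n": "\n", "r": "\r", "t": "\t", "v": "\v"}


def copy_unescape(field):
    """Hypothetical helper: undo COPY text-format escaping for one field."""
    out = []
    i = 0
    while i < len(field):
        ch = field[i]
        if ch == "\\" and i + 1 < len(field):
            nxt = field[i + 1]
            # Known escape letters map to control characters; any other
            # backslashed character stands for itself (so \\ becomes \).
            out.append(_COPY_ESCAPES.get(nxt, nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# A JSON document whose string value contains a real tab character.
# json.dumps writes the tab as the two-character JSON escape: backslash, t.
json_text = json.dumps({"name": "a\tb"})

# COPY's text format doubles every backslash, so the field as it would
# appear in a dump now carries two layers of escaping.
copy_field = json_text.replace("\\", "\\\\")

# Correct order: strip the COPY layer first, then parse the JSON layer.
decoded = json.loads(copy_unescape(copy_field))

# Wrong order: parsing the raw field as JSON succeeds, but the value is a
# literal backslash followed by "t" instead of a tab.
naive = json.loads(copy_field)
```

Here `decoded["name"]` contains a tab, while `naive["name"]` contains a backslash and the letter `t`. A hand-written parser has to get this ordering right for every escape form, which is the work that importing the packet with PostgreSQL's own COPY avoids.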

Solution

This just adds two shared subroutines, is_table_empty and copy_table_from_file, to Script::Utils, and modifies admin/MBImport.pl and admin/replication/ImportReplicationChanges to use them.

Testing

We have existing automated tests that make heavy use of these scripts.

I did not test the --fix-broken-utf8 or --ignore-errors flags specifically.

mwiencek avatar Mar 08 '24 17:03 mwiencek