Plan
Plan copied to clipboard
Merging databases
I would like to be able to..
Merge multiple databases so that the data from the databases are combined.
Is your feature request related to a problem? Please describe.
It is currently only possible to merge separate databases into one manually and with a lot of effort. A command that could automate the process would be invaluable for combining old servers with a lot of analytics data.
This is a complex feature to implement due to the nature of SQL. There are bound to be multiple conflicting rows. It might be possible for smaller databases to be merged manually, but larger ones with hundreds of thousands of rows would be near impossible. Automating this process could also lead to fewer mistakes when merging.
I am not very knowledgeable in SQL, so I can't tell if this is even possible, at least with the current database structure. It could be that this is only possible with a change to the database structure.
It is certainly possible, but the drawback and bottleneck is that duplicate detection needs to happen in memory, and everything being moved needs to be fetched into memory.
The datasets also need to be copied in batches to avoid server crash.
I figure algorithm to do this would be:
- Lock regular data input to both databases to avoid changes during move
- Take the rows that are crucial for ID purposes, create id lookup tables (stored in source database to optimize queries) and copy that data over
- Copy rest of the data with the help of lookup tables
- Cleanup lookup tables
The other thing to solve is combining two mysql databases since only one config exists for address input