Fix memory allocation issue in transform
The transform module adds the COMMIT LSN information to every transaction, and for that it keeps the whole current transaction in memory. When given a large transaction as input (many large rows), we can see the OOM killer being triggered on the process. To avoid that problem, we could write large transactions to their own `${xid}.sql` files and implement an `-- INCLUDE` mechanism in our pgcopydb SQL format. The INCLUDE statement would only be added to the main SQL contents once the transaction-specific SQL file is available on disk.
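To illustrate, here is a hypothetical sketch of what the main SQL stream and the spilled per-transaction file could look like. The exact directive syntax, file naming, and metadata comments are assumptions for the sake of the example, not a settled format:

```sql
-- main SQL stream (hypothetical sketch of the proposed format)
BEGIN; -- xid 1234
-- INCLUDE 1234.sql
COMMIT; -- xid 1234, commit lsn 0/24F0B10

-- contents of 1234.sql, written to disk before the INCLUDE line is emitted:
INSERT INTO foo VALUES (1, repeat('a', 1));
INSERT INTO foo VALUES (2, repeat('a', 2));
-- ... many more large rows
```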
This approach should help solve both the memory allocation problems and the PIPE synchronization deadlocks where the PIPE reader is blocked waiting for a COMMIT LSN we don't have yet.
The implementation should include a heuristic that determines whether a transaction is large enough to warrant this special treatment. It could be based on the number of rows and/or the total byte size, with limits such as 64 rows and/or 128MB.
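A minimal sketch of such a heuristic, assuming a hypothetical `TxSizeStats` counter that the transform module would maintain per transaction (these names and thresholds are illustrative, not pgcopydb code):

```c
#include <stdbool.h>
#include <stdint.h>

/* limits suggested above: 64 rows and/or 128MB */
#define LARGE_TX_MAX_ROWS  64
#define LARGE_TX_MAX_BYTES (128ULL * 1024 * 1024)

typedef struct TxSizeStats
{
	uint64_t rowCount;   /* statements buffered so far */
	uint64_t byteCount;  /* total bytes buffered so far */
} TxSizeStats;

/* returns true when the transaction should be spilled to its own ${xid}.sql */
static bool
shouldSpillTransaction(TxSizeStats *stats)
{
	return stats->rowCount > LARGE_TX_MAX_ROWS ||
		   stats->byteCount > LARGE_TX_MAX_BYTES;
}
```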
Here is an example of a SQL statement to run on the source database that triggers the memory allocation problems in the docker-compose environment:
```sql
insert into foo select x, repeat('a', x) from generate_series(1, 60000) as t(x);
```
This new approach might interfere with https://github.com/dimitri/pgcopydb/pull/505 in that the per-transaction SQL file would then be produced by the transform stage; the apply parts would only need to implement parsing for the new INCLUDE command and then replay from the file.
@dimitri I'm wondering whether this would also require changing the way we read files in other places, right? For example, the code below reads the entire file content into memory to transform it.
https://github.com/dimitri/pgcopydb/blob/dfa87c2b6fad358a1fff254a1df56d2f95422bad/src/bin/pgcopydb/ld_transform.c#L826-L830
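As a rough illustration of the alternative, here is a minimal sketch of a line-by-line transform loop that avoids holding the whole file in memory. The `transformLine` callback is an assumption for the example and does not exist in pgcopydb:

```c
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <sys/types.h>

/*
 * Read the input file one line at a time and hand each line to a
 * caller-provided transform callback, instead of buffering the whole
 * file contents in memory first.
 */
static bool
transformFileByLine(const char *filename,
					bool (*transformLine)(const char *line, void *ctx),
					void *ctx)
{
	FILE *in = fopen(filename, "r");

	if (in == NULL)
	{
		return false;
	}

	char *line = NULL;
	size_t len = 0;
	ssize_t bytes;

	/* getline() (POSIX) allocates and grows the line buffer as needed */
	while ((bytes = getline(&line, &len, in)) != -1)
	{
		if (!(*transformLine)(line, ctx))
		{
			free(line);
			fclose(in);
			return false;
		}
	}

	free(line);
	fclose(in);
	return true;
}
```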