oracle2postgres
Faster method for migrating data in batches
The current approach (`_copy_data`) for loading data from the source database relies on LIMIT + OFFSET, which is slow. Replace it with a faster approach, perhaps using `yield_per` or a window function, e.g.:
```python
for data in source_session.query(table).yield_per(batchsize):
    _insert_data(target_session, source_schema, table, data)
```
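The window-function route from the links below pages on an indexed key rather than an offset. A minimal sketch in the same spirit using keyset pagination, assuming `table` is a reflected SQLAlchemy `Table` with a single-column primary key and `_insert_data` keeps its current signature (`_copy_data_keyset` is a hypothetical name):

```python
from sqlalchemy import asc

def _copy_data_keyset(source_session, target_session, source_schema, table,
                      batchsize=10000):
    # Hypothetical replacement for _copy_data. Assumes a single-column
    # primary key; a composite key needs a tuple comparison instead.
    pk = list(table.primary_key.columns)[0]
    last_seen = None
    while True:
        query = source_session.query(table).order_by(asc(pk))
        if last_seen is not None:
            # Seek past the last key instead of using OFFSET: the source
            # does an index range scan, so each batch costs the same no
            # matter how deep into the table we are.
            query = query.filter(pk > last_seen)
        rows = query.limit(batchsize).all()
        if not rows:
            break
        for data in rows:
            _insert_data(target_session, source_schema, table, data)
        last_seen = getattr(rows[-1], pk.name)
```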
See:
- https://stackoverflow.com/questions/7389759/memory-efficient-built-in-sqlalchemy-iterator-generator
- https://stackoverflow.com/questions/1145905/sqlalchemy-scan-huge-tables-using-orm
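If the project can require SQLAlchemy 1.4+, the result object can also hand back ready-made batches via `Result.partitions()` — a sketch under the same assumptions about `table` and `_insert_data`:

```python
from sqlalchemy import select

def _copy_data_partitions(source_session, target_session, source_schema, table,
                          batchsize=10000):
    # Hypothetical replacement for _copy_data, SQLAlchemy 1.4+ only.
    # stream_results asks the DBAPI to avoid buffering the full result
    # set where it supports that; partitions() then regroups the stream
    # into lists of at most `batchsize` rows.
    result = source_session.execute(
        select(table).execution_options(stream_results=True)
    )
    for batch in result.partitions(batchsize):
        for data in batch:
            _insert_data(target_session, source_schema, table, data)
```

Either way the source table is read once, front to back, so per-batch cost stays flat instead of growing with the offset.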