ora2pg icon indicating copy to clipboard operation
ora2pg copied to clipboard

Parallel Data Load with Table Partitions is Not Working [Range]

Open dhruv-dat opened this issue 4 years ago • 1 comments

Hi @darold

Hope you and your loved one's are safe.

We have been looking at enabling parallelism at partition level in Ora2Pg data load directly to PostgreSQL. This feature doesn't seem to be working though. Can you please let us know if this feature is currently available or this needs some work to be done?

Sample table with range partition that we have used - CREATE TABLE hr.testtable1 PARTITION BY RANGE (empl_id) ( PARTITION P1 VALUES LESS THAN (2500000), PARTITION P2 VALUES LESS THAN (5000000), PARTITION P3 VALUES LESS THAN (7500000), PARTITION P4 VALUES LESS THAN (MAXVALUE) ) AS ( SELECT LEVEL empl_id, MOD (ROWNUM, 50000) dept_id, Trunc (dbms_random.Value (1000, 500000), 2) salary, Decode (Round (dbms_random.Value (1, 2)), 1, 'M', 2, 'F') gender, To_date (Round (dbms_random.Value (1, 28)) || '-' || Round (dbms_random.Value (1, 12)) || '-' || Round (dbms_random.Value (1900, 2010)), 'DD-MM-YYYY') dob, dbms_random.String ('x', dbms_random.Value (20, 50)) address, LEVEL empl_id1, MOD (ROWNUM, 50000) dept_id1, Trunc (dbms_random.Value (1000, 500000), 2) salary1, Decode (Round (dbms_random.Value (1, 2)), 1, 'M', 2, 'F') gender1, To_date (Round (dbms_random.Value (1, 28)) || '-' || Round (dbms_random.Value (1, 12)) || '-' || Round (dbms_random.Value (1900, 2010)), 'DD-MM-YYYY') dob1, dbms_random.String ('x', dbms_random.Value (20, 50)) address1, LEVEL empl_id2, MOD (ROWNUM, 50000) dept_id2, Trunc (dbms_random.Value (1000, 500000), 2) salary2, Decode (Round (dbms_random.Value (1, 2)), 1, 'M', 2, 'F') gender2, To_date (Round (dbms_random.Value (1, 28)) || '-' || Round (dbms_random.Value (1, 12)) || '-' || Round (dbms_random.Value (1900, 2010)), 'DD-MM-YYYY') dob2, dbms_random.String ('x', dbms_random.Value (20, 50)) address2, LEVEL empl_id3, MOD (ROWNUM, 50000) dept_id3, Trunc (dbms_random.Value (1000, 500000), 2) salary3, Decode (Round (dbms_random.Value (1, 2)), 1, 'M', 2, 'F') gender3, To_date (Round (dbms_random.Value (1, 28)) || '-' || Round (dbms_random.Value (1, 12)) || '-' || Round (dbms_random.Value (1900, 2010)), 'DD-MM-YYYY') dob3, dbms_random.String ('x', dbms_random.Value (20, 50)) address3 FROM dual CONNECT BY LEVEL < 10000000);

We have been using the following Ora2Pg command, ours is 20 vCore machine with 130 GB Memory. - ora2pg -c /etc/ora2pg/ora2pg.conf -t COPY --namespace HR -a 'TABLE[testtable]' -J 5 -j 3 -L 25000 -d --oracle_speed --ora2pg_speed

As you can see the following, only one partition is loading at a time. Can we do something to invoke them all in parallel? [========================>] 1/1 tables (100.0%) end of scanning. [======> ] 25000/99999 rows (25.0%) Table P3 (3 sec., 8333 recs/sec) [======> ] 25000/99999 rows (25.0%) Table P2 (2 sec., 12500 recs/sec) [======> ] 25000/99999 rows (25.0%) Table P1 (2 sec., 12500 recs/sec) [=====> ] 24999/99999 rows (25.0%) Table P4 (3 sec., 8333 recs/sec) [========================>] 99999/99999 rows (100.0%) on total estimated data (10 sec., avg: 9999 tuples/sec)

Could you also please let us know some best performance indicators that would help migrate relatively large volumes of data.

Thanks

dhruv-dat avatar Aug 21 '20 06:08 dhruv-dat

Unfortunately I do not have a lot of spare time in 2019-2020, I will have some holidays in September but I'm not sure that this topic will be in top priority.

darold avatar Aug 21 '20 06:08 darold