gpdb
gpload temporary table generated without considering order of distribution keys
Bug Report
When the gpload.py utility generates a temporary table for loading data, it uses the array of distribution keys returned by the get_table_dist_key method, which provides the list of distribution keys without considering their order. As a result, the temporary table and the target table can end up with different structures (the same distribution keys in a different order), and the update is not optimized: for example, it may scan all table partitions even when only a few of them are actually affected.
Expected behavior
The SQL query in the get_table_dist_key method has an ORDER BY clause:
order by position(concat(' ',a.attnum::text,' ') in concat(' ',p.distkey::text,' '));
Actual behavior
The query has no ORDER BY clause.
Steps to reproduce the behavior
You can use tables like the ones below to compare the current query with the new one:
create table public.test_table_check_dist_case1
(
    col1 int4,
    col2 int4,
    col3 int4,
    col4 text
)
distributed by (col2, col1, col3);

create table public.test_table_check_dist_case2
(
    col1 int4 default '0',
    col2 int4,
    col3 int4,
    col4 text
)
distributed by (col1, col2, col3);
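
To illustrate what the proposed ORDER BY clause does for the two tables above, here is a minimal Python sketch that mimics the position-based sort outside the database. It is not code from gpload.py: the function name order_by_distkey and the hard-coded column lists are hypothetical, and it assumes that p.distkey::text renders as space-separated attribute numbers (for example "2 1 3"). The padding with spaces mirrors the concat(' ', ..., ' ') calls in the clause, so that attribute number 1 cannot match inside a larger number such as 11.

```python
def order_by_distkey(columns, distkey):
    """Return the distribution key column names in their declared order.

    columns: list of (attnum, attname) pairs, as stored in pg_attribute.
    distkey: attribute numbers in the order given in DISTRIBUTED BY
             (assumed to correspond to gp_distribution_policy.distkey).
    """
    # Equivalent of concat(' ', p.distkey::text, ' '), e.g. " 2 1 3 "
    padded_key = " " + " ".join(str(num) for num in distkey) + " "

    # Equivalent of the filter "a.attnum = any (p.distkey)"
    key_columns = [(num, name) for num, name in columns if num in distkey]

    # Equivalent of "order by position(concat(' ', attnum, ' ') in padded_key)".
    # str.find is 0-based while SQL position() is 1-based; the ordering is the same.
    key_columns.sort(key=lambda col: padded_key.find(" %d " % col[0]))
    return [name for _, name in key_columns]


# Columns of both test tables, listed by attribute number
columns = [(1, "col1"), (2, "col2"), (3, "col3"), (4, "col4")]

# test_table_check_dist_case1: distributed by (col2, col1, col3)
case1 = order_by_distkey(columns, [2, 1, 3])
# test_table_check_dist_case2: distributed by (col1, col2, col3)
case2 = order_by_distkey(columns, [1, 2, 3])

print(case1)  # ['col2', 'col1', 'col3']
print(case2)  # ['col1', 'col2', 'col3']
```

Without the sort step, the columns would come back in whatever order the catalog rows are read (often attribute-number order, though that is not guaranteed), so the first table would likely yield col1, col2, col3 instead of the declared col2, col1, col3.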