byzer-lang Two tables join, and the columns will be misaligned.

Open AdmondGuo opened this issue 3 years ago • 1 comment

Here is my code:

select 
    owner,
    owner_email,
    owner_mgr,
    owner_mgr_email,
    week_begin,
    actual_hour,
    working_days*8 as working_hour
from(
    select 
        owner,
        owner_email,
        owner_mgr,
        owner_mgr_email,
        week_begin,
        sum(ts_hour) as actual_hour
    from workload_union
    where owner_active=1
    group by owner, owner_email, owner_mgr, owner_mgr_email, week_begin
) ht left join week_calendar wc on ht.week_begin = wc.week
as hour_table;

select
    tpe as tpe,
    wl.description,
    item as item,
    ky_project_id as ky_project_id,
    ky_project_name as ky_project_name,
    wl.ky_customer_id as ky_customer_id,
    occur_date as occur_date,
    ...
    outlier as outlier
from 
workload_union wl left join hour_table ht on wl.week_begin = ht.week_begin and wl.owner_email = ht.owner_email
as workload3;

I expect description to appear as the second column, but it always appears at the end of the columns. This may be caused by Spark RDD sorting.

Jan 28 '22 08:01 AdmondGuo

Look at this: https://stackoverflow.com/questions/52434075/scala-spark-order-changes-when-writing-a-dataframe-to-a-csv-file

You can fix this by using the ET TableRepartition. Here is the doc: https://docs.byzer.org/#/byzer-lang/zh-cn/extension/et/TableRepartition?id=%e8%a1%a8%e5%88%86%e5%8c%ba%e6%8f%92%e4%bb%b6-tablerepartition

Feb 23 '22 10:02 AdmondGuo
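
A minimal sketch of how the TableRepartition suggestion might be applied to the joined table from the script above. It assumes the ET takes a `partitionNum` parameter as described in the linked doc. The output table name `workload3_fixed` and the partition count of 1 are hypothetical choices for illustration. Nothing in this thread confirms that repartitioning resolves the column ordering.

```sql
-- Assumption: workload3 is the joined table produced by the second select above.
-- Assumption: partitionNum sets the target number of partitions for TableRepartition.
-- workload3_fixed is a hypothetical output table name.
run workload3 as TableRepartition.`` where partitionNum="1"
as workload3_fixed;
```

Later statements would then read from `workload3_fixed` instead of `workload3`.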
