SqlRender icon indicating copy to clipboard operation
SqlRender copied to clipboard

Bigquery: GROUP BY error

Open mgkahn opened this issue 3 years ago • 9 comments

Two screenshots from SqlRender Developer showing GROUP BY arguments mapped incorrectly for BigQuery. Second screenshot shows me replacing column names with column numbers and the mapping is still incorrect. Postgres mapping was correct for both constructs. As best as my limited understanding allows, I do not think this is an error in the rewrite patterns but somewhere deep in the BG-specific java parsing code, well beyond me. If anybody has a suggestion how to rewrite the original (first screenshot/top panel) to process correctly thru TRANSLATE, please let me know. Thanks, Michael

image image

mgkahn avatar Apr 18 '21 19:04 mgkahn

@mgkahn Thank you for asking. Let us check and to propose solution.

konstjar avatar Apr 18 '21 19:04 konstjar

SqlRender currently does not process correctly numbers less or equal to number of arguments in select statement. max(case when ingredient_start_date=ingredient_start_date_new then 10 else 0 end) will be translated correctly

Or you can move case statement into inner select. Something like this:

select
	person_id,
	ingredient_start_date_new,
	max(original_ingredient)
into
	#regimens
from
	(
	select
		person_id,
		ingredient_start_date_new,
		case
			when ingredient_start_date = ingredient_start_date_new then 1
			else 0
		end as original_ingredient
	from
		#add_groups g) t
group by
	ingredient_start_date_new,
	person_id
order by
	ingredient_start_date_new;

ssuvorov-fls avatar Apr 19 '21 09:04 ssuvorov-fls

I am confused by this answer. The issue is the GROUP BY arguments not the CASE statement. SQLRender translate the GROUP BY as 2,3 rather than 2,1 like is done correctly for Postgres.

Are you saying the CASE statement throws off the GROUP BY translation?

Thanks for looking into this more. Michael

mgkahn avatar Apr 19 '21 12:04 mgkahn

No, it's bug in SqlRender. There're several iterations during translation. 1 iter - SqlRender searches for column names from group by clause in select clause and substitutes their position in select clause instead of their names in group by clause (so we get group by 2, 1 and your sql statement from first screenshot begins to look like the statement from the second screenshot) 2 iter - SqlRender performs this search again, finds 1 in your aggregate function and substitute its position (3) instead of 1. And your group by clause becomes group by 2, 3

ssuvorov-fls avatar Apr 19 '21 13:04 ssuvorov-fls

Thank you for the further insights. Not to be annoying but one last question (maybe!): why does the translation work correctly for Postgres but not BG. If I understand your explanation, the mis-processing should occur in both translations since it seems to be a DBMS independent part of the code. Why BigQuery specific?

mgkahn avatar Apr 19 '21 13:04 mgkahn

BigQuery uses its own translation mechanism due to differences of their dialects

ssuvorov-fls avatar Apr 19 '21 13:04 ssuvorov-fls

This is a problem for us as well. Because AllOfUs uses bigquery and we use AoU.

see replacement on a simple example

Why can't group by be kept as is and is changed to numbers? What replacement pattern (what row in that pattern file) is doing that change? (or what other process besides the pattern file)

image

vojtechhuser avatar Jul 12 '21 15:07 vojtechhuser

here is the actual query that crashes for us because of the rendering

image

vojtechhuser avatar Jul 13 '21 15:07 vojtechhuser

@ schumie it seems to be already fixed could you check it?

ssuvorov-fls avatar Jan 10 '23 08:01 ssuvorov-fls