`distinct()` in Databricks/SparkSQL causes "arrange()... __row_num_*" error

Open fabkury opened this issue 11 months ago • 0 comments

From my perspective, this error started happening at some point in the past few weeks.

Merely calling `dplyr::distinct()` on a lazy (remote) tibble gives:

Error in arrange(., !!sym(row_num)):
ℹ In argument: `__row_num_a46479cf_8586_4003_b032_d43e0bc6c4d1`
Caused by error:
! Object `__row_num_a46479cf_8586_4003_b032_d43e0bc6c4d1` not found.
Error in arrange():

The `arrange(., !!sym(row_num))` call is not from my script.

I can work around the problem by doing a trivial `group_by()` on the columns and then keeping only the group keys.
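For anyone hitting the same error, here is a minimal sketch of one way that workaround could look. It is not the exact code from my script: the table `remote_tbl` and its columns `a` and `b` are hypothetical, and `memdb_frame()` (an in-memory SQLite table, which needs the RSQLite package) only stands in for the Databricks table so the snippet can run locally.

```r
library(dplyr)
library(dbplyr)

# Hypothetical stand-in for the remote table. In the real case this would be
# a tbl() pointing at a Databricks/SparkSQL table, not an in-memory SQLite one.
remote_tbl <- memdb_frame(a = c(1, 1, 2), b = c("x", "x", "y"))

# Instead of distinct(a, b): group by the columns that define uniqueness,
# aggregate something trivial, then keep only the group keys.
deduped <- remote_tbl %>%
  group_by(a, b) %>%
  summarise(n = n(), .groups = "drop") %>%
  select(a, b)

# Pull the de-duplicated rows into a local tibble.
result <- collect(deduped)
```

This translates to a plain `GROUP BY` query, so it does not depend on whatever `distinct()` does on this backend.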

Thanks for the awesome software.

fabkury, Mar 20 '24 19:03