dbplyr
`distinct()` in Databricks/SparkSQL causes "arrange()... __row_num_*" error
From my perspective, this error started happening at some point in the past few weeks.
Merely calling `dplyr::distinct()` on a lazy (remote) tibble gives:
Error in arrange(., !!sym(row_num)):
1 In argument: `__row_num_a46479cf_8586_4003_b032_d43e0bc6c4d1`
Caused by error:
! Object `__row_num_a46479cf_8586_4003_b032_d43e0bc6c4d1` not found.
Error in arrange():
That `arrange(., !!sym(row_num))` call is not from my script.
I am able to work around the problem by doing a trivial `group_by()` and then keeping only the group keys.
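A minimal sketch of that workaround, for illustration only. It assumes an in-memory SQLite table created with `dbplyr::memdb_frame()` (requires RSQLite) as a stand-in for the Databricks/Spark table, and the names `remote_tbl`, `a`, `b`, and `n` are hypothetical. The SQL generated on Spark may differ from what SQLite produces here.

```r
library(dplyr)
library(dbplyr)

# Hypothetical stand-in for the remote table; on Databricks this would be
# a tbl() backed by the Spark connection instead.
remote_tbl <- memdb_frame(a = c(1, 1, 2), b = c("x", "x", "y"))

# Instead of distinct(a, b): group by the key columns, summarise to one row
# per group, then keep only the group keys.
distinct_keys <- remote_tbl %>%
  group_by(a, b) %>%
  summarise(n = n(), .groups = "drop") %>%
  select(a, b)

# Pull the de-duplicated key combinations into a local tibble.
result <- collect(distinct_keys)
```

This yields one row per unique combination of the key columns without going through `distinct()`, so the generated `__row_num_*` column is never referenced.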
Thanks for the awesome software.