prql icon indicating copy to clipboard operation
prql copied to clipboard

A tail function that would take the last n results

Open qharlie opened this issue 3 years ago • 1 comments

We could use something like

LIMIT n OFFSET ROW_COUNT-n

If we could get the ROW_COUNT for the last transformation , which I was thinking might be possible if we took the entire generated query and used it with a COUNT subquery to get ROW_COUNT

qharlie avatar Jun 17 '22 13:06 qharlie

This was said on discord:

aljazerzen — 14/06/2022

DataFrame libraries usually do have head and tail, but I think this goes against the "only one way of doing things" so i would be better to refrain from adding so similar functions

max-sixty — 14/06/2022

I think it could still be useful — "reverse sort and use head" is not that easy for a user.

My main concern is that it might be difficult to implement, and it's unclear what to do when there's no sorting — would we just take head with the assumption that sorting isn't guaranteed?

I also agree that in SQL databases, order of unsorted table is not guaranteed, so we can just take head.

Regarding the implementation:

head is actually exactly the same as take

from employees
sort salary
tail 100

could be expanded to:

from employees
sort [-salary]
head 100
sort salary

which would translate to:

WITH (
  SELECT * FROM employees
  ORDER BY salary DESC -- inverse the sort here
  LIMIT 100 -- take head
) table_0 
SELECT table_0.* FROM table_0
ORDER BY salary ASC -- now correct the order

aljazerzen avatar Jun 18 '22 19:06 aljazerzen