datajoint-python
datajoint-python copied to clipboard
New type of restriction: top rows.
Feature Request
Problem
Currently, DataJoint does not allow restricting a query to the top entries according to some sorting except for the final fetch operation.
top_students = Student.fetch(order_by='grade desc', limit=10)
This is equivalent to SQL
SELECT * FROM student ORDER BY grade desc LIMIT 10
In many cases it is useful to perform this type of restriction inside an intermediate query.
Requirements
I propose a new restriction type, dj.Top().
For example, the query for top students would look like:
top_students = Student & dj.Top('grade desc', 10)
dj.Top would take the order_by attributes, the limit, and the offset.
Justification
dj.Top allows for new query types that are possible in SQL but have not been easily produced in DataJoint.
For example, the query list the majors for top students:
top_students = Student & dj.Top('grade desc', 10)
StudentMajor & top_students
Alternative Considerations
The operation above could have been done the old way as
top_students = Student.fetch("KEY", order_by='grade desc', limit=10)
StudentMajor & top_students
However, this is two separate queries and the new dj.Top allows producing the result as a single query.