datajoint-python icon indicating copy to clipboard operation
datajoint-python copied to clipboard

New type of restriction: top rows.

Open dimitri-yatsenko opened this issue 3 years ago • 0 comments
trafficstars

Feature Request

Problem

Currently, DataJoint does not allow restricting a query to the top entries according to some sorting except for the final fetch operation.

top_students = Student.fetch(order_by='grade desc', limit=10)

This is equivalent to SQL

SELECT * FROM student ORDER BY grade desc LIMIT 10

In many cases it is useful to perform this type of restriction inside an intermediate query.

Requirements

I propose a new restriction type, dj.Top().

For example, the query for top students would look like:

top_students = Student & dj.Top('grade desc', 10)

dj.Top would take the order_by attributes, the limit, and the offset.

Justification

dj.Top allows for new query types that are possible in SQL but have not been easily produced in DataJoint.

For example, the query list the majors for top students:

top_students = Student & dj.Top('grade desc', 10)
StudentMajor & top_students

Alternative Considerations

The operation above could have been done the old way as

top_students = Student.fetch("KEY", order_by='grade desc', limit=10)
StudentMajor & top_students

However, this is two separate queries and the new dj.Top allows producing the result as a single query.

dimitri-yatsenko avatar May 13 '22 23:05 dimitri-yatsenko