datajoint-python
Support for parsing DataJoint query expressions independent of client language
Feature Request
Problem
Currently, DataJoint clients rely on the host language's operator precedence to parse DataJoint query expressions and generate the corresponding SQL statements. Many operators behave the same way across languages, but several differ. For instance:
- Element-wise multiplication:
  - Python: *
  - MATLAB: .*
- Matrix multiplication:
  - Python: @
  - MATLAB: *
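To make the dependency on host-language precedence concrete, here is a minimal sketch that uses Python's standard `ast` module to show how Python itself groups an operator-based expression. The helper names `parenthesize` and `python_grouping` are illustrative only, and `Session`, `Subject`, and `restriction` are just placeholder names inside a string; nothing here calls DataJoint.

```python
import ast

# Source symbols for the binary operators this sketch understands.
SYMBOLS = {ast.Mult: "*", ast.MatMult: "@", ast.BitAnd: "&", ast.Sub: "-"}


def parenthesize(node):
    """Rebuild an expression with every binary operation fully parenthesized."""
    if isinstance(node, ast.BinOp):
        left = parenthesize(node.left)
        right = parenthesize(node.right)
        return f"({left} {SYMBOLS[type(node.op)]} {right})"
    if isinstance(node, ast.Name):
        return node.id
    raise ValueError(f"unsupported syntax: {type(node).__name__}")


def python_grouping(expr):
    """Show the grouping that Python's own grammar assigns to `expr`."""
    return parenthesize(ast.parse(expr, mode="eval").body)


print(python_grouping("Session * Subject & restriction"))
# ((Session * Subject) & restriction)
print(python_grouping("Session & restriction * Subject"))
# (Session & (restriction * Subject))
```

The grouping is decided entirely by Python's grammar before any DataJoint code runs, which is why a client cannot change it without a separate expression syntax.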
Over time, this creates inconsistency from the user's perspective: writing equivalent DataJoint query expressions in different clients requires knowing language-specific details. Therefore, I propose we provide an additional way to supply an expression indirectly, so that DataJoint itself parses and evaluates it explicitly.
Perhaps:
dj.exp('Session * Subject & %s', dict(subject_name='AB_12'))
OR
dj.query('Session * Subject & %s', dict(subject_name='AB_12'))
Requirements
- Provide a language-agnostic means of parsing and evaluating DataJoint expressions.
- Users should learn a single DataJoint convention/syntax and be confident that it is interpreted the same way between clients.
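One way these requirements might be met is a small parser that carries its own precedence table instead of borrowing the host language's. The sketch below is a precedence-climbing parser under stated assumptions: the `PRECEDENCE` values, the `tokenize` and `parse` names, the tuple-based tree, and passing `%s` arguments as a list are all hypothetical choices for illustration, not an existing or agreed DataJoint API.

```python
import re

# Assumed precedence table (higher binds tighter). These values are a guess
# for illustration, not an official DataJoint specification.
PRECEDENCE = {"*": 2, "&": 1, "-": 1}
TOKEN = re.compile(r"\s*(%s|[A-Za-z_]\w*|[*&()\-])")


def tokenize(text):
    """Split an expression string into names, operators, parentheses and %s."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError(f"unexpected input at position {pos}: {text[pos:]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(text, args=()):
    """Return a nested tuple tree; each %s consumes the next item of `args`."""
    tokens, args, pos = tokenize(text), list(args), 0

    def operand():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        token = tokens[pos]
        pos += 1
        if token == "(":
            node = expression(1)
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing ')'")
            pos += 1
            return node
        if token == "%s":
            if not args:
                raise ValueError("not enough arguments for %s placeholders")
            return ("arg", args.pop(0))
        if token in PRECEDENCE or token == ")":
            raise ValueError(f"unexpected token {token!r}")
        return ("name", token)

    def expression(min_prec):
        nonlocal pos
        left = operand()
        while (pos < len(tokens) and tokens[pos] in PRECEDENCE
               and PRECEDENCE[tokens[pos]] >= min_prec):
            op = tokens[pos]
            pos += 1
            # Parsing the right side one level tighter makes operators left-associative.
            right = expression(PRECEDENCE[op] + 1)
            left = (op, left, right)
        return left

    tree = expression(1)
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return tree


tree = parse("Session * Subject & %s", [dict(subject_name="AB_12")])
print(tree)
# ('&', ('*', ('name', 'Session'), ('name', 'Subject')), ('arg', {'subject_name': 'AB_12'}))
```

Because the table lives in the parser, a port to MATLAB or JavaScript would only need to reproduce the same table to get the same grouping, regardless of which operators that language can overload.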
Justification
This would allow a standard order of operations independent of the client language, giving users a sense of continuity. Additionally, it ensures arguments are parsed consistently across client implementations. It also offers an adoptable solution for languages that do not allow operator overloading, such as JavaScript.
Alternative Considerations
Currently, the only way to keep query expression structure consistent between clients is verbose and becomes increasingly convoluted in large queries, because the order of evaluation has to be spelled out explicitly. My example above, for instance, would be:
Session.join(Subject).restrict(dict(subject_name='AB_12'))
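The equivalence between the operator form and the explicit method chain can be checked with a toy stand-in. The `Expr` class below is hypothetical and only records how an expression was built; it is not DataJoint's query expression class and touches no database.

```python
class Expr:
    """Toy stand-in for a query expression; records the order of operations."""

    def __init__(self, text):
        self.text = text

    def join(self, other):
        return Expr(f"join({self.text}, {other.text})")

    def restrict(self, restriction):
        return Expr(f"restrict({self.text}, {restriction!r})")

    # Operator forms simply delegate to the explicit methods.
    __mul__ = join
    __and__ = restrict

    def __repr__(self):
        return self.text


Session, Subject = Expr("Session"), Expr("Subject")
restriction = dict(subject_name="AB_12")

operator_form = Session * Subject & restriction
method_form = Session.join(Subject).restrict(restriction)

print(operator_form)
# restrict(join(Session, Subject), {'subject_name': 'AB_12'})
print(repr(operator_form) == repr(method_form))
# True
```

The method chain fixes the order by construction, while the operator form yields the same result only because Python happens to bind `*` more tightly than `&`.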
Related Errors
Errors would mainly arise from unintentional misuse of query expression syntax across languages. Such misuse can go unnoticed ('silently') over time and, once or if discovered, may require recomputing data, with potential implications for published results. All of this stems from an approach that is not always intuitive to the user.
Additional Research and Context
- Reference Python 'overloadable' operators
- Reference MATLAB 'overloadable' operators
- StackOverflow question on operator overloading in JavaScript