dhallql
dhallql copied to clipboard
An evaluation strategy
The README sketches out a desired query language, but is it even possible to give it an efficient interpretation? I think it is! My rough thoughts for evaluating dhallql queries is as follows.
First, observe that any information flows out from the Root type. There is no way external data can arrive other than through things provided by Root, so this will be the key to start evaluation. First, normalize the given Dhall query. Next, for each Root function, find all applications of that function to an expression with no free-variables. These are queries that can be immediately performed. From the readme, we have:
(root.user { name = "ocharles" })
which is of the form we need. We would go off and find the user "ocharles", and we can now normalize root.user { name = "ocharles" } into a new free variable and store a mapping of this free variable to the database-retrieved value.
Next, we continue to walk the tree of queries by finding x.repositories expressions that are applied to an expression with no free variables, where x : User. Whenever we see this, we do the same thing as before, but the query is ran in a context where we also provide x itself. So, when we see user.repositories < first = 100 > we would call a getUserRepositories :: User -> Limit -> Database [Repository] where the User is x, and the Limit is < first = 100 > (note that `this expression has no free variables).
Everytime a database query is performed, you normalize, and by the totality properties of Dhall we will eventually end up with a normal form that can be sent back to the client.
I think this evaluation strategy has plenty of potential to be fast. For example, in the first pass we're looking for root.user calls, but there's no reason to perform them one-at-a-time. We can turn multiple root.user calls into a single database query and then redistribute the results back into Dhall's normaliser.
Likewise, it's probably possible to replace the monadic-like evaluation with lateral database joins. That is, root.user { name = "ocharles" }.repositories is a JOIN of user and repository.
As one would hope, all of this cleverness is server-side, and not part of the theory that client's have to understand.
Does this mean the purpose of this project is to translate a subset of dhall expressions to database queries?
Not really database queries. dhallql is just trying to map some Dhall expressions to IO actions, essentially. It could be that root.user { name = "ocharles" }.repositories actually just does ls /repositories/ocharles.
Like graphql, dhallql is just trying to provide a way for clients to talk to an API server in a way that lets client's specify exactly what they want.
But theorically if it can be mapped to generic IO Actions it could be used for database queries too, right? The mapping could be to select * from ROOT.USER U left join ROOT.REPOSITORY R on R.USER_ID = U.ID where U.NAME = 'ocharles'.
Of course! That's likely to be the most typical usage.
Cool :) So the next line of questioning:
There will be many such mappings, like
dqldbQuery : DhallQL → IOdbQuery
or
dqlfilesystem : DhallQL → IOfilesystem
where DhallQL is a fixed subset of dhall expressions.
dhallql seems to be proposing what that subset should be. If this is true, then I think the following would make sense:
Given some f : DhallQL → DhallQL, there should be some g : IOdbQuery → IOfilesystem such that g ○ dqldbQuery = dqlfilesystem ○ f
More metaphorically, DhallQL is a common vocabulary for asking questions, and relationships between questions should map to relationships between answers (or, 'ways of producing answers').
Are my assumptions correct and does this seem reasonable?
(Meta question: is this the right place to be having this discussion?)
Hmmm, dhallql is less about defining mappings, but more about a "shape" of expressions that are amenable to interpretation in a server. That's why we have the Root -> A "shape" - the server defines what the implementation of Root is, and the client gets only to use it. I'm not sure it goes quite as far as your question suggests, but maybe it does!
Perhaps wait until I've thrown together some playground code, and see if your property holds - and if it doesn't - could it?
This issue is mostly about dumping thoughts for an evaluation strategy, but with the infancy of the project I don't really mind where discussion goes. Feel free to open new issues if you have targeted questions.
@LightAndLight What you're saying makes sense (that dhallql is a natural transformation). Here's an example:
λr. r.user{name="bob"} -----------------> SELECT *
FROM "users"
WHERE "name" = 'bob'
| |
| λq r. (q r).repos | F (λq r. (q r).repos)
v v
λr. r.user{name="bob"}.repos -----------------> SELECT *
FROM "users", "repos"
WHERE "users"."name" = 'bob'
AND "users"."id" = "repos"."user"
I believe the normal way to use a QL like this is to just always construct the entire query you want to make (the bottom left corner), and ship it all to the server to plan and execute.
However, should the server have some F with which it can use to map query transformations, it's possible the server would instead prefer to accept both the partial query (top left corner) and query transformation (function along the left side), for optimizing a single "query session" on the same user data, or something.
@ocharles Neat idea, I'm excited to see where it's going