project-m36 Consider making a library of relational shortcuts

Typing out the relational algebra operators can be tedious. We could make it less tedious by integrating with Haskell language features such as infix operators for binary operators and optionally tighter type checking (with attribute names) for common cases. Examine vinyl-records for type-checking.
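
As a rough illustration of the infix-operator idea, here is a minimal sketch over a toy AST. The RelExpr type below and the operators |><|, |+| and `over` are made up for this example; they are assumptions, not Project:M36's actual RelationalExpr or any existing API.

```haskell
-- Toy stand-in for a relational expression AST (hypothetical, not the
-- RelationalExpr type from Project:M36).
data RelExpr
  = RelVar String
  | Join RelExpr RelExpr
  | Union RelExpr RelExpr
  | Project [String] RelExpr
  deriving (Eq, Show)

-- Join binds tighter than union; projection binds loosest, so it applies
-- to the whole expression on its left.
infixl 7 |><|
infixl 6 |+|
infixl 4 `over`

-- Natural join as an infix operator.
(|><|) :: RelExpr -> RelExpr -> RelExpr
(|><|) = Join

-- Union as an infix operator.
(|+|) :: RelExpr -> RelExpr -> RelExpr
(|+|) = Union

-- Projection, written as: expr `over` ["attr1", "attr2"].
over :: RelExpr -> [String] -> RelExpr
over e attrs = Project attrs e

-- Builds Project ["sname"] (Join (RelVar "s") (RelVar "sp")).
supplierNames :: RelExpr
supplierNames = RelVar "s" |><| RelVar "sp" `over` ["sname"]
```

The fixities are a design choice for the sketch only; a real library would need to pick precedences that match how people already read relational expressions.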

Aug 31 '16 04:08 agentm

There are also some newer attempts at the record problem which hold a lot of promise: https://hackage.haskell.org/package/rawr and https://hackage.haskell.org/package/bookkeeper just came up as you probably know.

Aug 31 '16 04:08 3noch

Infix operators and optionally tighter type checking (with attribute names) sound interesting. Sounds like we need a tutd ADT first?

Jun 06 '18 04:06 YuMingLiao

Hm, this need not be related to the tutd grammar, though you could convince me otherwise.

I envisioned adopting type strategies from other packages such as beam which could be compiled to DatabaseContextExpr and RelationalExpr.

Jun 06 '18 13:06 agentm

Good idea!

let boundedQuery :: Q SqliteSelectSyntax _ _ _
    boundedQuery = limit_ 1 $ offset_ 1 $
                   orderBy_ (asc_ . _userFirstName) $
                   all_ (_shoppingCartUsers shoppingCartDb)

I, too, am looking forward to DatabaseContextExpr and RelationalExpr in Haskell being as simple as something like this.

Jun 11 '18 02:06 YuMingLiao

I am starting to explore adding a typed interface around project-m36, so I figured I should leave some thoughts here. I'm not 100% sure if what I'm trying to do matches with the expectations of this issue, but I'll give a quick precis and you can let me know!

Basically, what I want is to be able to provide functions like

insert :: (IsDbRelVar db a) => Connection db -> a -> m () 
 
fetch :: (IsDbRelVar db a) => Connection db -> m [a]

where db is some type that represents all possible relvar types in the current database connection.

I also want to be able to provide functions like

queryBy :: (IsDbRelVar db a, IsField a b) => Connection db -> b -> Value b -> m [a]
-- a query producing [a] where some field b is equal to the given Value b

So far I've implemented a proof of concept for the first set of functionality (inserting, fetching, etc.) and I'll be working on the second (querying / field referencing) shortly.

At the moment the only major issue I have is that my way of ensuring that the current database does indeed have a schema that matches the db type is to derive and create the schema at the point where a Connection db is created. This could be improved, first, with schema comparison between an existing database and the db type, emitting a runtime error on a mismatch, and, even better, with a coherent story for well-typed migration (something like safecopy maybe).

Jun 11 '18 22:06 matchwood

I should maybe add that the goal would be to have some type level combinators (like servant) that would allow typing of more complex queries.

Jun 11 '18 22:06 matchwood

We support some of your suggested functions already with the Tupleable typeclass. Check out the blog example.

It would be pretty easy to cook up a function like queryBy that fits with Tupleable today- that would, in fact, be a nifty contribution.

The complication is in making composable, type-safe functions based on the Haskell type so that, for example, accessing an attribute through the query which is not defined in the Haskell type results in a GHC compile-time error rather than a Project:M36 RelationalError. That's the sort of thing that libraries like beam offer and what we lack currently.

Jun 11 '18 22:06 agentm

I think we are at cross purposes here - the blog example doesn't do anything like this. There is no typing whatsoever in any part of it. As a simple example, the line insertBlogsExpr <- handleError $ toInsertExpr blogs "blog" does not do any type checking at all. Would I get a compile error if I wrote insertBlogsExpr <- handleError $ toInsertExpr blogs "blogtypoblog"? What about if I tried to insert some other instance of Tupleable that we haven't actually added to the database?

> The complication is in making composable, type-safe functions based on the Haskell type so that, for example, accessing an attribute through the query which is not defined in the Haskell type results in a GHC compile-time error rather than a Project:M36 RelationalError. That's the sort of thing that libraries like beam offer and what we lack currently.

That is exactly what I was talking about.

Jun 11 '18 22:06 matchwood

Perhaps my original comment wasn't clear enough:

type IsDbRelVar db a = (Elem db a, Tupleable a)
insert :: (IsDbRelVar db a) => Connection db -> a -> m () 
fetch :: (IsDbRelVar db a) => Connection db -> m [a]

The point is if you call insert with any other type than something that is in the db type (which might look like '[Person, Address, Purchase]) then you would get a compile time error.

Exactly the same thing for queryBy. I want an interface where you can provide a type level witness of some field, whereby the compiler can guarantee a) that the record is part of the db, b) that the field is part of the record and c) that the Value b is the same type as the field.
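
To make the membership check concrete, here is a minimal sketch of how an Elem constraint could be written as a closed type family. Everything in it is an assumption for illustration: the toy Connection just carries a list of strings, Show stands in for Tupleable, and the record types are invented.

```haskell
{-# LANGUAGE DataKinds, TypeFamilies, TypeOperators, UndecidableInstances, ConstraintKinds, KindSignatures, FlexibleContexts #-}

import Data.Kind (Constraint, Type)
import GHC.TypeLits (TypeError, ErrorMessage (..))

-- Hypothetical membership check: reduces to the empty constraint when the
-- type is listed in db, and to a custom compile-time error otherwise.
type family Elem (db :: [Type]) (a :: Type) :: Constraint where
  Elem '[]       a = TypeError ('Text "relvar type " ':<>: 'ShowType a
                                ':<>: 'Text " is not in the database schema")
  Elem (a ': _)  a = ()
  Elem (_ ': xs) a = Elem xs a

-- Toy connection: the phantom parameter lists the allowed relvar types,
-- the payload is just a log of inserted values.
newtype Connection (db :: [Type]) = Connection [String]

rows :: Connection db -> [String]
rows (Connection rs) = rs

data Person = Person String deriving Show
data Address = Address String deriving Show
data Purchase = Purchase Int deriving Show

type AppDb = '[Person, Address]

emptyConn :: Connection AppDb
emptyConn = Connection []

-- Pure stand-in for insert; Show replaces Tupleable in this sketch.
insert :: (Elem db a, Show a) => Connection db -> a -> Connection db
insert (Connection rs) x = Connection (show x : rs)

-- Accepted:  insert emptyConn (Person "Ann")
-- Rejected at compile time, because Purchase is not in AppDb:
--   insert emptyConn (Purchase 1)
```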

Jun 11 '18 23:06 matchwood

So I have made considerable progress on this front, but I'm currently stalling a bit after running into the performance issue documented in #210.

So far I have got to the point where you can define your schema as a type, like so:

type AppSchema = (
  RelVar User :$  
    '[UniqueConstraint '["userFirstName", "userLastName"],
      ForeignConstraint '["userAddress"] (RelVar Address), "addressId"]]
  :& RelVar Address 
  :& RelVar PhoneNumber
  )

and my code will

- Give compilation errors if you make any mistakes in the schema (e.g. reference a field in a UniqueConstraint that doesn't actually exist)
- Generate the equivalent DatabaseContextExprs for creating this schema
- Allow you to insert and fetch records defined in the schema, with compilation errors if the relvar is not in the schema
- (proof of concept) Allow you to write fully type-checked relational expressions (a subset at the moment) for basic querying purposes. For example, if you Project, Rename, or Extend over a relvar, you will get a compilation error if you reference a field that the relvar does not contain.
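
For readers wondering how a field reference can be rejected at compile time, here is a minimal sketch of one way to do it with type-level strings. It is not the code from my project: HasAttr, Typed, project1 and the toy RelExpr are all hypothetical names invented for this example.

```haskell
{-# LANGUAGE DataKinds, TypeFamilies, TypeOperators, UndecidableInstances, ConstraintKinds, KindSignatures, ScopedTypeVariables, TypeApplications, FlexibleContexts #-}

import Data.Kind (Constraint)
import Data.Proxy (Proxy (..))
import GHC.TypeLits

-- Hypothetical check that an attribute name occurs in a type-level list.
type family HasAttr (attrs :: [Symbol]) (name :: Symbol) :: Constraint where
  HasAttr '[]         name = TypeError ('Text "attribute " ':<>: 'ShowType name
                                        ':<>: 'Text " is not in this relvar")
  HasAttr (name ': _) name = ()
  HasAttr (_ ': rest) name = HasAttr rest name

-- Untyped toy AST (a stand-in, not Project:M36's RelationalExpr).
data RelExpr = RelVar String | Project [String] RelExpr
  deriving (Eq, Show)

-- Typed wrapper that tracks the attribute names at the type level.
newtype Typed (attrs :: [Symbol]) = Typed RelExpr

untyped :: Typed attrs -> RelExpr
untyped (Typed e) = e

userRelVar :: Typed '["userFirstName", "userLastName", "userAddress"]
userRelVar = Typed (RelVar "user")

-- Project onto a single attribute; only compiles if the attribute exists.
project1 :: forall name attrs. (KnownSymbol name, HasAttr attrs name)
         => Typed attrs -> Typed '[name]
project1 (Typed e) = Typed (Project [symbolVal (Proxy :: Proxy name)] e)

-- Accepted:  project1 @"userFirstName" userRelVar
-- Rejected at compile time:
--   project1 @"userMiddleName" userRelVar
```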

If there is any interest I can pull out the relevant code and create a demo repo for it.

Jun 18 '18 18:06 matchwood

I should add that the code is based entirely around generics, generics-sop and a bunch of closed type families, with no need for template haskell.

Jun 18 '18 19:06 matchwood

Hi @matchwood, it sounds awesome! I'm definitely interested in seeing it and learning from it. Please create a repo for it.

Jun 19 '18 00:06 YuMingLiao

Wow! That's really exciting! I, too, would be interested in trying it out.

Jun 19 '18 03:06 agentm

Ok good, I will try to get a demo repo up in a week or two - I'll leave a comment here when I manage it!

Jun 26 '18 23:06 matchwood

Well... it was a little bit longer, but here you are: https://github.com/matchwood/project-m36-typed

I have actually moved on a bit from project-m36, as I ran into some performance issues that make it unsuitable for the kind of use case I was envisaging (see #210). But I am still generally very supportive of the project, and am happy to continue discussion and so forth if anyone likes the approach I've taken in my typed interface code.

Aug 07 '18 12:08 matchwood

Thanks! I'll take a look.

I saw your posting proposing a new acid-world which looks promising. I hope that we can cross-pollinate ideas and performance results. For example, I have been examining cborg for suitability, but it's difficult to determine how well new hackage modules are maintained.

Aug 09 '18 13:08 agentm

Yes, that would be good!

Just a bit of background for acid-world - I'm strongly in favour of expressing as much as possible at the type level. Haskell gives us this amazing opportunity to construct compile time guarantees for all kinds of things, and I still don't think it has been fully leveraged. Servant is the kind of library that I think everything should be moving towards. That was what motivated me to try to make a typed interface for project-m36.

At the same time, I need to actually use these projects in contexts where performance matters. Acid-world is really just a layer of typing over the same concept as acid-state, and therefore is pretty performant. I haven't done any optimisation on it yet, but inserts are around 400 μs, and because it is event based that performance is pretty much O(1) (I have benchmarks at 100k and 1m existing records).

On the other hand, I really like the underlying concept of project-m36, as it is a much more principled and general approach to the problem of persistence than acid-state or acid-world.

On the issue of cborg specifically, I am inclined to trust it. Among other things, DCoutts (the primary author) is a lynchpin of the Haskell community, and it is supported by Well Typed (who I'm pretty sure are using it in production in their own projects).

Aug 09 '18 15:08 matchwood

It would seem the discussion went into a wee bit of a tangential direction? Back to the main (I think?) topic, which is "make relational algebra expressions less tedious to type out". Perhaps not an ideal solution, but I suppose you already have a TutorialD to RelationalExpr parser. So why not add a parsing quasi-quoter to (optionally) write more complicated relational algebra expressions in TutD? That seems like quite a sizeable benefit in exchange for very little work. While at it, perhaps adding a full-fledged DatabaseContextExpr quasi-quoter might also be nice.

This wouldn't by any means preclude adding an EDSL at some point, of course. But looking at the current progress, I don't think this is likely to happen very soon?

Nov 30 '18 19:11 lierdakil

Sure, a TutorialD-to-expression quasi-quoter seems like a relatively low-hanging fruit- it might also help in writing our automated tests. Thanks for the suggestion!

However, I would be hesitant to recommend such a feature for general usage because it smacks of the SQL problem- statements that can't compose, string escape chaos- but feel free to convince me otherwise!

That's why I would like to see something more along the lines of matchwood's project-m36-typed- just straight Haskell. I think there are a bunch of TH-free database libraries from which to draw inspiration.

Dec 01 '18 18:12 agentm

> statements that can't compose

Technically, since QQ is TH, I think we could, say, use haskell-src-meta to add context-capturing Haskell splices to the TutD syntax (for instance, similar to how TH's default QQ does it, e.g. [|some Haskell expression $(embedded expression splice) continued|]).

So, for example, the hair example could look like:

  let blond = NakedAtomExpr (toAtom Blond)
  eCheck $ executeDatabaseContextExpr sessionId conn
      [tutdctx|people := relation{tuple{hair $blond, name "Colin"}}|]

(notice the $ before blond).

This would require some potentially painful changes to the parser though, so I have my doubts about the viability of this approach.

> string escape chaos

Not much of an issue, generally speaking. The only thing QQ doesn't allow inside a quotation is |] IIRC, and adding a simple pre-processing rule that would, say, replace |~] with |] in the captured string isn't hard (more generally, removing one ~ from substring |~~...~~] -- or choose literally any other character instead of ~)
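
A minimal sketch of that pre-processing rule, assuming ~ as the escape character. The function name unescape is made up for this example.

```haskell
-- Remove exactly one '~' from every run of the form |~...~] so that
-- |~] becomes |], |~~] becomes |~], and so on. Other text is untouched.
unescape :: String -> String
unescape ('|' : rest)
  | (_ : fewer, ']' : after) <- span (== '~') rest
  = '|' : fewer ++ ']' : unescape after
unescape (c : cs) = c : unescape cs
unescape [] = []
```

Since a literal |] can never appear inside the quotation, the rule stays unambiguous: every |] in the processed string came from an escaped |~].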

> That's why I would like to see ... just straight Haskell.

Well, sure, I'm with you on that. But it seems like a lot more work. And frankly right now I want a quick and relatively easy way around writing out an extremely verbose AST by hand, not to create a complete relational EDSL from scratch.

... Anyway, I've written some proof of concept code. Had to add Language.Haskell.TH.Syntax.Lift instances for all AST types, but that's just annoying, not hard (I've added those straight to Base, but it might be better to collect orphan instances in a separate module to avoid unnecessary TH proliferation). The rest is mostly trivial. Currently this allows for code like this (using the same hair example):

  eCheck $ executeDatabaseContextExpr sessionId conn
    [tutdctx|people := relation{
      tuple{hair Blond, name "Colin"},
      tuple{hair OtherColor "Grey", name "Greg"}
    }|]

  peopleRel <- eCheck $ executeRelationalExpr sessionId conn
    [tutdrel|people where hair = Blond|]

Note there is no compile-time typechecking going on here, so Blond could be Bogus, and stuff would still compile and then happily crash at runtime. Also note that it's in general impossible to know at compile-time whether Blond is even defined in the DB context or not, so this shouldn't be at all surprising.

EDIT: To be clear, general TutD syntax errors are caught at compile-time.
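
For anyone unfamiliar with how a quasi-quoter turns parse errors into compile errors, here is a minimal self-contained sketch. It is not the proof-of-concept code: the toy RelExpr, the parseRel parser (relvar names separated by the word "join") and the rel quoter are all invented for this example, and it uses Data-based lifting via dataToExpQ rather than hand-written Lift instances.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}

import Data.Data (Data)
import Language.Haskell.TH.Quote (QuasiQuoter (..), dataToExpQ)

-- Toy AST standing in for the real relational expression type.
data RelExpr = RelVar String | Join RelExpr RelExpr
  deriving (Eq, Show, Data)

-- Toy parser: relvar names separated by the keyword "join".
parseRel :: String -> Either String RelExpr
parseRel src = case words src of
  (w : ws) | w /= "join" -> go (RelVar w) ws
  toks -> Left ("expected a relvar name, got: " ++ unwords toks)
  where
    go acc [] = Right acc
    go acc ("join" : name : rest)
      | name /= "join" = go (Join acc (RelVar name)) rest
    go _ toks = Left ("unexpected input: " ++ unwords toks)

-- Expression-only quasi-quoter: a parse failure calls fail in the Q monad,
-- which GHC reports as a compile-time error at the quotation site.
rel :: QuasiQuoter
rel = QuasiQuoter
  { quoteExp  = \src -> either fail (dataToExpQ (const Nothing)) (parseRel src)
  , quotePat  = unsupported "pattern"
  , quoteType = unsupported "type"
  , quoteDec  = unsupported "declaration"
  }
  where
    unsupported ctx _ =
      fail ("rel quasi-quoter cannot be used in a " ++ ctx ++ " context")
```

Because of GHC's stage restriction, the quoter has to be used from a different module than the one defining it, with QuasiQuotes enabled: there [rel|s join sp|] would compile to Join (RelVar "s") (RelVar "sp"), while [rel|s join|] would be rejected at compile time.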

Will try to get it on GitHub in a day or two, if you're interested. It's not there yet because the commit history is a bit of a mess and it's nigh impossible to tell where exactly the changes are (since I've had to move the TutorialD.Interpreter hierarchy into the library for obvious reasons). Will also try to at least implement the |] escaping thing.

Dec 02 '18 01:12 lierdakil

Okay, I've published the proof-of-concept to the qq-parser branch of my fork. Commits of interest are

46aeb9b8739cc6abdcf00c2ca5b030766f91934e, which adds just quasiquoters and
aeef9f0b12d434a03809f3e3d2d31f00215bc2dd, which adds limited support for Haskell variable splicing inside the quasiquoters. (edit: sorry about the whitespace noise, the relevant changes with the noise filtered out are here)

The latter adds a constructor to the AST, but doesn't amend AST consumers, which creates some partial functions due to pattern-matching not being total in a few places. Frankly I couldn't figure out a more reasonable way to do this which wouldn't involve rewriting/rewiring the AST and/or the parser entirely. One option for handling this more gracefully would be to parametrize the AST on yet another type representing the "placeholder", say Placeholder, make a typeclass PlaceholderHandler with instances for Placeholder and Void, and extend relevant AST nodes with an alternative for the type parameter representing the placeholder. Then use (e.g.) SYB-style generic programming to convert between AST representations when needed. I might give it a shot later.
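
A rough sketch of the parametrized-AST idea, to show why Void keeps the consumers total. This is a simplification under stated assumptions: the toy AtomExpr and RelExpr types are invented, and plain recursive functions (resolve, render) stand in for the typeclass and the SYB-style conversion mentioned above.

```haskell
import Data.Void (Void, absurd)

-- Toy AST parametrized on the placeholder type p. With p = Void the
-- Placeholder constructor can never be built, so consumers stay total.
data AtomExpr p
  = Literal String
  | Placeholder p
  deriving (Eq, Show)

data RelExpr p
  = RelVar String
  | RestrictEq String (AtomExpr p) (RelExpr p)
  deriving (Eq, Show)

-- Replace every placeholder with a literal, or report the first failure.
resolveAtom :: (p -> Either String String) -> AtomExpr p -> Either String (AtomExpr Void)
resolveAtom _ (Literal s) = Right (Literal s)
resolveAtom look (Placeholder p) = Literal <$> look p

resolve :: (p -> Either String String) -> RelExpr p -> Either String (RelExpr Void)
resolve _ (RelVar n) = Right (RelVar n)
resolve look (RestrictEq attr atom e) =
  RestrictEq attr <$> resolveAtom look atom <*> resolve look e

-- A consumer of the closed AST: the Placeholder case is provably dead.
renderAtom :: AtomExpr Void -> String
renderAtom (Literal s) = show s
renderAtom (Placeholder v) = absurd v

render :: RelExpr Void -> String
render (RelVar n) = n
render (RestrictEq attr atom e) =
  render e ++ " where " ++ attr ++ " = " ++ renderAtom atom
```

The quasi-quoter would produce RelExpr String (placeholders named after Haskell variables), while everything downstream would only accept RelExpr Void.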

This is more or less enough for my immediate needs, so I'm unlikely to work much more on this in the near future (a bit of a crunch time going on currently)

Dec 04 '18 10:12 lierdakil