haskelldb Potential Bug When Joining Same Table to Another Multiple Times

I already posted this to the mailing list, but I haven't heard back after two weeks, so I'll give GitHub a shot here.

Apologies up front as I'm not the most experienced with SQL or with relational algebra, so maybe I'm doing this completely incorrectly. Nonetheless, I'm getting behavior that doesn't seem to be right. When I (inner) join a table A twice to another table B and SELECT the same column from A in both joins, HaskellDB doesn't differentiate between which column came from the first table and which came from the second.

I've attached a minimum working example (which also has some Haddock documentation that you can run docstring on to verify my comments in the code) which contains comments describing the problem I'm seeing. I apologize for the gigantic spew of code, but HaskellDB usually requires quite a bit of boilerplate code to get all the table definitions.

{-# LANGUAGE TypeOperators #-}

import Database.HaskellDB.HDBC
import Database.HaskellDB.DBLayout
import Database.HaskellDB.Sql.Default (defaultSqlGenerator, mkSqlGenerator)
import Database.HaskellDB.Query
import Database.HaskellDB.HDBRec
import Database.HaskellDB.PrintQuery

data PersonId = PersonId

instance FieldTag PersonId where
    fieldName = const "PersonId"

personIdAttr :: Attr PersonId Int
personIdAttr = mkAttr PersonId

data Name = Name

instance FieldTag Name where
    fieldName = const "Name"

nameAttr :: Attr Name String
nameAttr = mkAttr Name

data ItemId = ItemId

instance FieldTag ItemId where
    fieldName = const "ItemId"

itemIdAttr :: Attr ItemId Int
itemIdAttr = mkAttr ItemId

data Price = Price

instance FieldTag Price where
    fieldName = const "Price"

priceAttr :: Attr Price Int
priceAttr = mkAttr Price

-- As Chris Done describes for better readibility in
-- http://chrisdone.com/posts/haskelldb-and-typeoperator-madness
type a :=>: b = RecCons a b
infixr 2 :=>:

type a :<: b = a :=>: (Expr b)
infixr 2 :<:

type a :+: b = a b
infixr 1 :+:

type Person =
    PersonId :<: Int    :+:
    Name     :<: String :+:
    RecNil

type Item =
    ItemId   :<: Int :+:
    Price    :<: Int :+:
    PersonId :<: Int :+:
    PersonId :<: Int :+:
    RecNil

peopleTable :: Table Person
peopleTable = baseTable "peopleTable"
    $ hdbMakeEntry PersonId
    # hdbMakeEntry Name

itemsTable :: Table Item
itemsTable = baseTable "itemsTable"
    $ hdbMakeEntry ItemId
    # hdbMakeEntry Price
    # hdbMakeEntry PersonId
    # hdbMakeEntry PersonId

-- | >>> ppSql itemsAndNames
-- SELECT Price1 as Price,
--        Name2 as Name,
--        Name2 as Name
-- FROM (SELECT PersonId as PersonId2,
--              Name as Name2
--       FROM peopleTable as T1) as T1,
--      (SELECT Price as Price1,
--              PersonId as PersonId1,
--              PersonId as PersonId1
--       FROM itemsTable as T1) as T2,
--      (SELECT PersonId as PersonId3,
--              Name as Name3
--       FROM peopleTable as T1) as T3
-- WHERE ((PersonId1) = (PersonId3)) AND ((PersonId1) = (PersonId2))
--
-- This result is not what I wanted. I wanted the following:
--
-- SELECT Price1 as Price,
--        Name2 as Name,
--        Name3 as SomeOtherName -- This part is different!
-- FROM (SELECT PersonId as PersonId2,
--              Name as Name2
--       FROM peopleTable as T1) as T1,
--      (SELECT Price as Price1,
--              PersonId as PersonId1,
--              PersonId as PersonId1
--       FROM itemsTable as T1) as T2,
--      (SELECT PersonId as PersonId3,
--              Name as Name3
--       FROM peopleTable as T1) as T3
-- WHERE ((Name2) = 'Tom Smith') AND ((PersonId1) = (PersonId3)) AND ((PersonId1) = (PersonId2))

itemsAndNames = do
    items <- table itemsTable
    buyers <- table peopleTable
    sellers <- table peopleTable
    restrict $ items ! personIdAttr .==. buyers ! personIdAttr
    restrict $ items ! personIdAttr .==. sellers ! personIdAttr
    project
        $ priceAttr << items ! priceAttr
        # nameAttr  << buyers ! nameAttr
        # nameAttr  << sellers ! nameAttr

Just to be clear that it's not the fact that I'm using the same types that are ruining this, if I make entirely new types, it still doesn't work.

-- Maybe because it's using the types that this screws up? 
-- After all I can't really expect to do x ! nameAttr on this and get
-- a sensible result.

data BuyerName = BuyerName

instance FieldTag BuyerName where
    fieldName = const "Name"

buyerNameAttr :: Attr BuyerName String
buyerNameAttr = mkAttr BuyerName

data SellerName = SellerName

instance FieldTag SellerName where
    fieldName = const "Name"

sellerNameAttr :: Attr SellerName String
sellerNameAttr = mkAttr SellerName

data BuyerId = BuyerId

buyerIdAttr :: Attr BuyerId Int
buyerIdAttr = mkAttr BuyerId

instance FieldTag BuyerId where
    fieldName = const "PersonId"

data SellerId = SellerId

sellerIdAttr :: Attr SellerId Int
sellerIdAttr = mkAttr SellerId

instance FieldTag SellerId where
    fieldName = const "PersonId"

type Buyer =
    BuyerName :<: String :+:
    BuyerId :<: Int :+:
    RecNil

type Seller =
    SellerName :<: String :+:
    SellerId :<: Int :+:
    RecNil

buyersTable :: Table Buyer
buyersTable = baseTable "peopleTable"
    $ hdbMakeEntry BuyerName
    # hdbMakeEntry BuyerId

sellersTable :: Table Seller
sellersTable = baseTable "peopleTable"
    $ hdbMakeEntry SellerName
    # hdbMakeEntry SellerId

-- | So let's try again
--
-- >>> ppSql itemsAndNames2
-- SELECT Price1 as Price,
--        Name2 as Name,
--        Name2 as Name
-- FROM (SELECT Name as Name2,
--              PersonId as PersonId2
--       FROM peopleTable as T1) as T1,
--      (SELECT Price as Price1,
--              PersonId as PersonId1,
--              PersonId as PersonId1
--       FROM itemsTable as T1) as T2,
--      (SELECT Name as Name3,
--              PersonId as PersonId3
--       FROM peopleTable as T1) as T3
-- WHERE ((PersonId1) = (PersonId3)) AND ((PersonId1) = (PersonId2))
--
-- Nope still have the same problem.
itemsAndNames2 = do
    items <- table itemsTable
    buyers <- table buyersTable
    sellers <- table sellersTable
    restrict $ items ! personIdAttr .==. buyers ! buyerIdAttr
    restrict $ items ! personIdAttr .==. sellers ! sellerIdAttr
    project
        $ priceAttr << items ! priceAttr
        # buyerNameAttr  << buyers ! buyerNameAttr
        # sellerNameAttr  << sellers ! sellerNameAttr

main :: IO ()
main = (print $ ppSql itemsAndNames) >> (print $ ppSql itemsAndNames2)

Jan 27 '15 22:01 changlinli

Hi Changlin,

HaskellDB is absolutely riddled with bugs and no one seems to be interested in fixing them. However, nowadays there exists a far superior SQL-generating library for Haskell called Opaleye (even if I do say so myself (I'm the author)). If you are using Postgres you will probably find Opaleye more useful for you. It hasn't been designed or tested for any other RDBMS, but it may work with others.

https://github.com/tomjaguarpaw/haskell-opaleye

Feel free to email me if you have any questions. My contact details are in the README.

Jan 27 '15 22:01 tomjaguarpaw

@tomjaguarpaw "absolutely riddled" might be a little strong.

@changlinli Sorry I missed your email on the mailing list.

Because your first example reuses nameAttr, I don't know if that's really a bug. The second certainly is, though.

If you are interested, take a look in the tests directory and you'll see a ton of queries in TestCases.hs. That might help you fix this bug (you can start by adding your test case to show that the bug exists).

I'd be glad to merge a patch that fixes this.

Jan 27 '15 22:01 m4dc4p

@m4dc4p I think "Absolutely riddled" is balanced and reasonable. There are these four critical bugs that make HaskellDB essentially unusable for writing composable queries, and they have gone unaddressed for up to a year and a half.

https://github.com/m4dc4p/haskelldb/issues/15 https://github.com/m4dc4p/haskelldb/issues/17 https://github.com/m4dc4p/haskelldb/issues/18 https://github.com/m4dc4p/haskelldb/issues/22

Optimize.hs is a rats nest where bugs could be hiding on every other line, and indeed I have found a few there. It has bitrotted beyond repair.

Jan 27 '15 22:01 tomjaguarpaw

@tomjaguarpaw I've been following Opaleye with interest and am very happy that you've been able to get approval to release the code to the community. I'll definitely be using it for at least one personal project at some point. Unfortunately I am using HaskellDB on a project that requires MS SQL and HaskellDB + HDBC-ODBC has been the best solution I've seen so far (although both have been a rather poorly documented road to go down). If you have ideas on how Opaleye could be adapted to fit in a Microsoft environment, I'd be very happy to hear them.

To be clear I have gained a good deal of use from HaskellDB and so I'm not convinced that it is beyond repair. Many thanks to @m4dc4p and the other stewards of HaskellDB that have guided it through its life from its beginnings as an academic/student project. It is somewhat unfortunate to learn that this is a bug and not me being dumb.

I'll try my best to grok the code and see if I can submit a patch for this particular bug. I may bother you @m4dc4p some more for pointers as I go through the code. I anticipate it'll be quite a while before I understand enough to submit a meaningful patch though.

Jan 29 '15 03:01 changlinli

-- SELECT Price1 as Price,
--        Name2 as Name,
--        Name3 as Name -- This part is different!

I guess you meant Name3 as <SomeNewUniqueName>

Jan 29 '15 10:01 tomjaguarpaw

Yep, that's right. I miswrote the comment. I've edited the original.

Feb 01 '15 21:02 changlinli

@changlinli I pushed a bug fix branch - still needs some refactoring, but (at least for me) it fixes the issue you described. Check it out: https://github.com/m4dc4p/haskelldb/tree/m4dc4p/25-duplicate-column-failure.

Notice I added some test cases in test/TestCases.hs that address the problem:

testDupCol1
testDupCol2

Feb 06 '15 01:02 m4dc4p

Apologies for the lateness of my reply; things got a bit busy on my end. Unfortunately, it looks like I'm still running into problems, I'll have some examples up when I get time today.

Feb 15 '15 21:02 changlinli

haskelldb haskelldb copied to clipboard

Potential Bug When Joining Same Table to Another Multiple Times

haskelldb
haskelldb copied to clipboard