haskelldb
haskelldb copied to clipboard
Potential Bug When Joining Same Table to Another Multiple Times
I already posted this to the mailing list, but I haven't heard back after two weeks, so I'll give GitHub a shot here.
Apologies up front as I'm not the most experienced with SQL or with relational algebra, so maybe I'm doing this completely incorrectly. Nonetheless, I'm getting behavior that doesn't seem to be right. When I (inner) join a table A twice to another table B and SELECT the same column from A in both joins, HaskellDB doesn't differentiate between which column came from the first table and which came from the second.
I've attached a minimum working example (which also has some Haddock documentation that you can run docstring on to verify my comments in the code) which contains comments describing the problem I'm seeing. I apologize for the gigantic spew of code, but HaskellDB usually requires quite a bit of boilerplate code to get all the table definitions.
{-# LANGUAGE TypeOperators #-}
import Database.HaskellDB.HDBC
import Database.HaskellDB.DBLayout
import Database.HaskellDB.Sql.Default (defaultSqlGenerator, mkSqlGenerator)
import Database.HaskellDB.Query
import Database.HaskellDB.HDBRec
import Database.HaskellDB.PrintQuery
data PersonId = PersonId
instance FieldTag PersonId where
fieldName = const "PersonId"
personIdAttr :: Attr PersonId Int
personIdAttr = mkAttr PersonId
data Name = Name
instance FieldTag Name where
fieldName = const "Name"
nameAttr :: Attr Name String
nameAttr = mkAttr Name
data ItemId = ItemId
instance FieldTag ItemId where
fieldName = const "ItemId"
itemIdAttr :: Attr ItemId Int
itemIdAttr = mkAttr ItemId
data Price = Price
instance FieldTag Price where
fieldName = const "Price"
priceAttr :: Attr Price Int
priceAttr = mkAttr Price
-- As Chris Done describes for better readibility in
-- http://chrisdone.com/posts/haskelldb-and-typeoperator-madness
type a :=>: b = RecCons a b
infixr 2 :=>:
type a :<: b = a :=>: (Expr b)
infixr 2 :<:
type a :+: b = a b
infixr 1 :+:
type Person =
PersonId :<: Int :+:
Name :<: String :+:
RecNil
type Item =
ItemId :<: Int :+:
Price :<: Int :+:
PersonId :<: Int :+:
PersonId :<: Int :+:
RecNil
peopleTable :: Table Person
peopleTable = baseTable "peopleTable"
$ hdbMakeEntry PersonId
# hdbMakeEntry Name
itemsTable :: Table Item
itemsTable = baseTable "itemsTable"
$ hdbMakeEntry ItemId
# hdbMakeEntry Price
# hdbMakeEntry PersonId
# hdbMakeEntry PersonId
-- | >>> ppSql itemsAndNames
-- SELECT Price1 as Price,
-- Name2 as Name,
-- Name2 as Name
-- FROM (SELECT PersonId as PersonId2,
-- Name as Name2
-- FROM peopleTable as T1) as T1,
-- (SELECT Price as Price1,
-- PersonId as PersonId1,
-- PersonId as PersonId1
-- FROM itemsTable as T1) as T2,
-- (SELECT PersonId as PersonId3,
-- Name as Name3
-- FROM peopleTable as T1) as T3
-- WHERE ((PersonId1) = (PersonId3)) AND ((PersonId1) = (PersonId2))
--
-- This result is not what I wanted. I wanted the following:
--
-- SELECT Price1 as Price,
-- Name2 as Name,
-- Name3 as SomeOtherName -- This part is different!
-- FROM (SELECT PersonId as PersonId2,
-- Name as Name2
-- FROM peopleTable as T1) as T1,
-- (SELECT Price as Price1,
-- PersonId as PersonId1,
-- PersonId as PersonId1
-- FROM itemsTable as T1) as T2,
-- (SELECT PersonId as PersonId3,
-- Name as Name3
-- FROM peopleTable as T1) as T3
-- WHERE ((Name2) = 'Tom Smith') AND ((PersonId1) = (PersonId3)) AND ((PersonId1) = (PersonId2))
itemsAndNames = do
items <- table itemsTable
buyers <- table peopleTable
sellers <- table peopleTable
restrict $ items ! personIdAttr .==. buyers ! personIdAttr
restrict $ items ! personIdAttr .==. sellers ! personIdAttr
project
$ priceAttr << items ! priceAttr
# nameAttr << buyers ! nameAttr
# nameAttr << sellers ! nameAttr
Just to be clear that it's not the fact that I'm using the same types that are ruining this, if I make entirely new types, it still doesn't work.
-- Maybe because it's using the types that this screws up?
-- After all I can't really expect to do x ! nameAttr on this and get
-- a sensible result.
data BuyerName = BuyerName
instance FieldTag BuyerName where
fieldName = const "Name"
buyerNameAttr :: Attr BuyerName String
buyerNameAttr = mkAttr BuyerName
data SellerName = SellerName
instance FieldTag SellerName where
fieldName = const "Name"
sellerNameAttr :: Attr SellerName String
sellerNameAttr = mkAttr SellerName
data BuyerId = BuyerId
buyerIdAttr :: Attr BuyerId Int
buyerIdAttr = mkAttr BuyerId
instance FieldTag BuyerId where
fieldName = const "PersonId"
data SellerId = SellerId
sellerIdAttr :: Attr SellerId Int
sellerIdAttr = mkAttr SellerId
instance FieldTag SellerId where
fieldName = const "PersonId"
type Buyer =
BuyerName :<: String :+:
BuyerId :<: Int :+:
RecNil
type Seller =
SellerName :<: String :+:
SellerId :<: Int :+:
RecNil
buyersTable :: Table Buyer
buyersTable = baseTable "peopleTable"
$ hdbMakeEntry BuyerName
# hdbMakeEntry BuyerId
sellersTable :: Table Seller
sellersTable = baseTable "peopleTable"
$ hdbMakeEntry SellerName
# hdbMakeEntry SellerId
-- | So let's try again
--
-- >>> ppSql itemsAndNames2
-- SELECT Price1 as Price,
-- Name2 as Name,
-- Name2 as Name
-- FROM (SELECT Name as Name2,
-- PersonId as PersonId2
-- FROM peopleTable as T1) as T1,
-- (SELECT Price as Price1,
-- PersonId as PersonId1,
-- PersonId as PersonId1
-- FROM itemsTable as T1) as T2,
-- (SELECT Name as Name3,
-- PersonId as PersonId3
-- FROM peopleTable as T1) as T3
-- WHERE ((PersonId1) = (PersonId3)) AND ((PersonId1) = (PersonId2))
--
-- Nope still have the same problem.
itemsAndNames2 = do
items <- table itemsTable
buyers <- table buyersTable
sellers <- table sellersTable
restrict $ items ! personIdAttr .==. buyers ! buyerIdAttr
restrict $ items ! personIdAttr .==. sellers ! sellerIdAttr
project
$ priceAttr << items ! priceAttr
# buyerNameAttr << buyers ! buyerNameAttr
# sellerNameAttr << sellers ! sellerNameAttr
main :: IO ()
main = (print $ ppSql itemsAndNames) >> (print $ ppSql itemsAndNames2)
Hi Changlin,
HaskellDB is absolutely riddled with bugs and no one seems to be interested in fixing them. However, nowadays there exists a far superior SQL-generating library for Haskell called Opaleye (even if I do say so myself (I'm the author)). If you are using Postgres you will probably find Opaleye more useful for you. It hasn't been designed or tested for any other RDBMS, but it may work with others.
https://github.com/tomjaguarpaw/haskell-opaleye
Feel free to email me if you have any questions. My contact details are in the README.
@tomjaguarpaw "absolutely riddled" might be a little strong.
@changlinli Sorry I missed your email on the mailing list.
Because your first example reuses nameAttr
, I don't know if that's really a bug. The second certainly is, though.
If you are interested, take a look in the tests
directory and you'll see a ton of queries in TestCases.hs
. That might help you fix this bug (you can start by adding your test case to show that the bug exists).
I'd be glad to merge a patch that fixes this.
@m4dc4p I think "Absolutely riddled" is balanced and reasonable. There are these four critical bugs that make HaskellDB essentially unusable for writing composable queries, and they have gone unaddressed for up to a year and a half.
https://github.com/m4dc4p/haskelldb/issues/15 https://github.com/m4dc4p/haskelldb/issues/17 https://github.com/m4dc4p/haskelldb/issues/18 https://github.com/m4dc4p/haskelldb/issues/22
Optimize.hs is a rats nest where bugs could be hiding on every other line, and indeed I have found a few there. It has bitrotted beyond repair.
@tomjaguarpaw I've been following Opaleye with interest and am very happy that you've been able to get approval to release the code to the community. I'll definitely be using it for at least one personal project at some point. Unfortunately I am using HaskellDB on a project that requires MS SQL and HaskellDB + HDBC-ODBC has been the best solution I've seen so far (although both have been a rather poorly documented road to go down). If you have ideas on how Opaleye could be adapted to fit in a Microsoft environment, I'd be very happy to hear them.
To be clear I have gained a good deal of use from HaskellDB and so I'm not convinced that it is beyond repair. Many thanks to @m4dc4p and the other stewards of HaskellDB that have guided it through its life from its beginnings as an academic/student project. It is somewhat unfortunate to learn that this is a bug and not me being dumb.
I'll try my best to grok the code and see if I can submit a patch for this particular bug. I may bother you @m4dc4p some more for pointers as I go through the code. I anticipate it'll be quite a while before I understand enough to submit a meaningful patch though.
-- SELECT Price1 as Price,
-- Name2 as Name,
-- Name3 as Name -- This part is different!
I guess you meant Name3 as <SomeNewUniqueName>
Yep, that's right. I miswrote the comment. I've edited the original.
@changlinli I pushed a bug fix branch - still needs some refactoring, but (at least for me) it fixes the issue you described. Check it out: https://github.com/m4dc4p/haskelldb/tree/m4dc4p/25-duplicate-column-failure.
Notice I added some test cases in test/TestCases.hs that address the problem:
- testDupCol1
- testDupCol2
Apologies for the lateness of my reply; things got a bit busy on my end. Unfortunately, it looks like I'm still running into problems, I'll have some examples up when I get time today.