bifunctors
bifunctors copied to clipboard
Kind mismatch in derived instances
data Bar a b where
Bar :: b -> Bar True b
data Foo b c where
Foo :: Bar x c -> Foo b c
deriveBifunctor ''Foo
• Couldn't match kind ‘Bool’ with ‘*’
When matching types
p0 :: * -> * -> *
Bar :: Bool -> * -> *
Expected: Bar x d
Actual: p0 b0 d
• In the first argument of ‘Foo’, namely
‘(((bimap (\ _n_a7u1C -> _n_a7u1C)) g_a7u1v) _arg1_a7u1B)’
In the expression:
Foo (((bimap (\ _n_a7u1C -> _n_a7u1C)) g_a7u1v) _arg1_a7u1B)
In a case alternative:
Foo _arg1_a7u1B
-> Foo (((bimap (\ _n_a7u1C -> _n_a7u1C)) g_a7u1v) _arg1_a7u1B)
|
419 | deriveBifunctor ''Foo
| ^^^^^^^^^^^^^^^^^^^^^
The derived instance shouldn't be using bimap
here but rather just fmap
Hm, this is a tricky one. Before reviewing #125 too closely (thank you for submitting a patch, by the way!), I think it would be worth thinking about what the specification for deriveBifunctor
and friends should be, as #125 changes the behavior of these functions in a subtle-but-significant way.
While I haven't written up exactly how I intend deriveBifunctor
et al. to behave in full detail, this part of the Haddocks does a pretty good job of explaining what I am going for:
With a
derive
function, the last two type variables must both be of kind*
. Other type variables of kind* -> *
are assumed to require aFunctor
,Foldable
, orTraversable
constraint (depending on whichderive
function is used), and other type variables of kind* -> * -> *
are assumed to require anBifunctor
,Bifoldable
, orBitraversable
constraint.
This criterion gives you a relatively simple way to tell what sort of code deriveBifunctor
will derive. If you see a field that looks like T a
, then it will be fmap
ped, and if you see a field that looks like T a b
, then it will be bimap
ped.
One drawback of this criterion is that despite being simple, it is not clever enough to come up with Bifunctor
instances for all possible data types. Here is a counterexample:
data Baz a b where
Baz :: a -> Baz a True
data Quux b c where
Quux :: Baz c x -> Quux b c
deriveBifunctor ''Quux
When deriveBifunctor ''Quux
sees the Baz c x
field, it will call bimap
on it. This has no hope of working, however, as Baz :: Type -> Bool -> Type
has the wrong kind for Bifunctor
. On the other hand, it is still possible to define a Bifunctor
instance for Quux
if you use a much different approach:
mapBaz :: (a -> a') -> Baz a b -> Baz a' b
mapBaz f (Baz x) = Baz (f x)
instance Bifunctor Quux where
bimap _ g (Quux x) = Quux (mapBaz g x)
This works, but it requires much more cleverness that the criterion described above, as it requires knowledge about the definition of Baz
to implement. In the spirit of keeping deriveBifunctor
dumb but predictable, I have opted not to do anything that requires cleverness of this sort.
The question now is: does the change implemented in #125 rise to this level of cleverness? The example in https://github.com/ekmett/bifunctors/issues/124#issue-1718366377 is rather similar to the example above, as Bar
/Foo
are slight variations of Baz
/Quux
. On the other hand, implementing a Bifunctor
instance for Foo
doesn't require knowing anything about the definition of Bar
to implement—it just requires being careful about counting type variable occurrences. (For instance, if you wrote a Bar b c
field instead of a Bar x c
field, then it would be bimap
ped instead of fmap
ped!)
This one feels right on the borderline of cleverness for me. I need to give this one some thought.
I realized right after submitting my last comment that my argument isn't entirely consistent with how deriveBifunctor
currently works. I was worried that it would be too confusing for users to figure out how their fields would be mapped depending on how many type variable occurrences there are, but in fact, that is already happening to some degree. Consider this example:
data T a b = MkT a b (M Int) (E Bool Char)
deriveBifunctor ''T
Based on the criterion I mention in https://github.com/ekmett/bifunctors/issues/124#issuecomment-1556157583, you might expect the M Int
field to be fmap
ped and the E Bool Char
field to be bimap
ped. But this is not what happens! Instead, this Bifunctor
instance would be generated:
instance Bifunctor T where
bimap f g (MkT x1 x2 x3 x4) = MkT (f x1) (g x2) x3 x4
Because neither field mentions any of the type parameters, they do not get mapped at all. This is a very useful optimization, as we can avoid unnecessary allocations by avoiding redundant fmap
s and bimap
s. But more importantly, this means that this instance will typecheck even if M
does not have a Functor
instance or E
does not have a Bifunctor
instance.
This is all to say: the algorithm that deriveBifunctor
uses is already quite sensitive to how type variables occur in each field, and users must be aware of this in order to know exactly which fields will be mapped in which ways. For this reason, the change in #125 is consistent with existing precedent. We should still make sure to document all of this somewhere, as it's rather subtle.