The Type Tetris Toolbox
This post assumes the reader is a semi-new Haskeller who understand type signatures, function application, typeclasses, and have struggled with getting large expressions to type check. All code samples assume -Wall
is turned on.
The Type Tetris Toolbox is a set of features and language extensions which allow subdividing expressions that I was having trouble getting to type check. It enables “debugging at the type level”. These features and language extensions are undefined
, holes, InstanceSigs
, and ScopedTypeVariables
.
Undefined “Undefined inhabits all types.” Which means it can be any type. In a statement like putStrLn undefined
, GHC will infer that undefined is a String. undefined
is useful for stubbing out expressions for which we aren’t sure just yet how to build a value for.
Of course trying to evaluate undefined
at runtime causes an exception. Fortunately, type tetris happens during compilation, well before any code would need to be executed. This allows us to use undefined
as a means to incrementally work our way through a problem without having to get everything right all at once. (Note: prefacing an expression with :t
in GHCi will just evaluate things at the type level, it won’t execute anything, thereby avoiding the bad time caused by attempting to evaluate undefined.)
The ability to subdivide a large problem and chip away at it is huge and undefined
gives us that. It is easily the most powerful and important tool in the Type Tetris Toolbox, right after GHC/GHCi itself.
Since undefined
can be any type, we can specify the type we think it should be and let GHC check our assumptions. :t purStrLn (undefined :: String)
doesn’t cause the compiler to complain but :t putStrLn (undefined :: Int)
does. Even better it will cause the compiler to complain that the expected type ([Char] from the second argument of putStrLn) doesn’t match up with the actual type (Int from undefined :: Int
).
Holes Checking the types needed by an expression is so common GHC has a special feature for doing things like putStrLn (undefined :: Int)
. That feature is holes, or the underscore character _
. putStrLn _
will cause GHC to give us an error, saying that it found a hole and what the expected type of the value in that position should be.
This is good for figuring out the types for intermediary target values in a more complicated expression, for something like putStrLn _
it’s a bit obvious and the _
doesn’t get you any more information, it just saves a few keystrokes. Where it really shines is in the midst of a stack of monads (aka, monad transformers) where there’s a few nested do
expressions and polymorphism in the expression means it’s not at all clear to a newer user what the compiler is expecting where, and compiler errors seem to have drifted away from where the code actually needs fixing.
Type Holes Holes can also be used in type definitions, here they’re called “type holes”, but they work the same. They don’t mix well with undefined
, as they work against actual values. However if we have a value and we want to find out the type of that value, we can simply (foo :: _)
and GHC will produce an error telling us what type the value has.
Personally, I don’t use this very often. It’s just rare that I know how to construct a value, but don’t know the type. Mostly I’ve only used it to add type annotations to values after the fact, to clean up warnings about top level values not having a type defined.
InstanceSigs There are times where specifying undefined values can be cumbersome and can even distract from the task at hand. The language extension InstanceSigs helps with this problem by allowing type definitions as part of larger expressions. An example should make that clearer:
{-# LANGUAGE InstanceSigs #-}
data MyMaybe a = MyJust a | MyNothing
instance Functor MyMaybe where
fmap :: (a -> b) -> MyMaybe a -> MyMaybe b
fmap _ MyNothing = MyNothing
fmap f (MyJust a) = MyJust $ f a
instance Applicative MyMaybe where
pure :: a -> MyMaybe a
pure = MyJust
(<*>) :: MyMaybe (a -> b) -> MyMaybe a -> MyMaybe b
(MyJust f) <*> myMaybeA = fmap f myMaybeA
MyNothing <*> _ = MyNothing
instance Monad MyMaybe where
(>>=) :: MyMaybe a -> (a -> MyMaybe b) -> MyMaybe b
(MyJust a) >>= f = f a
MyNothing >>= _ = MyNothing
At this point it just looks like a convenient way of typing out of the type definitions near where the implementation is done, and that’s correct for this example. What this has over simple comments is they’re actually checked by the compiler; if we wrote an incorrect type definition the compiler would tell us immediately.
InstanceSigs
really shine when used in conjunction with ScopedTypeVariables
.
ScopedTypeVariables ScopedTypeVariables
allow saying “whenever a type variable with this name shows up in a sub expression, I mean the same type as up here”. Yeah, that sounds weird because it’s changing some behavior that may not be intuitive. Here’s an example:
{-# LANGUAGE InstanceSigs #-}
{-# LANGUAGE ScopedTypeVariables #-}
newtype MaybeT f a =
MaybeT { runMaybeT :: f (Maybe a) }
instance forall m. Functor m => Functor (MaybeT m) where
fmap :: forall a b. (a -> b) -> MaybeT m a -> MaybeT m b
fmap f (MaybeT mMaybeA) =
let
a :: m (Maybe a) -> m (Maybe b)
a = ((fmap . fmap) f)
in MaybeT $ a mMaybeA
Now if we were to remove {-# LANGUAGE ScopedTypeVariables #-}
we’d get compiler errors spouting off about “Couldn't match type 'b' with 'b1'
” and “Couldn't match type 'a' with 'a1'
”. This is because GHC’s normal behavior is to consider the b’s in the type definition for the a value to be different b’s than in the type definition for fmap.
In order to get the a and b type variables to actually mean the same thing (and not just be two different things with the same name) we need to enable ScopedTypeVariables
and tag our scoped types with forall
. The rule of thumb when using ScopedTypeVariables
is to forall
all the new type variables that show up. So that’s why in the above code in the instance type information it’s forall m.
, m is the newly introduced type variable. Then in fmap
, forall a b.
, a and b are the newly introduced type variables. This isn’t strictly necessary, in fact the above example works after removing the forall m.
, it’s just more often than not just doing it for all the newly introduced type variables leads to less surprises.
Putting it all together Okay, so that’s all the tools. What follows is me working through a problem from the great HaskellBook.
Here’s the starting point, it’s creating the Functor
instance for a State
monad transformer.
newtype StateT s m a =
StateT { runStateT :: s -> m (a,s) }
instance Functor m => Functor (StateT s m) where
fmap f (StateT s) = undefined
And this compiles thanks to undefined
. There are some warnings that f and s aren’t used, and there’s an undefined hanging out there. The next obvious step is to write out the fmap type definition, so it’s time to turn on InstanceSigs.
{-# LANGUAGE InstanceSigs #-}
newtype StateT s m a =
StateT { runStateT :: s -> m (a,s) }
instance Functor m => Functor (StateT s m) where
fmap :: (a -> b) -> StateT s m a -> StateT s m b
fmap f (StateT s) = undefined
Still compiles, no surprises so far. That’s a pretty funky looking type signature. Now we have to think about our end goal. We have an f
, which is (a -> b)
, and we want to apply that to the a
in s -> m (a, s)
to get a s -> m (b, s)
. So really our ultimate goal is to get a s -> m (b,s)
which we can then wrap in a StateT
. So let’s just write that down in a way the compiler can check using let
and undefined
.
{-# LANGUAGE InstanceSigs #-}
newtype StateT s m a =
StateT { runStateT :: s -> m (a,s) }
instance Functor m => Functor (StateT s m) where
fmap :: (a -> b) -> StateT s m a -> StateT s m b
fmap f (StateT s) =
let goal :: (s -> m (b, s))
goal = undefined
in StateT goal
Great! So our first intuitive leap about our goal checks out with the compiler.
Since we have a s -> m(a, s)
value in s
, let’s rename that s
to something a little clearer so we don’t confuse it with a value for the type variable s
. I like to name these variables something close to their type signature just to make things clearer for myself.
{-# LANGUAGE InstanceSigs #-}
newtype StateT s m a =
StateT { runStateT :: s -> m (a,s) }
instance Functor m => Functor (StateT s m) where
fmap :: (a -> b) -> StateT s m a -> StateT s m b
fmap f (StateT sToMOfTupleAS) =
let goal :: (s -> m (b, s))
goal = undefined
in StateT goal
Yeah, that name is a little bit of a mouthful. We’ll clean up our mess later.
It looks like we can probably pull a \s ->
out of our goal. I think this because our goal takes an s parameter and so does sToMOfTupleAS :: s -> m(a,s)
. There’s really very little else we can do than take the s
value our goal will eventually get (when somebody applies runStateT
to the result of our fmap
) and apply it to sToMOfTupleAS
. That application should get us an m (a,s)
. That’s enough thinking for now, we can have the compiler check this idea out for us.
{-# LANGUAGE InstanceSigs #-}
newtype StateT s m a =
StateT { runStateT :: s -> m (a,s) }
instance Functor m => Functor (StateT s m) where
fmap :: (a -> b) -> StateT s m a -> StateT s m b
fmap f (StateT sToMOfTupleAS) = StateT $ \s ->
let goal :: m (b, s)
goal = undefined
mOfTupleAS :: m (a, s)
mOfTupleAS = sToMOfTupleAS s
in goal
Kaboom! Compiler errors. These look familiar though. “Couldn’t match type ‘s’ with ‘s1’”, where ‘s’ and ‘s1’ are rigid type variables. When we look at the lines the compiler is saying the rigid type is defined on we see it’s because we’re using the same name for what are potentially two different types. It’s time to turn ScopedTypeVariables
on.
{-# LANGUAGE InstanceSigs #-}
{-# LANGUAGE ScopedTypeVariables #-}
newtype StateT s m a =
StateT { runStateT :: s -> m (a,s) }
instance forall m s. Functor m => Functor (StateT s m) where
fmap :: forall a b. (a -> b) -> StateT s m a -> StateT s m b
fmap f (StateT sToMOfTupleAS) = StateT $ \s ->
let goal :: m (b, s)
goal = undefined
mOfTupleAS :: m (a, s)
mOfTupleAS = sToMOfTupleAS s
in goal
That clears up the compiler errors. We still have some warnings about unused values but that’s okay, we’re not done and the compiler is just reminding us that we probably want to use them (or explicitly say we don’t care about them).
Okay, we have an m (b, s)
, an m (a, s)
and a -> b
values. fmap
allows us to apply our a function within our m
values since we said m
must be a Functor
in the instance definition. It’s probably clear that we can’t just fmap f mOfTupleAS
, since f
is a -> b
and the value in mOfTupleAS
is (a ,s)
. So let’s use a hole to get our intermediary value.
{-# LANGUAGE InstanceSigs #-}
{-# LANGUAGE ScopedTypeVariables #-}
newtype StateT s m a =
StateT { runStateT :: s -> m (a,s) }
instance forall m s. Functor m => Functor (StateT s m) where
fmap :: forall a b. (a -> b) -> StateT s m a -> StateT s m b
fmap f (StateT sToMOfTupleAS) = StateT $ \s ->
let goal :: m (b, s)
goal = fmap _ mOfTupleAS
mOfTupleAS :: m (a, s)
mOfTupleAS = sToMOfTupleAS s
in goal
And GHC dutifully tells us our hole should be a value of type (a,s) -> (b,s)
. Really all we need is a version of f
that applies f
to the first member of a tuple. That’s just applyFToFirstMember (a, s) = (f a, s)
.
{-# LANGUAGE InstanceSigs #-}
{-# LANGUAGE ScopedTypeVariables #-}
newtype StateT s m a =
StateT { runStateT :: s -> m (a,s) }
instance forall m s. Functor m => Functor (StateT s m) where
fmap :: forall a b. (a -> b) -> StateT s m a -> StateT s m b
fmap f (StateT sToMOfTupleAS) = StateT $ \s ->
let goal :: m (b, s)
goal = fmap applyFToFirstMember mOfTupleAS
mOfTupleAS :: m (a, s)
mOfTupleAS = sToMOfTupleAS s
applyFToFirstMember :: (a,s) -> (b,s)
applyFToFirstMember (a,s) = (f a, s)
in goal
No compiler errors! No unused warnings! This is “probably” correct then, just need to run it. The types check out at this point.
It’s at this point that we should do what testing we need to and clean up until we’re comfortable. There’s a lot of noise in this definition, and we don’t really need the language extensions turned on for running this code.
newtype StateT s m a =
StateT { runStateT :: s -> m (a,s) }
instance Functor m => Functor (StateT s m) where
fmap f (StateT a) = StateT $ \s ->
let g (a',s') = (f a', s')
in fmap g $ a s
Now that that’s cleaned up it seems pretty simple. What’s nice about this approach is we don’t have to see the end implementation, we can incrementally walk up to it and make as big of leaps between steps as we feel comfortable with.
The only real downside is we have to know the types we want beforehand. So it’s great for homework problems where the types are provided and working in existing code bases where we may not be completely familiar with how the types fit together.