There are quite a few ways to indicate and handle errors in Haskell. We are going to look at one solution: using the type Either. Either is defined like this:
data Either a b = Left a | Right b
Simply put, a value of type
Either a b can contain either a value of type
or a value of type
We can tell them apart from the constructor used.
Left True :: Either Bool b Right 'a' :: Either a Char
Using this type, we can represent computations that may fail by using the
Left constructor to indicate failure with some error value attached,
Right constructor with one type to represent success with the
Either is polymorphic, we can use any two types to represent
failure and success. It is often useful to describe the failure modes
using an ADT.
For example, let's say that we want to parse a
Char as a decimal digit
Int. This operation could fail if the Character is not a digit.
We can represent this error as a data type:
data ParseDigitError = NotADigit Char deriving Show
And our parsing function can have the type:
parseDigit :: Char -> Either ParseDigitError Int
Now when we implement our parsing function we can return
Left on an error
describing the problem, and
Right with the parsed value on successful parsing:
parseDigit :: Char -> Either ParseDigitError Int parseDigit c = case c of '0' -> Right 0 '1' -> Right 1 '2' -> Right 2 '3' -> Right 3 '4' -> Right 4 '5' -> Right 5 '6' -> Right 6 '7' -> Right 7 '8' -> Right 8 '9' -> Right 9 _ -> Left (NotADigit c)
Either a is also an instance of
so we have some combinators to work with if we want to combine these
kind of computations.
For example, if we had three characters and we wanted to try and parse each of them and then find the maximum between them, we could use the applicative interface:
max3chars :: Char -> Char -> Char -> Either ParseDigitError Int max3chars x y z = (\a b c -> max a (max b c)) <$> parseDigit x <*> parseDigit y <*> parseDigit z
Applicative interfaces of
Either a allow us to
apply functions to the payload values and delay the error handling to a
later phase. Semantically, the first Either in order that returns a
will be the return value. We can see how this works in the implementation
of the applicative instance:
instance Applicative (Either e) where pure = Right Left e <*> _ = Left e Right f <*> r = fmap f r
At some point, someone will actually want to inspect the result
and see if we got an error (with the
Left constructor) or the expected value
Right constructor) and they can do that by pattern matching on the result.
Applicative interface of
Either is very powerful, and can be combined
with another abstraction called
for data structures that can be traversed from left to right, like a linked list or a binary tree.
With these, we can combine an unspecified amount of values such as
Either ParseDigitError Int,
as long as they are all in a data structure that implements
Let's see an example:
ghci> :t "1234567" "1234567" :: String -- remember, a String is an alias for a list of Char ghci> :info String type String :: * type String = [Char] -- Defined in ‘GHC.Base’ ghci> :t map parseDigit "1234567" map parseDigit mystring :: [Either ParseDigitError Int] ghci> map parseDigit "1234567" [Right 1,Right 2,Right 3,Right 4,Right 5,Right 6,Right 7] ghci> :t sequenceA sequenceA :: (Traversable t, Applicative f) => t (f a) -> f (t a) -- Substitute `t` with ``, and `f` with `Either Error` for a specialized version ghci> sequenceA (map parseDigit mystring) Right [1,2,3,4,5,6,7] ghci> map parseDigit "1a2" [Right 1,Left (NotADigit 'a'),Right 2] ghci> sequenceA (map parseDigit "1a2") Left (NotADigit 'a')
The pattern of doing
map and then
sequenceA is another function called
ghci> :t traverse traverse :: (Traversable t, Applicative f) => (a -> f b) -> t a -> f (t b) ghci> traverse parseDigit "1234567" Right [1,2,3,4,5,6,7] ghci> traverse parseDigit "1a2" Left (NotADigit 'a')
We can use
traverse on any two types where one implements the
Either a or
IO, and the other implements the
 (linked lists) and
(also known as a dictionary in other languages - a mapping from keys to values).
For example using
Map. Note that we can construct a
Map data structure
from a list of tuples using the
function - the first value in the tuple is the key, and the second is the type.
ghci> import qualified Data.Map as M -- from the containers package ghci> file1 = ("output/file1.html", "input/file1.txt") ghci> file2 = ("output/file2.html", "input/file2.txt") ghci> file3 = ("output/file3.html", "input/file3.txt") ghci> files = M.fromList [file1, file2, file3] ghci> :t files :: M.Map FilePath FilePath -- FilePath is an alias of String files :: M.Map FilePath FilePath :: M.Map FilePath FilePath ghci> readFiles = traverse readFile ghci> :t readFiles readFiles :: Traversable t => t FilePath -> IO (t String) ghci> readFiles files fromList [("output/file1.html","I'm the content of file1.txt\n"),("output/file2.html","I'm the content of file2.txt\n"),("output/file3.html","I'm the content of file3.txt\n")] ghci> :t readFiles files readFiles files :: IO (Map String String)
Above, we created a function
readFiles that will take a mapping from output file path
to input file path and returns an IO operation that when run will read the input files
and replace their contents right there in the map! Surely this will be useful later.
Either has the kind
* -> * -> * (it takes two type
Either cannot be an instance of
instances for these type classes can only be implemented for types that have the
* -> *.
Remember that when we look at a type class function signature like:
fmap :: Functor f => (a -> b) -> f a -> f b
And we want to implement it for a specific type (in place of the
we need to be able to substitute the
f with the target type. If we'd try
to do it with
Either we'll get:
fmap :: (a -> b) -> Either a -> Either b
Either a or
Either b are saturated, so this won't type check.
For the same reason if we'll try to substitute
f with, say,
Int, we'll get:
fmap :: (a -> b) -> Int a -> Int b
Which also doesn't make sense.
So while we can't use
Either, we can use
Either e, which has the kind
* -> *. Now let's try substituting
Either e in this signature:
liftA2 :: Applicative => (a -> b -> c) -> f a -> f b -> f c
And we'll get:
liftA2 :: (a -> b -> c) -> Either e a -> Either e b -> Either e c
What this teaches us is that we can only use the applicative interface to
Eithers with the same type for the
So what can we do if we have two functions that can return different errors? There are a few approaches, the most prominent ones are:
- Make them return the same error type. Write an ADT that holds all possible
error descriptions. This can work in some cases but isn't always ideal
because for example a user calling
parseDigitshouldn't be forced to handle a possible case that the input might be an empty string.
- Use a specialized error type for each type, and when they are composed together,
map the error type of each function to a more general error type. This can
be done with the function
The applicative interface allows us to lift a function on to work on multiple
Either values (or other applicative functor instances such as
But more often than not, we'd like to be able to use a value from one computation
that might return an error in another computation that might return an error.
For example, a compiler such has GHC operates in stages, such as lexical analysis, parsing, type-checking, and so on. Each stage depends on the output of the stage before it, and each stage might fail. We can write the types for these functions:
tokenize :: String -> Either Error [Token] parse :: [Token] -> Either Error AST typcheck :: AST -> Either Error TypedAST
We want to compose these functions so that they work in a chain. The output of
parse, the output of
parse goes into to
We know that we can lift a function over an
Either (and other functors),
we could also lift a function that returns an
-- reminder the type of fmap fmap :: Functor f => (a -> b) -> f a -> f b -- specialized for `Either Error` fmap :: (a -> b) -> Either Error a -> Either Error b -- here, `a` is [Token] and `b` is `Either Error AST`: > fmap parse (tokenize string) :: Either Error (Either Error AST)
While this code compiles, it isn't great, because we are building
Either Error and we can't use this trick again with
typecheck expects an
AST, but if we try to fmap it
fmap parse (tokenize string), the
a will be
Either Error AST
What we would really like is to flatten this structure instead of building it.
If we look at the kind of values
Either Error (Either Error AST) could have,
it looks something like this:
Right (Left error)
Right (Right <ast>)
Exercise: What if we just used pattern matching for this instead? How would this look like?
case tokenize string of Left err -> Left err Right tokens -> case parse tokens of Left err -> Left err Right ast -> typecheck ast
If we run into an error in a stage, we return that error and stop. If we succeed, we use the value on the next stage.
Flattening this structure for
Either is very similar to that last part - the body
Right tokens case:
flatten :: Either e (Either e a) -> Either e a flatten e = case e of Left l -> Left l Right x -> x
Because we have this function, we can now use it on the output of
fmap parse (tokenize string) :: Either Error (Either Error AST)
> flatten (fmap parse (tokenize string)) :: Either Error AST
And now we can use this function again to compose with
> flatten (fmap typecheck (flatten (fmap parse (tokenize string)))) :: Either Error TypedAST
fmap combination looks like a recurring pattern which
we can combine into a function:
flatMap :: (a -> Either e b) -> Either a -> Either b flatMap func val = flatten (fmap func val)
And now we can write the code this way:
> flatMap typecheck (flatMap parse (tokenize string)) :: Either Error TypedAST -- Or using backticks syntax to convert the function to infix form: > typecheck `flatMap` parse `flatMap` tokenize string -- Or create a custom infix operator: (=<<) = flatMap > typeCheck =<< parse =<< tokenize string
flatMap as well), have different names in Haskell.
They are called
(pronounced "reverse bind"),
and they are the essence of another incredibly useful abstraction in Haskell.
If we have a type that can implement:
Functorinterface, specifically the
Applicativeinterface, most importantly the
They can implement an instance of the
Monad type class.
With functors, we were able to "lift" a function to work over the type implementing the functor type class:
fmap :: (a -> b) -> f a -> f b
With applicative functors we were able to "lift" a function of multiple arguments over multiple values of a type implementing the applicative functor type class, and also lift a value into that type:
pure :: a -> f a liftA2 :: (a -> b -> c) -> f a -> f b -> f c
With monads we can now flatten (or, "join" in Haskell terminology) types that implement
join :: m (m a) -> m a -- this is =<< with the arguments reversed, pronounced "bind" (>>=) :: m a -> (a -> m b) -> m b
>>= we can write our compilation pipeline from before in a left-to-right
manner, which seems to be more popular for monads:
> tokenize string >>= parse >>= typecheck
We have already met this function before when we talked about
IO also implements the
Monad interface. The monadic interface for
helped us with creating a proper ordering of effects.
The essence of the
Monad interface is the
>>= functions, and as we've seen
we can implement
>>= in terms of
join, we can also implement
join in terms
>>= (try it!).
The monadic interface can mean very different things for different types. For
is ordering of effects, for
Either it is early cutoff,
Logic this means backtracking computation, etc.
Again, don't worry about analogies and metaphors, focus on the API and the laws.
Hey, did you check the monad laws? left identity, right identity and associativity? We've already discussed a type class with exactly these laws - the
Monoidtype class. Maybe this is related to the famous quote about monads beings just monoids in something something...
Remember do notation? Turns out it works for any type that is
an instance of
Monad. How cool is that? Instead of writing:
pipeline :: String -> Either Error TypedAST pipeline string = tokenize string >>= \tokens -> parse tokens >>= \ast -> typecheck ast
We can write:
pipeline :: String -> Either Error TypedAST pipeline string = do tokens <- tokenize string ast <- parse tokens typecheck ast
And it will work! Still, in this particular case
tokenize string >>= parse >>= typecheck
is so concise it can only be beaten by using
>=> :: Monad m => (a -> m b) -> (b -> m c) -> a -> m c -- compare with function composition: (.) :: (a -> b) -> (b -> c) -> a -> c
pipeline = tokenize >=> parse >=> typecheck
This ability of Haskell's to create very concise code using great abstractions makes it great once one is familiar with the abstractions. Knowing the monad abstraction, we are now already familiar with the core composition API of many libraries - for example:
- Concurrent and asynchronous programming
- Web programming
- Emulating stateful computation
- sharing environment between computations
- and many more.
Either for error handling is useful for two reasons:
- We encode possible errors using types, and we force users to acknowledge and handle them, thus making our code more resilient to crashes and bad behaviour.
Monadinterfaces provide us with mechanisms for composing functions that might fail (almost) effortlessly - reducing boilerplate while maintaining strong guarantees about our code, and delaying the need to handle errors until it is appropriate.