Fancy options parsing
We'd like to define a nicer interface for our program. While we could manage something
ourselves with getArgs and pattern matching, it is easier to get good results using a library.
We are going to use a package called optparse-applicative.

optparse-applicative provides us with an EDSL (yes, another one) to build
command arguments parsers. Things like commands, switches, and flags can be built
and composed together to make a parser for command-line arguments without actually
writing operations on strings as we did when we wrote our Markup parser, and will
provide other benefits such as automatic generation of usage lines, help screens,
error reporting, and more.
While optparse-applicative's dependency footprint isn't very large,
it is likely that a user of our library wouldn't need command-line parsing
in this particular case, so it makes sense to add this dependency to the
executable section (rather than the library section) in the .cabal file:

     executable hs-blog-gen
       import: common-settings
       hs-source-dirs: app
       main-is: Main.hs
       build-depends:
           base
    +    , optparse-applicative
         , hs-blog
       ghc-options:
         -O

Building a command-line parser
The optparse-applicative package has pretty decent documentation, but we will cover a few important things to pay attention to in this chapter.
In general, there are four important things we need to do:
Define our model - we want to define an ADT that describes the various options and commands for our program
Define a parser that will produce a value of our model type when run
Run the parser on our program arguments input
Pattern match on the model and call the right operations according to the options
Define a model
Let's envision our command-line interface for a second. What should it look like?
We want to be able to convert a single file or input stream to either a file or an output stream, or we want to process a whole directory and create a new directory. We can model it in an ADT like this:

    data Options
      = ConvertSingle SingleInput SingleOutput
      | ConvertDir FilePath FilePath
      deriving Show

    data SingleInput
      = Stdin
      | InputFile FilePath
      deriving Show

    data SingleOutput
      = Stdout
      | OutputFile FilePath
      deriving Show

Note that we could technically also use Maybe FilePath to encode both SingleInput and
SingleOutput, but then we would have to remember what Nothing means in each context.
By creating a new type with properly named constructors for each option, we make it easier for readers of the code to understand its meaning.
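
To make the trade-off concrete, here is a minimal sketch that restates SingleInput and compares it with the Maybe FilePath encoding. The helpers describeInput and describeInputMaybe are hypothetical and exist only for this illustration; they are not part of the program.

```haskell
-- The input type from the model above, restated so this sketch stands alone.
data SingleInput
  = Stdin
  | InputFile FilePath
  deriving Show

-- Hypothetical helper: with named constructors, each case documents itself.
describeInput :: SingleInput -> String
describeInput input =
  case input of
    Stdin -> "standard input"
    InputFile path -> "file " <> path

-- Hypothetical Maybe-based alternative: the reader has to remember that
-- Nothing means "standard input" here, and something else for outputs.
describeInputMaybe :: Maybe FilePath -> String
describeInputMaybe input =
  case input of
    Nothing -> "standard input"
    Just path -> "file " <> path
```

Both functions behave the same; the difference is only in how much the reader has to keep in their head when reading the pattern match.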
In terms of interface, we could decide that when a user would like to convert
a single input source, they would use the
convert command, and supply the optional flags
--input FILEPATH and
--output FILEPATH to read or write from a file.
When the user does not supply one or both flags, we will read or write from
the standard input/output accordingly.
If the user would like to convert a directory, they can use the convert-dir
command and supply the two mandatory flags
--input FILEPATH and --output FILEPATH.
Build a parser
This is the most interesting part of the process. How do we build a parser that fits our model?
The optparse-applicative library introduces a new type called Parser.
Parser, similar to IO, has the kind * -> * - when it
is supplied with a saturated (or concrete) type such as
Options, it can become a saturated type (one that has values).

A Parser a represents a specification of a command-line options parser
that produces a value of type a when the command-line arguments are
successfully parsed.
This is similar to how
IO a represents a description of a program
that can produce a value of type
a. The main difference between these
two types is that while we can't convert an IO a to an a
(we just chain IO operations and have the Haskell runtime execute them),
we can convert a
Parser a to a function that takes a list of strings
representing the program arguments and produces an
a if it manages
to parse the arguments.
As we've seen with the previous EDSLs, this library uses the combinator pattern as well. We need to consider the basic primitives for building a parser, and the methods of composing small parsers into bigger parsers.
Let's see an example for a small parser:

    inp :: Parser FilePath
    inp =
      strOption
        ( long "input"
          <> short 'i'
          <> metavar "FILE"
          <> help "Input file"
        )

    out :: Parser FilePath
    out =
      strOption
        ( long "output"
          <> short 'o'
          <> metavar "FILE"
          <> help "Output file"
        )

strOption is a parser builder. It is a function that takes combined
option modifiers as an argument, and returns a parser that will parse a string.
We can specify the type to be FilePath because FilePath is an alias of
String. The parser builder describes how to parse the value,
and the modifiers describe its properties, such as the flag name,
the shorthand of the flag name, and how it would be described in the usage
and help messages.
strOption can return any string type that implements the interface
IsString. There are a few such types, for example
Text, a much more efficient Unicode text type from the
text package. It is more efficient than String because while
String is implemented as a linked list of Char,
Text is implemented as an array of bytes.
Text is usually what we should use for text values instead of
String. We haven't been using it up until now because it is slightly less ergonomic to use than
String. But it is often the preferred type to use for text!
As you can see, modifiers can be composed using the <> function,
which means modifiers implement an instance of the
Semigroup type class!
With such an interface we don't have to supply all the modifier options, but only the relevant ones. So if we don't want to have a shortened flag name, we don't have to add it.
For the data type we've defined, having Parser FilePath takes us
a good step in the right direction, but it is not exactly what we need
for ConvertSingle. We need a Parser SingleInput and a
Parser SingleOutput. If we had a FilePath, we could convert
it into a SingleInput by using the InputFile constructor.
Remember, InputFile is also a function:

    InputFile :: FilePath -> SingleInput
    OutputFile :: FilePath -> SingleOutput

However, to convert a parser, we need functions with these types:

    f :: Parser FilePath -> Parser SingleInput
    g :: Parser FilePath -> Parser SingleOutput

Fortunately, the Parser interface provides us with a function to "lift"
a function like
FilePath -> SingleInput to work on parsers, making
it a function with the type
Parser FilePath -> Parser SingleInput.
Of course, this function will work for any input and output,
so if we have a function with the type
a -> b, we can pass it to
that function and get a new function of the type
Parser a -> Parser b.
This function is called fmap:

    fmap :: (a -> b) -> Parser a -> Parser b

    -- Or with its infix version
    (<$>) :: (a -> b) -> Parser a -> Parser b

We've seen fmap before in the interface of other types:

    fmap :: (a -> b) -> [a] -> [b]

    fmap :: (a -> b) -> IO a -> IO b

fmap is a type class function like <> and show. It belongs
to the type class Functor:

    class Functor f where
      fmap :: (a -> b) -> f a -> f b

And it has the following laws:

    -- 1. Identity law:
    --    if we don't change the values, nothing should change
    fmap id = id

    -- 2. Composition law:
    --    Composing the lifted functions is the same as composing
    --    them after fmap
    fmap (f . g) == fmap f . fmap g

Any type f that can implement fmap and follow these laws can be a valid
instance of Functor.

Notice how f has the kind * -> *. We can infer the kind of f by looking at the other types in the type signature of fmap:
a and b have the kind * because they are used as arguments/return types of functions
f a has the kind * because it is used as an argument to a function, therefore f has the kind * -> *
Let's choose a data type and see if we can implement a Functor instance.
We need to choose a data type that has the kind * -> *.
Maybe fits the bill.
We need to implement a function
fmap :: (a -> b) -> Maybe a -> Maybe b.
Here's one very simple (and wrong) implementation:

    mapMaybe :: (a -> b) -> Maybe a -> Maybe b
    mapMaybe func maybeX = Nothing

Check it yourself! It compiles successfully! But unfortunately it does not
satisfy the first law.
fmap id = id means that
mapMaybe id (Just x) == Just x, however from the definition we can
clearly see that
mapMaybe id (Just x) == Nothing.
This is a good example of how Haskell doesn't help us make sure the laws
are satisfied, and why they are important. Unlawful Functor instances
will behave differently from how we'd expect a
Functor to behave.
Let's try again!

    mapMaybe :: (a -> b) -> Maybe a -> Maybe b
    mapMaybe func maybeX =
      case maybeX of
        Nothing -> Nothing
        Just x -> Just (func x)

This mapMaybe will satisfy the functor laws. This can be proved
by doing algebra - if we can do substitution and reach the other side of the
equation in each law, then the law holds.
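
As an illustration of what "doing algebra" means here, the sketch below restates the lawful mapMaybe and walks through the identity law by substitution in comments. The value lawsHoldOnSamples is a made-up name for a spot check on a few sample values; it is not a proof, and the composition law can be worked out by substitution in the same way.

```haskell
-- The lawful definition from above, restated so this sketch stands alone.
mapMaybe :: (a -> b) -> Maybe a -> Maybe b
mapMaybe func maybeX =
  case maybeX of
    Nothing -> Nothing
    Just x -> Just (func x)

-- Identity law, case Nothing:
--   mapMaybe id Nothing
-- = Nothing                  -- definition of mapMaybe
-- = id Nothing               -- definition of id
--
-- Identity law, case Just x:
--   mapMaybe id (Just x)
-- = Just (id x)              -- definition of mapMaybe
-- = Just x                   -- definition of id
-- = id (Just x)              -- definition of id

-- A spot check (not a proof) of both laws on sample values.
lawsHoldOnSamples :: Bool
lawsHoldOnSamples =
  and
    [ mapMaybe id (Just 'x') == id (Just 'x')
    , mapMaybe id (Nothing :: Maybe Char) == id Nothing
    , mapMaybe ((+ 1) . (* 2)) (Just (3 :: Int))
        == (mapMaybe (+ 1) . mapMaybe (* 2)) (Just 3)
    ]
```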
Functor is a very important type class, and many types implement this interface.
As we have seen, lists, IO, Maybe, and
Parser all have the kind
* -> *,
and all allow us to map over their "payload" type.
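
Here is a small sketch of fmap at work on three Functor instances from base. The function names (incrementAll, lengthOfName, getLineLength) are invented for this example.

```haskell
-- Lists: the function is applied to every element.
incrementAll :: [Int] -> [Int]
incrementAll = fmap (+ 1)

-- Maybe: the function is applied only if a value is present.
lengthOfName :: Maybe String -> Maybe Int
lengthOfName = fmap length

-- IO: the function is applied to the result of running the action.
-- This action reads a line and returns its length.
getLineLength :: IO Int
getLineLength = length <$> getLine
```

In each case the "shape" (the list, the Maybe, the IO action) stays the same and only the payload type changes.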
Often people try to look for analogies and metaphors for what a type class means, but type classes with funny names like
Functor don't usually have an analogy or a metaphor that fits them in all cases. It is easier to give up on the metaphor and think about it as it is - an interface with laws.
We can use fmap on
Parser to make a parser that returns
InputFile or OutputFile accordingly:

    pInputFile :: Parser SingleInput
    pInputFile = fmap InputFile parser
      where
        parser =
          strOption
            ( long "input"
              <> short 'i'
              <> metavar "FILE"
              <> help "Input file"
            )

    pOutputFile :: Parser SingleOutput
    pOutputFile = OutputFile <$> parser -- fmap and <$> are the same
      where
        parser =
          strOption
            ( long "output"
              <> short 'o'
              <> metavar "FILE"
              <> help "Output file"
            )

Now that we have two parsers,
pInputFile :: Parser SingleInput and
pOutputFile :: Parser SingleOutput,
we want to combine them as
Options. Again, if we only had SingleInput and
SingleOutput, we could use the constructor ConvertSingle:

    ConvertSingle :: SingleInput -> SingleOutput -> Options

Can we do a similar trick to the one we saw before with fmap?
Does a function exist that can lift a binary function to work
on Parsers instead? One with this type signature:

    ???
      :: (SingleInput -> SingleOutput -> Options)
      -> (Parser SingleInput -> Parser SingleOutput -> Parser Options)

Yes. This function is called
liftA2 and it is from the Applicative type class.
Applicative (also known as applicative functor) has three
primary functions:

    class Functor f => Applicative f where
      pure :: a -> f a
      liftA2 :: (a -> b -> c) -> f a -> f b -> f c
      (<*>) :: f (a -> b) -> f a -> f b

Applicative is another very popular type class with many instances.
Just like any
Monoid is a Semigroup, any Applicative is a
Functor. This means that any type that wants to implement
the Applicative interface should also implement the Functor interface.
Beyond what a regular functor can do, which is to lift a function over
a certain f, applicative functors allow us to apply a function to
multiple instances of a certain
f, as well as "lift" any value of type
a into an f a.
You should already be familiar with
pure: we've seen it when we talked about IO.
For IO, pure lets us create an IO action
with a specific return value without doing IO.
With pure for Parser, we can create a
Parser that when run
will return a specific value as output without doing any parsing.

liftA2 and <*> are two functions that can be implemented in
terms of one another.
<*> is actually the more useful one between
the two, because when combined with
fmap (or rather the infix version <$>),
it can be used to apply a function with many arguments, instead of just two.
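
To see how the two functions can be defined in terms of one another, here is a minimal sketch. The names myLiftA2 and myAp are made up to avoid clashing with the real liftA2 and <*>; the definitions are one possible way to write them, not necessarily how any particular instance does it.

```haskell
import Control.Applicative (liftA2)

-- liftA2 written with <$> and <*>.
myLiftA2 :: Applicative f => (a -> b -> c) -> f a -> f b -> f c
myLiftA2 func fa fb = func <$> fa <*> fb

-- <*> written with liftA2: apply the function on the left
-- to the argument on the right.
myAp :: Applicative f => f (a -> b) -> f a -> f b
myAp ff fa = liftA2 (\func x -> func x) ff fa
```

For example, with Maybe as the applicative, myLiftA2 (+) (Just 1) (Just 2) and myAp (Just (+ 1)) (Just 2) both evaluate to Just 3.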
To combine our two parsers to one, we can use either liftA2 or
a combination of <$> and <*>:

    -- with liftA2
    pConvertSingle :: Parser Options
    pConvertSingle =
      liftA2 ConvertSingle pInputFile pOutputFile

    -- with <$> and <*>
    pConvertSingle :: Parser Options
    pConvertSingle =
      ConvertSingle <$> pInputFile <*> pOutputFile

Note that both <$> and
<*> associate to the left,
so we have invisible parentheses that look like this:

    pConvertSingle :: Parser Options
    pConvertSingle =
      (ConvertSingle <$> pInputFile) <*> pOutputFile

Let's take a deeper look at the types of the sub-expressions we have here, to prove that this type-checks:

    pConvertSingle :: Parser Options

    pInputFile :: Parser SingleInput
    pOutputFile :: Parser SingleOutput

    ConvertSingle :: SingleInput -> SingleOutput -> Options

    (<$>) :: (a -> b) -> Parser a -> Parser b
      -- Specifically, here `a` is `SingleInput`
      -- and `b` is `SingleOutput -> Options`,

    ConvertSingle <$> pInputFile :: Parser (SingleOutput -> Options)

    (<*>) :: Parser (a -> b) -> Parser a -> Parser b
      -- Specifically, here `a -> b` is `SingleOutput -> Options`
      -- so `a` is `SingleOutput` and `b` is `Options`

    -- So we get:
    (ConvertSingle <$> pInputFile) <*> pOutputFile :: Parser Options

With <$> and <*> we can chain as many parsers (or any applicative really)
as we want. This is because of two things: currying and parametric polymorphism.
Because functions in Haskell take exactly one argument and return exactly one value,
any multiple argument function can be represented as
a -> b.
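
The same chaining works for any applicative and any number of arguments. As a sketch, here it is with Maybe and a three-argument constructor, one step at a time. The Person type and the step names are hypothetical and used only for illustration.

```haskell
-- A hypothetical three-field type, used only for illustration.
data Person = Person String Int Bool
  deriving (Eq, Show)

-- Person :: String -> Int -> Bool -> Person
-- Because of currying this is also String -> (Int -> (Bool -> Person)),
-- so it fits `a -> b` with `b` being `Int -> Bool -> Person`.

step1 :: Maybe (Int -> Bool -> Person)
step1 = Person <$> Just "Ada"

step2 :: Maybe (Bool -> Person)
step2 = step1 <*> Just 36

step3 :: Maybe Person
step3 = step2 <*> Just True

-- The same thing written in one go.
person :: Maybe Person
person = Person <$> Just "Ada" <*> Just 36 <*> Just True
```

Each use of <*> supplies one more argument, and the type inside the Maybe loses one arrow until only Person is left.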
You can find the laws for the applicative functors in this article called Typeclassopedia, which talks about various useful type classes and their laws.
Applicative functor is a very important concept and will appear in various
parser interfaces (not just for command-line arguments, but also JSON
parsers and general parsers), I/O, concurrency, non-determinism, and more.
The reason this library is called optparse-applicative is that
it uses the
Applicative interface as the main API for
constructing command-line argument parsers.

Exercise: create a similar interface for the
ConvertDir constructor of Options.

Solution:

    pInputDir :: Parser FilePath
    pInputDir =
      strOption
        ( long "input"
          <> short 'i'
          <> metavar "DIRECTORY"
          <> help "Input directory"
        )

    pOutputDir :: Parser FilePath
    pOutputDir =
      strOption
        ( long "output"
          <> short 'o'
          <> metavar "DIRECTORY"
          <> help "Output directory"
        )

    pConvertDir :: Parser Options
    pConvertDir =
      ConvertDir <$> pInputDir <*> pOutputDir

One thing we forgot about is that each input and output for
ConvertSingle could also potentially use the standard input and output instead.
Up until now we only offered one option: reading from or writing to a file
by specifying the flags --input and --output.
However, we'd like to make these flags optional, and when they are
not specified, use the alternative standard i/o. We can do that by using
the optional function from Control.Applicative:

    optional :: Alternative f => f a -> f (Maybe a)

optional works on types which implement instances of the
Alternative type class:

    class Applicative f => Alternative f where
      (<|>) :: f a -> f a -> f a
      empty :: f a

Alternative looks very similar to the
Monoid type class,
but it works on applicative functors. This type class isn't
very common and is mostly used for parsing libraries as far as I know.
It provides us with an interface to combine two Parsers -
if the first one fails to parse, try the other.
It also provides other useful functions such as optional,
which will help us with our case:

    pSingleInput :: Parser SingleInput
    pSingleInput =
      fromMaybe Stdin <$> optional pInputFile

    pSingleOutput :: Parser SingleOutput
    pSingleOutput =
      fromMaybe Stdout <$> optional pOutputFile

Note that with
fromMaybe :: a -> Maybe a -> a we can extract
the a out of the
Maybe by supplying a value for the Nothing case.
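
Maybe is also an instance of Alternative, so we can try <|>, optional, and fromMaybe without involving Parser at all. The sketch below uses Maybe as a stand-in, where Nothing plays the role of a failed parse; the value names are invented for this example.

```haskell
import Control.Applicative (optional, (<|>))
import Data.Maybe (fromMaybe)

-- <|> picks the first alternative that succeeds.
firstSuccess :: Maybe Int
firstSuccess = Nothing <|> Just 2 <|> Just 3

-- optional turns a failure into a successful Nothing,
-- and a success into a successful Just.
optionalFailure :: Maybe (Maybe Int)
optionalFailure = optional Nothing

optionalSuccess :: Maybe (Maybe Int)
optionalSuccess = optional (Just 5)

-- fromMaybe then supplies the default for the Nothing case,
-- mirroring `fromMaybe Stdin <$> optional pInputFile`.
withDefault :: Maybe Int
withDefault = fromMaybe 0 <$> optional Nothing
```

Here firstSuccess is Just 2, optionalFailure is Just Nothing, optionalSuccess is Just (Just 5), and withDefault is Just 0.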
Now we can use these more appropriate functions in pConvertSingle instead:

    pConvertSingle :: Parser Options
    pConvertSingle =
      ConvertSingle <$> pSingleInput <*> pSingleOutput

Commands and subparsers
We currently have two possible operations in our interface,
convert a single source, or convert a directory. A nice interface for
selecting the right operation would be via commands.
If the user would like to convert a single source, they can use
convert, for a directory, convert-dir.
We can create a parser with commands with the subparser and command functions:

    subparser :: Mod CommandFields a -> Parser a

    command :: String -> ParserInfo a -> Mod CommandFields a

subparser takes command modifiers (which can be constructed
with the command function) as input, and produces a Parser.
command takes the command name (in our case "convert" or "convert-dir")
and a ParserInfo a, and produces a command modifier. As we've seen
before these modifiers have a
Monoid instance and they can be
composed, meaning that we can append multiple commands to serve as alternatives.
A ParserInfo a can be constructed with the info function:

    info :: Parser a -> InfoMod a -> ParserInfo a

This function wraps a
Parser with some additional information
such as a helper message, description, and more, so that the program
itself and each sub command can print some additional information.
Let's see how to construct a ParserInfo:

    pConvertSingleInfo :: ParserInfo Options
    pConvertSingleInfo =
      info
        (helper <*> pConvertSingle)
        (progDesc "Convert a single markup source to html")

helper adds a helper output screen in case the parser fails.
Let's also build a command:

    pConvertSingleCommand :: Mod CommandFields Options
    pConvertSingleCommand =
      command "convert" pConvertSingleInfo

Try creating a
Parser Options combining the two options with subparser.

Solution:

    pOptions :: Parser Options
    pOptions =
      subparser
        ( command
          "convert"
          ( info
            (helper <*> pConvertSingle)
            (progDesc "Convert a single markup source to html")
          )
          <> command
            "convert-dir"
            ( info
              (helper <*> pConvertDir)
              (progDesc "Convert a directory of markup files to html")
            )
        )

Since we finished building a parser, we should wrap it up in a ParserInfo
and add some information to it to make it ready to run:

    opts :: ParserInfo Options
    opts =
      info
        (helper <*> pOptions)
        ( fullDesc
          <> header "hs-blog-gen - a static blog generator"
          <> progDesc "Convert markup files or directories to html"
        )

Running a parser
optparse-applicative provides a non-IO interface to parse arguments,
but the most convenient way to use it is to let it take care of fetching
program arguments, try to parse them, and throw errors and help messages in case
it fails. This can be done with the function
execParser :: ParserInfo a -> IO a.
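
For experimenting or testing, the non-IO interface can be handy. The following is a small sketch, assuming the execParserPure, defaultPrefs, and getParseResult functions exported by Options.Applicative; pInput and parsePure are names made up for this example, and the parser is a stand-alone single-flag parser rather than our full pOptions.

```haskell
import Options.Applicative

-- A small stand-alone parser for a single --input flag.
pInput :: Parser FilePath
pInput =
  strOption
    ( long "input"
      <> short 'i'
      <> metavar "FILE"
      <> help "Input file"
    )

-- Run the parser on an explicit list of arguments, without IO.
-- Returns Nothing when parsing fails (error details are dropped).
parsePure :: [String] -> Maybe FilePath
parsePure args =
  getParseResult (execParserPure defaultPrefs (info pInput fullDesc) args)
```

With this, parsePure ["--input", "a.txt"] gives Just "a.txt", while parsePure [] gives Nothing because the flag is mandatory. In the program itself we will stick with execParser, which also prints the error and help messages for us.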
We can place all this options parsing stuff in a new module
and then import it from
app/Main.hs. Let's do that.
Here's what we have up until now:

    -- | Command-line options parsing

    module OptParse
      ( Options(..)
      , SingleInput(..)
      , SingleOutput(..)
      , parse
      )
      where

    import Data.Maybe (fromMaybe)
    import Options.Applicative

    ------------------------------------------------
    -- * Our command-line options model

    -- | Model
    data Options
      = ConvertSingle SingleInput SingleOutput
      | ConvertDir FilePath FilePath
      deriving Show

    -- | A single input source
    data SingleInput
      = Stdin
      | InputFile FilePath
      deriving Show

    -- | A single output sink
    data SingleOutput
      = Stdout
      | OutputFile FilePath
      deriving Show

    ------------------------------------------------
    -- * Parser

    -- | Parse command-line options
    parse :: IO Options
    parse = execParser opts

    opts :: ParserInfo Options
    opts =
      info (pOptions <**> helper)
        ( fullDesc
          <> header "hs-blog-gen - a static blog generator"
          <> progDesc "Convert markup files or directories to html"
        )

    -- | Parser for all options
    pOptions :: Parser Options
    pOptions =
      subparser
        ( command
          "convert"
          ( info
            (helper <*> pConvertSingle)
            (progDesc "Convert a single markup source to html")
          )
          <> command
            "convert-dir"
            ( info
              (helper <*> pConvertDir)
              (progDesc "Convert a directory of markup files to html")
            )
        )

    ------------------------------------------------
    -- * Single source to sink conversion parser

    -- | Parser for single source to sink option
    pConvertSingle :: Parser Options
    pConvertSingle =
      ConvertSingle <$> pSingleInput <*> pSingleOutput

    -- | Parser for single input source
    pSingleInput :: Parser SingleInput
    pSingleInput =
      fromMaybe Stdin <$> optional pInputFile

    -- | Parser for single output sink
    pSingleOutput :: Parser SingleOutput
    pSingleOutput =
      fromMaybe Stdout <$> optional pOutputFile

    -- | Input file parser
    pInputFile :: Parser SingleInput
    pInputFile = fmap InputFile parser
      where
        parser =
          strOption
            ( long "input"
              <> short 'i'
              <> metavar "FILE"
              <> help "Input file"
            )

    -- | Output file parser
    pOutputFile :: Parser SingleOutput
    pOutputFile = OutputFile <$> parser
      where
        parser =
          strOption
            ( long "output"
              <> short 'o'
              <> metavar "FILE"
              <> help "Output file"
            )

    ------------------------------------------------
    -- * Directory conversion parser

    pConvertDir :: Parser Options
    pConvertDir =
      ConvertDir <$> pInputDir <*> pOutputDir

    -- | Parser for input directory
    pInputDir :: Parser FilePath
    pInputDir =
      strOption
        ( long "input"
          <> short 'i'
          <> metavar "DIRECTORY"
          <> help "Input directory"
        )

    -- | Parser for output directory
    pOutputDir :: Parser FilePath
    pOutputDir =
      strOption
        ( long "output"
          <> short 'o'
          <> metavar "DIRECTORY"
          <> help "Output directory"
        )

Pattern matching on Options
After running the command-line arguments parser, we can pattern match
on our model and call the right functions. Currently, our program
does not expose this kind of API. So let's go to our HsBlog.hs
module and change the API. We can delete
main from that file and
add two new functions instead:

    convertSingle :: Html.Title -> Handle -> Handle -> IO ()

    convertDirectory :: FilePath -> FilePath -> IO ()

Handle is an I/O abstraction over file system objects, including
stdin and stdout.
Before, we used writeFile and
getContents - these functions either
get a FilePath to open and work on, or they assume the
Handle is the standard I/O.
We can use the explicit versions that take a Handle from System.IO instead:

    convertSingle :: Html.Title -> Handle -> Handle -> IO ()
    convertSingle title input output = do
      content <- hGetContents input
      hPutStrLn output (process title content)

We will leave
convertDirectory unimplemented for now and implement it in the next chapter.
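
The benefit of taking Handles is that one function can serve both the standard streams and files. Below is a rough sketch of that idea using only System.IO; transformHandle, echoStdin, and copyViaHandles are hypothetical names for this example and are not part of HsBlog.

```haskell
import System.IO

-- Hypothetical example: copy everything from one handle to another,
-- applying a pure transformation on the way.
transformHandle :: (String -> String) -> Handle -> Handle -> IO ()
transformHandle func input output = do
  content <- hGetContents input
  hPutStr output (func content)

-- The same function works with the standard handles...
echoStdin :: IO ()
echoStdin = transformHandle id stdin stdout

-- ...and with handles opened from files. withFile closes each handle
-- when the inner action finishes.
copyViaHandles :: FilePath -> FilePath -> IO ()
copyViaHandles from to =
  withFile from ReadMode $ \input ->
    withFile to WriteMode $ \output ->
      transformHandle id input output
```

Note that hGetContents reads lazily; in this sketch the content is fully written by hPutStr before either handle is closed.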
In app/Main.hs, we will need to pattern match on the Options and
prepare to call the right functions from HsBlog.

Let's look at our full app/Main.hs and HsBlog.hs:

    -- | Entry point for the hs-blog-gen program

    module Main where

    import OptParse
    import qualified HsBlog

    import System.Exit (exitFailure)
    import System.Directory (doesFileExist)
    import System.IO

    main :: IO ()
    main = do
      options <- parse
      case options of
        ConvertDir input output ->
          HsBlog.convertDirectory input output

        ConvertSingle input output -> do
          (title, inputHandle) <-
            case input of
              Stdin ->
                pure ("", stdin)
              InputFile file ->
                (,) file <$> openFile file ReadMode

          outputHandle <-
            case output of
              Stdout -> pure stdout
              OutputFile file -> do
                exists <- doesFileExist file
                shouldOpenFile <-
                  if exists
                    then confirm
                    else pure True
                if shouldOpenFile
                  then
                    openFile file WriteMode
                  else
                    exitFailure

          HsBlog.convertSingle title inputHandle outputHandle
          hClose inputHandle
          hClose outputHandle

    ------------------------------------------------
    -- * Utilities

    -- | Confirm user action
    confirm :: IO Bool
    confirm =
      putStrLn "Are you sure? (y/n)" *>
        getLine >>= \answer ->
          case answer of
            "y" -> pure True
            "n" -> pure False
            _ -> putStrLn "Invalid response. use y or n" *>
              confirm

    -- HsBlog.hs
    module HsBlog
      ( convertSingle
      , convertDirectory
      , process
      )
      where

    import qualified HsBlog.Markup as Markup
    import qualified HsBlog.Html as Html
    import HsBlog.Convert (convert)

    import System.IO

    convertSingle :: Html.Title -> Handle -> Handle -> IO ()
    convertSingle title input output = do
      content <- hGetContents input
      hPutStrLn output (process title content)

    convertDirectory :: FilePath -> FilePath -> IO ()
    convertDirectory = error "Not implemented"

    process :: Html.Title -> String -> String
    process title = Html.render . convert title . Markup.parse

We need to make a few small changes to the .cabal file.
First, we need to add the dependency
directory to the executable section,
because we use the library module System.Directory in Main.
Second, we need to list
OptParse in the list of modules in the executable section:

     executable hs-blog-gen
       import: common-settings
       hs-source-dirs: app
       main-is: Main.hs
    +  other-modules:
    +    OptParse
       build-depends:
           base
    +    , directory
         , optparse-applicative
         , hs-blog
       ghc-options:
         -O

We've learned about a new fancy library called optparse-applicative
and used it to create a fancier command-line interface in a declarative way.
See the result of running
hs-blog-gen --help (or the equivalent
stack commands we discussed in the last chapter):

    hs-blog-gen - a static blog generator

    Usage: hs-blog-gen COMMAND
      Convert markup files or directories to html

    Available options:
      -h,--help                Show this help text

    Available commands:
      convert                  Convert a single markup source to html
      convert-dir              Convert a directory of markup files to html

Along the way we've learned two powerful new abstractions, Functor and
Applicative, as well as revisited an abstraction
called Monoid. With this library we've seen another example
of the usefulness of these abstractions for constructing APIs and EDSLs.
We will continue to meet these abstractions in the rest of the book.
Bonus exercise: Add another flag named
--replace to indicate that
if the output file or directory already exists, it's okay to replace it.
You can view the git commit of the changes we've made and the code up until now.