In this section we'll learn how to create our own distinguished types for HTML, and how they can help us avoid invalid construction of HTML strings.
There are a few ways of defining new types in Haskell. In this section
we are going to meet two of them: newtype and type.
newtype lets us give a new name to an already existing type in a
way that the two cannot mix together.
A newtype declaration looks like this:
newtype <type-name> = <constructor> <existing-type>
For example in our case we can define a distinct type for
Html like this:
newtype Html = Html String
The first Html, to the left of the equals sign, lives in the types
name space, meaning that you will only see that name to the right of a
double-colon sign (::).
The second Html lives in the expressions (or terms/values) name space,
meaning that you will see it where you expect expressions (we'll touch on where
exactly that can be in a moment).
The two names, <type-name> and <constructor>, do not have to be the
same, but they often are. And note that both have to start with a
capital letter.
The right-hand side of the newtype declaration describes the shape of a
value of that type. In our case, we expect a value of
type Html to have the constructor Html and then an expression of
type String, for example: Html "hello" or Html ("hello " <> "world").
You can think of the constructor as a function that takes the argument and returns something of our new type:
Html :: String -> Html
Note: We cannot use an expression of type Html the same way we'd
use a String. So "hello " <> Html "world" would fail at type
checking.
This is useful when we want encapsulation. We can define and use the existing representation and functions for our underlying type, but not mix them with other types that are unrelated to our domain. Similarly, meters and feet can both be represented as numbers, but we don't want to accidentally add feet to meters without any conversion.
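To make the meters and feet analogy concrete, here is a minimal sketch. The types Meters and Feet and the functions around them are made up for illustration and are not part of our HTML library. The argument patterns such as (Feet ft) are a form of pattern matching, which we cover later in this section.

```haskell
-- Hypothetical types for illustration; not part of our HTML library.
newtype Meters = Meters Double
newtype Feet = Feet Double

-- Converting is explicit: unwrap the Feet, scale, and wrap as Meters.
feetToMeters :: Feet -> Meters
feetToMeters (Feet ft) = Meters (ft * 0.3048)

-- Adding only accepts two Meters values.
addMeters :: Meters -> Meters -> Meters
addMeters (Meters a) (Meters b) = Meters (a + b)

-- Extract the underlying number again.
getMeters :: Meters -> Double
getMeters (Meters m) = m

-- Feet can only take part after an explicit conversion.
total :: Meters
total = addMeters (Meters 100) (feetToMeters (Feet 10))

-- This would be rejected by the type checker, because Feet is not Meters:
-- wrong = addMeters (Meters 100) (Feet 10)
```

Both types wrap a Double, yet the only way to combine them is through feetToMeters.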
For now, let's create a couple of types for our use case. We want two separate types to represent:
- A complete HTML document
- A type for HTML structures such as headers and paragraphs that can go inside the <body> tag
We want them to be distinct because we don't want to mix them together.
newtype Html = Html String
newtype Structure = Structure String
In order to use the underlying type that the newtype wraps, we first need to extract it out of the type. We do this using pattern matching.
Pattern matching can be used in two ways: in case expressions and in function definitions.
case expressions are kind of beefed up switch expressions and look like this:
case <expression> of
  <pattern> -> <expression>
  ...
  <pattern> -> <expression>
Here <expression> is the thing we want to unpack, and the
<pattern> is its concrete shape. For example, if we wanted to extract the
String out of the type
Structure we defined in the exercise above, we do:
getStructureString :: Structure -> String
getStructureString struct =
  case struct of
    Structure str -> str
This way we can extract the
String out of the Structure and return it.
In later chapters we'll introduce
data declarations (which are kind of a struct + enum chimera), where we can define multiple constructors to a type. Then the multiple patterns of a case expression will make more sense.
Alternatively, when declaring a function, we can also use pattern matching on the arguments:
func <pattern> = <expression>
getStructureString :: Structure -> String
getStructureString (Structure str) = str
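The same two styles work for any newtype. As a small sketch, here they are side by side for Html. The function names below are made up for this comparison.

```haskell
newtype Html = Html String

-- Extraction with a case expression.
getHtmlStringCase :: Html -> String
getHtmlStringCase html =
  case html of
    Html str -> str

-- The same extraction, pattern matching directly on the argument.
getHtmlStringArg :: Html -> String
getHtmlStringArg (Html str) = str

-- Once unwrapped, the String can be passed to any String function.
htmlLength :: Html -> Int
htmlLength (Html str) = length str
```

Both versions behave identically. Matching on the argument is usually shorter when there is only one pattern to handle.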
Using the types we created, we can change the HTML functions we've defined before, namely
p_, etc., to operate on these types instead of
String.
But first let's meet another operator that will make our code more concise.
One very cool thing about
newtype is that wrapping and extracting expressions doesn't actually
have a performance cost! The compiler knows to remove any wrapping and extraction
of the newtype constructor and use the underlying type.
The new type and the constructor we defined are only there to help us distinguish between the type we created and the underlying type when we write our code; they are not needed when the code is running.
newtypes provide us with type safety with no performance penalty!
Another interesting and extremely common operator
(which is a regular library function in Haskell) is
. (pronounced compose).
This operator was made to look like the composition operator
you may know from math (∘).
Let's look at its type and implementation:
(.) :: (b -> c) -> (a -> b) -> a -> c
(.) f g x = f (g x)
Compose takes 3 arguments: two functions (named
f and g here) and
a third argument named
x. It then passes the argument
x to the second function
g, and calls the first function
f with the result of g x.
Note that g takes as input something of the type
a and returns something of the type b, and f takes
something of the type
b, and returns something of the type c.
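As a small sketch of how this plays out, here are two functions combined first by hand and then with compose. The functions exclaim, countChars, and countExclaimed are made up for this example.

```haskell
-- Plays the role of g: String -> String (a ~ String, b ~ String).
exclaim :: String -> String
exclaim str = str <> "!"

-- Plays the role of f: String -> Int (b ~ String, c ~ Int).
countChars :: String -> Int
countChars = length

-- Written out by hand: apply exclaim first, then countChars.
countExclaimed :: String -> Int
countExclaimed x = countChars (exclaim x)

-- The same function written with compose.
countExclaimed' :: String -> Int
countExclaimed' = countChars . exclaim
```

Note the order: the function on the right of the dot runs first, and its result is fed to the function on the left.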
Another important thing to note is that types which start with
a lowercase letter are type variables.
Think of them as similar to regular variables. Just like
content could be any string, such as
"world", a type variable
can be any type:
String, String -> String, etc.
This ability is called parametric polymorphism (other languages often call this generics).
The catch is that type variables must match in a signature, so if for
example we write a function with the type signature
a -> a, the
input type and the return type must match, but it could be
any type - we cannot know what it is. So the only way to implement a
function with that signature is:
id :: a -> a
id x = x
id, short for the identity function, returns the exact value it received.
If we tried any other way, for example returning some made up value
like "hello", or trying to use
x like a value of a type we know, such as writing
x + x, the type checker will complain.
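Here is a minimal sketch of the same idea. It uses a local copy of the function named identity (a made up name, chosen to avoid clashing with the id that Haskell already provides), and keeps the failing attempts as comments.

```haskell
-- A local copy of id, named identity to avoid clashing with the Prelude's id.
identity :: a -> a
identity x = x

-- The same function used at two different types.
sameString :: String
sameString = identity "hello"

sameBool :: Bool
sameBool = identity True

-- Neither of these would type check with the signature a -> a:
--
-- broken1 :: a -> a
-- broken1 x = "hello"   -- the result must have type a, not String
--
-- broken2 :: a -> a
-- broken2 x = x + x     -- we know nothing about a, so we cannot add
```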
Also, remember that
-> is right associative? The signature of (.) is equivalent to:
(.) :: (b -> c) -> (a -> b) -> (a -> c)
Doesn't it look like a function that takes two functions and returns a third function that is the composition of the two?
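One way to see this is to write our own version with a lambda on the right-hand side. This is only a sketch: compose and addOneThenDouble are made up names, and the library's (.) is the one we will keep using.

```haskell
-- A hypothetical re-implementation of (.), named compose.
-- The lambda makes it visible that the result is itself a function.
compose :: (b -> c) -> (a -> b) -> (a -> c)
compose f g = \x -> f (g x)

-- Supplying only the two functions yields a new function.
addOneThenDouble :: Int -> Int
addOneThenDouble = compose (* 2) (+ 1)

-- The library operator behaves the same way.
addOneThenDouble' :: Int -> Int
addOneThenDouble' = (* 2) . (+ 1)
```

Because -> is right associative, writing the signature with or without the last pair of parentheses describes the same function.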
We can now use this operator to change our HTML functions. Let's start
with one example:
Before, we had:
p_ :: String -> String
p_ = el "p"
And now, we can write:
p_ :: String -> Structure
p_ = Structure . el "p"
The function p_ will take an arbitrary
String which is the content
of the paragraph we wish to create, will wrap it in
<p> and </p> tags,
and then wrap it in the
Structure constructor to produce the output type
Structure (remember: newtype constructors can be used as functions!).
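If the compose version feels terse, it may help to compare it with a version that names its argument. In this sketch pExplicit_ is a made up name used only for the comparison; el and getStructureString are the definitions from this chapter.

```haskell
newtype Structure = Structure String

el :: String -> String -> String
el tag content =
  "<" <> tag <> ">" <> content <> "</" <> tag <> ">"

-- With compose, as in the text.
p_ :: String -> Structure
p_ = Structure . el "p"

-- The same function with the argument written out explicitly.
pExplicit_ :: String -> Structure
pExplicit_ content = Structure (el "p" content)

getStructureString :: Structure -> String
getStructureString (Structure str) = str
```

Both definitions wrap the content in the paragraph tags first and apply the Structure constructor last.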
Let's take a deeper look and see what the types of the functions involved are:
Structure :: String -> Structure
el "p" :: String -> String
Structure . el "p" :: String -> Structure
(.) :: (b -> c) -> (a -> b) -> (a -> c)
When we try to figure out if an expression type checks, we try to match the types and see if they work. If they are the same type, all is well. If one of them is a type variable and the other isn't, we write down that the type variable should now be the concrete type, and see if everything still works.
So in our case we know from the type signature that the input type to
p_ is String and the output type is Structure. This means that:
1. a is equivalent to String (we write ~ to denote equivalence), and
2. c ~ Structure
We also know that:
3. b ~ String, because we pass el "p" to . as the second argument, which means String -> String must match with a -> b
4. We pass Structure to . as the first argument, which means String -> Structure must match with the type of the first argument of ., which is b -> c, so:
   - b ~ String, which fits with our previous knowledge from (3)
   - -> ~ ->
   - c ~ Structure, which also fits with (2)
We keep doing this process until we come to the conclusion that there aren't any types that don't match (we don't have two different concrete types that are supposed to be equivalent).
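We can also ask the compiler to confirm this reasoning. The sketch below writes down the type of (.) with a ~ String, b ~ String, and c ~ Structure substituted in, under the made up name composeForP. If the substitution were inconsistent, this would not compile.

```haskell
newtype Structure = Structure String

el :: String -> String -> String
el tag content =
  "<" <> tag <> ">" <> content <> "</" <> tag <> ">"

-- (.) with a ~ String, b ~ String, c ~ Structure substituted in.
-- The name composeForP is made up for this sketch.
composeForP
  :: (String -> Structure) -> (String -> String) -> (String -> Structure)
composeForP = (.)

p_ :: String -> Structure
p_ = composeForP Structure (el "p")

-- A mismatch for comparison: length returns an Int, but Structure
-- expects a String, so b would have to be both Int and String.
-- The type checker rejects this:
--
-- broken = Structure . length
```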
Note: If we use a parametrically polymorphic function more than once, or use different functions that have similar type variable names, the type variables don't have to match in all instances simply because they share a name. Each instance has its own unique set of type variables. For example:
import Data.Char (chr, ord)

-- The types of the functions we use:
-- id  :: a -> a
-- ord :: Char -> Int
-- chr :: Int -> Char

incrementChar :: Char -> Char
incrementChar c = chr (ord (id c) + id 1)
In the snippet above, we use
id twice (for no good reason other than for demonstration purposes). The first
id takes a Char as an argument, and its
a is equivalent to
Char. The second
id takes an Int as an argument, and its distinct
a is equivalent to
Int.
This unfortunately only applies to functions that are defined with their own polymorphic type signature, such as those defined at the top level. If we instead pass a function as an argument to
incrementChar, giving that argument the same type as
id, the types must match in all uses. So this code:
incrementChar :: (a -> a) -> Char -> Char
incrementChar func c = chr (ord (func c) + func 1)
Will not type check.
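One simple way around this, sketched below, is to accept two separate arguments, each with its own concrete type. The name incrementCharWith and its parameters are made up for this sketch, and this is just one possible workaround.

```haskell
import Data.Char (chr, ord)

-- Each argument has its own concrete type, so both uses type check.
-- The name incrementCharWith and its parameters are made up for this sketch.
incrementCharWith :: (Char -> Char) -> (Int -> Int) -> Char -> Char
incrementCharWith onChar onInt c = chr (ord (onChar c) + onInt 1)

-- The library's id is polymorphic, so it can be passed for both arguments.
-- Each use gets its own type: Char -> Char and Int -> Int.
incrementChar :: Char -> Char
incrementChar = incrementCharWith id id
```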
Before, when we wanted to create richer HTML content and appended
nodes to one another, we used the append (<>) operator.
Since we are not using
String anymore, we need another way
to do it.
While it is possible to overload
<> using a feature in
Haskell called type classes, we will instead create a new function
and call it
append_, and cover type classes later.
append_ should take two
Structures, and return a third Structure,
appending the inner
String in the first
Structure to the second and wrapping the result back in Structure.
append_ :: Structure -> Structure -> Structure
append_ (Structure a) (Structure b) = Structure (a <> b)
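Since append_ combines two values at a time, longer content needs nested calls. As a sketch of where this could go, the helper below appends a whole list. The name concatStructures is made up, and it relies on the library function foldr, which we have not covered.

```haskell
newtype Structure = Structure String

append_ :: Structure -> Structure -> Structure
append_ (Structure a) (Structure b) = Structure (a <> b)

getStructureString :: Structure -> String
getStructureString (Structure str) = str

-- Nested calls, as we would write them by hand.
threeParts :: Structure
threeParts =
  append_ (Structure "a") (append_ (Structure "b") (Structure "c"))

-- Hypothetical helper: append a list of structures in order,
-- ending with an empty Structure.
concatStructures :: [Structure] -> Structure
concatStructures = foldr append_ (Structure "")
```

The order of the list is preserved in the result, and an empty list gives an empty Structure.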
After constructing a valid
Html value, we want to be able to
print it to the output so we can display it in our browser.
For that, we need to write a function that takes an
Html and converts it to a
String, which we can then pass to putStrLn.
render :: Html -> String
render html =
  case html of
    Html str -> str
Let's look at one more way to give new names to types.
A type definition looks really similar to a
newtype definition - the only
difference is that we reference the type name directly without a constructor:
type <type-name> = <existing-type>
For example in our case we can write:
type Title = String
type, in contrast with
newtype, is just a type name alias.
When we declare that
Title is a type alias of String,
we mean that
Title and String are interchangeable,
and we can use one or the other whenever we want:
"hello" :: Title
"hello" :: String
Both are valid in this case.
We can sometimes use
types to give a bit more clarity to our code,
but they are much less useful than
newtypes which allow us to
distinguish between two types that have the same type representation.
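The difference shows up as soon as we pass values around. In this sketch, titleLength, plainString, and structureLength are made up names for illustration.

```haskell
type Title = String

newtype Structure = Structure String

-- Title is only an alias, so any String function accepts it as is.
titleLength :: Title -> Int
titleLength title = length title

plainString :: String
plainString = "My title"

-- A String can be passed where a Title is expected, with no conversion.
aliasExample :: Int
aliasExample = titleLength plainString

-- A newtype must be unwrapped first.
structureLength :: Structure -> Int
structureLength (Structure str) = length str

-- This would be rejected, because Structure is not a String:
-- wrong = titleLength (Structure "hello")
```

The alias documents intent for the reader, while the newtype is enforced by the type checker.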
Try changing the code we wrote in previous chapters to use the new types we created.
We can combine the earlier helper functions with
html_, and remove the ones we no longer need, so that
html_ can now have the type
Title -> Structure -> Html. This will make our HTML EDSL less flexible but more compact.
Alternatively, we could create dedicated newtypes such as
HtmlBody and pass those to
html_, and we might do that in later chapters, but I've chosen to keep the API a bit simple for now. We can always refactor later!
-- hello.hs

main :: IO ()
main = putStrLn (render myhtml)

myhtml :: Html
myhtml =
  html_
    "My title"
    ( append_
      (h1_ "Header")
      ( append_
        (p_ "Paragraph #1")
        (p_ "Paragraph #2")
      )
    )

newtype Html = Html String

newtype Structure = Structure String

type Title = String

html_ :: Title -> Structure -> Html
html_ title content =
  Html
    ( el "html"
      ( el "head" (el "title" title)
        <> el "body" (getStructureString content)
      )
    )

p_ :: String -> Structure
p_ = Structure . el "p"

h1_ :: String -> Structure
h1_ = Structure . el "h1"

el :: String -> String -> String
el tag content =
  "<" <> tag <> ">" <> content <> "</" <> tag <> ">"

append_ :: Structure -> Structure -> Structure
append_ c1 c2 =
  Structure (getStructureString c1 <> getStructureString c2)

getStructureString :: Structure -> String
getStructureString content =
  case content of
    Structure str -> str

render :: Html -> String
render html =
  case html of
    Html str -> str
We have made some progress - now we can't write a plain String
where we'd expect either a paragraph or a header, but we can still
write Structure "hello" and get something that isn't a
paragraph or a header. So while we made it harder for the user
to make mistakes by accident, we haven't really been able to enforce
the invariants we wanted to enforce in our library.
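The following sketch shows the remaining loophole next to the intended use. The names goodParagraph and notReallyAStructure are made up for this example.

```haskell
newtype Structure = Structure String

el :: String -> String -> String
el tag content =
  "<" <> tag <> ">" <> content <> "</" <> tag <> ">"

p_ :: String -> Structure
p_ = Structure . el "p"

getStructureString :: Structure -> String
getStructureString (Structure str) = str

-- Intended use: content is always wrapped in a known tag.
goodParagraph :: Structure
goodParagraph = p_ "hello"

-- Still accepted by the type checker, although it is neither a
-- paragraph nor a header: the constructor wraps any String at all.
notReallyAStructure :: Structure
notReallyAStructure = Structure "hello"
```

Both values have the type Structure, so nothing stops the second one from ending up inside our document.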
Next we'll see how we can make expressions such as
Structure "hello" illegal
as well using modules and smart constructors.