The Overflow

martes, 5 de enero de 2016

Hamlet and Plain Haskell

A coworker of mine just asked about my preference for generating markup in haskell web applications. I figured I would write up my response as a blog post. We are working on a yesod web application, so understand that this discussion takes place within that context. We are using WidgetT (yesod's html/css/javascript builder type).

There are two options we have:

Shakespearean templates: These is a template haskell solution provided by the shakespeare package.
Plain haskell: I have written a package named yesod-bootstrap. This is similar in spirit to lucid. However, it uses yesod's WidgetT instead of lucid's HtmlT.

Let's look at the difference between the two for a simple piece of html that uses several bootstrap classes and components. With hamlet, we can write:

<div.container>
  <div.col-md-9>
    <h1>Your page
    <p>This is the body copy
  <div.col-md-3>
    <h3>Sidebar
    <p>The sidebar copy goes here

Now we will do the same with plain haskell. It's worthing mentioning that WidgetT site IO () has an IsString instance. Also, we are assuming that Yesod.Bootstrap has been imported.

div_ [("class","container")] $ do
  div_ [("class","col-md-9")] $ do
    h1_ [] "Your page"
    p_ [] "This is the body copy"
  div_ [("class","col-md-3")] $ do
    h3_ [] "Sidebar"
    p_ [] "The sidebar copy goes here"

The two look remarkably similar. The main difference is that the plain haskell version is more verbose. Although, it is worth mentioning that by using some of the combinators in Yesod.Bootstrap, it could be rewritten as:

container $ do
  col [ColSize Medium 9] $ do
    h1_ [] "Your page"
    p_ [] "This is the body copy"
  col [ColSize Medium 3] $ do
    h3_ [] "Sidebar"
    p_ [] "The sidebar copy goes here"

The container and col functions help reduce some of the boilerplate involved with using bootstrap. Let's look at the type signature and definition of container:

container :: WidgetT site IO () -> WidgetT site IO ()
container = div_ [("class","container")]

At first glance, it seems like we should be able to use container in a shakespearean template. After all, the ^{..} widget interpolation should work. We would like to be able to write:

^{container}
  <p>Stuff in the container

But that doesn't work. The problem is that this construct only accepts a fully applied value. So we would have to write:

^{container stuffInContainer}

And stuffInContainer needs to be defined elsewhere. So doing this is possible using shakespearean templates, but this is a cumbersome way to produce markup.

To be clear, this is the drawback of shakespearean templates: if you use a Widget-producing function that accepts at least one other Widget as an argument, you can't keep all the markup in one place. In the original example, we didn't really feel any pain from this though. Let's turn to a more complicated example:

navbar NavbarDefault NavbarStaticTop HomeR "My Site"
  [ NavbarLink CustomersR $ "Customers " >> glyphicon "scale"
  , NavbarLink LearningR $ "Learn" >> glyphicon "education"
  , NavbarLink ContactR "Contact Us"
  ] []
container $ do
  h1_ [] "My Site Landing"
  p_ [] $ do
    "Welcome to this site. The features of this site are in "
    "the panel accordion below:"
  panelAccordion
    [ basicPanel "Lower Prices" $ do
        p_ [] $ "We have the " >> strong_ [] "lowest" >> "prices."
    , basicPanel "Best Features" $ do
        p_ [] $ "Our features are " >> em_ [] "outstanding" >> "."
    , basicPanel "Great Staff" $ do
        p_ [] $ do
          "Come in person or "
          anchor ContactR "contact us online"
          "."
    ]

I'm not going to attempt to recreate this using hamlet, but I'll point out what would be different:

The navbar function takes a list of NavbarItems as an argument. In hamlet, we couldn't split this up onto multiple lines.
The container function would have the same problem we discussed earlier. We could just write <div.container> instead.
The panelAccordion function takes a list of Panel site. These contain Widgets. We would encounter the same problem that we did when translating the navbar. We would need to declare it outside of the template itself. Both of these options hurt the readability of the template.

To conclude, I'm not against shakespearean templates in all situations. When I was first learning haskell, it was an immensely helpful way to deal with markup. As I started writing combinators to factor out common bootstrap idioms, I backed away from hamlet because I kept finding myself in situations where I had to split up template that were really more readable when everything was together. Now I prefer writing plain haskell to hamlet.

miércoles, 25 de noviembre de 2015

Haskell and Type-Safe Routing

This a reflection on my experiences with different approaches to type-safe routing in haskell. In each section, I will explain a technique and evaluate its merits and its detriments. Some of these (yesod's routing, boomerang, servant) are real libraries and others are mere thought experiments. They are presented roughly in order of increasing complexity. Let us begin.

Conventions

In the explanation of each technique, we will demonstrate how the routing and dispatch for the following routes would be built:

AllR: /products/:Bool
DetailsR: /product/:ProductId/view
BuyR: /buy/:ProductId/as/:UserId

The routes have the following functions:

AllR: Show all of the products, and if the boolean is true, only show recently added products.
DetailsR: View the details for a specific product
BuyR: Purchase a specific product as a specific user

Notice that route names end in an uppercase R. This is not required by any of the techniques I am going to discuss, but it is a common practice in yesod. I have found it to be helpful when working with other routing libraries, so it's a convention I'm going to use in this post.

Notice that the routes do not use Int for the Ids but instead use the following newtype wrappers:

newtype UserId = UserId Int
newtype ProductId = ProductId Int

This means that whenever we discuss a typeclass being used for path pieces, we will used GeneralizedNewtypeDeriving to provide instances for these.

Algebraic Data Types as Routes

The most straightforward approach to type-safe routing is to build an ADT:

data MyRoute
  = AllR Bool
  | DetailsR ProductId
  | BuyR ProductId UserId

Even though this isn't strictly necessary at this point, we're also going to create a typeclass for path pieces. This will help cut down on the noise introduced by the newtype wrappers:

class Piece a where
  toPiece :: a -> String
  fromPiece :: String -> Maybe a

instance Piece Bool where
  toPiece True  = "yes"
  toPiece False = "no"
  fromPiece "yes" = Just True
  fromPiece "no"  = Just False
  fromPiece _     = Nothing

instance Piece Int where
  toPiece = show
  fromPiece = readMaybe

deriving instance Piece UserId
deriving instance Piece ProductId

Next, we need to be able to render a route. So, we would write:

renderMyRoutePieces :: MyRoute -> [String]
renderMyRoutePieces r = case r of
  AllR b -> ["products",toPiece b]
  DetailsR pid -> ["product",toPiece pid,"view"]
  BuyR pid uid -> ["buy",toPiece pid,"as",toPiece uid]

Cool. And if you wanted to put the pieces together, you could build another function on top of it like this:

renderMyRoute :: MyRoute -> String
renderMyRoute = ('/' :) . intercalate "/" . renderMyRoutePieces

And now, in GHCi, we could type:

> renderMyRoute (DetailsR (ProductId 44))
/product/44/view
> renderMyRoute (AllR False)
/products/no

So far, there is nothing particularly painful about this approach. Let's see what happens when we try to handle parsing a route:

parsePieces :: [String] -> Maybe MyRoute
parsePieces ps = case ps of
  ["products",str1]       -> AllR <$> fromPiece str1
  ["product",str1,"view"] -> DetailsR <$> fromPiece str1
  ["buy",str1,"as",str2]  -> BuyR <$> fromPiece str1 <*> fromPiece str2
  _                       -> Nothing

And just like before, we could make a real parser that works on String instead of String. Instead of intersperse, we would use splitOn (from the split package). However, I'm going to skip doing the trivial plumbing:

parse :: String -> Maybe MyRoute
parse = ...

Dispatching is dead easy after we have parse. To keep it simple, I'm just going to have dispatch produce an IO ():

dispatch :: MyRoute -> IO ()
dispatch (BuyR pid uid) = putStrLn $
  "Buying product " ++ toPiece pid ++ " as user " ++ toPiece uid
dispatch (AllR b) = ...
dispatch (DetailsR pid) = ...

And then we can fuse parsing and dispatching together with:

parseAndDispatch :: String -> IO ()
parseAndDispatch s = case parse s of
  Just route -> dispatch route
  Nothing -> putStrLn "Bad route"

Cool. Let's try it out:

> parseAndDispatch "/foo/bar"
Bad route
> parseAndDispatch "/buy/44/as/12"
Buying product 44 as user 12

And that's it. We have now created the most naive solution to type safe routing. This is going to be a sort of baseline for the rest of the discussion. Let's evaluate its merits:

Type safety when creating routes
Since the Piece typeclass was merely for convenience, it is possible to parse/render the same type of piece differently in different routes. This flexibility is seldom needed, but it is available with the approach.

And now the detriments:

Adding a new route requires modifying code in at least four places (add data constructor, add MyRoute pattern matches to renderMyRoutePieces and dispatch, add list pattern match to parsePieces). If the dispatch function delegates work to a specially-named "handler" function for each route (which is almost always needed), it's five places.
Logic for parsing/rendering is duplicated.

The last detriment is pretty bad. In fact, it's the main problem that most of haskell's routing libraries try to solve. Let's look into the problem a little more. In renderMyRoutePieces, we have

BuyR pid uid -> ["buy",toPiece pid,"as",toPiece uid]

And in parsePieces, we have:

["buy",str1,"as",str2] -> BuyR <$> fromPiece str1 <*> fromPiece str2

These two pattern matches are eerily similar to one another. In fact, if I showed you only one of them, you should be able to tell me exactly what the other is supposed to look like. That's because they should be inverses of one another. If they don't have this property, then the routes that we render won't be correct. In general, the rendering function and the parsing function should satisfy this property:

parsePieces (renderMyRoutePieces r) 
  =
Just r

But the way that this routing technique works, it's completely possible to get it wrong. What we want is to write a single specification that can be used to generate both the renderer and the parser. Let's bring out the hammer.

Yesod Routes

Yesod's approach to routing is to use template haskell. This is probably the least elegant but most immidiately understandable solution to the problem. Why? Because it does almost the same thing that we just did. The differences are as follows:

The parse and dispatch step are unified. The route is parsed and then dispatch is called all in one step, without first recreating the MyRoute data type. This means that with yesod, there is neither an equivalent the parse function nor the dispatch function but only to the parseAndDispatch function.
A template haskell splices generates (1) the MyRoute data type declaration, (2) the route rendering function, and (3) the parse/dispatch function.
There is a typeclass named PathPiece which is equivalent to our Piece typeclass above. Any value that you capture in a route have a PathPiece instance.

That's basically it. Let's look at an example (we're going to use the same routes as before):

{-# LANGUAGE OverloadedStrings, QuasiQuotes #-}
{-# LANGUAGE TemplateHaskell, TypeFamilies  #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
import           Yesod

data MyApp = MyApp

newtype UserId = UserId Int
  deriving (Show,Read,Eq,PathPiece)
newtype ProductId = ProductId Int
  deriving (Show,Read,Eq,PathPiece)

mkYesod "MyApp" [parseRoutes|
/products/#Bool            AllR     GET
/product/#ProductId/view   DetailsR GET
/buy/#ProductId/as/#UserId BuyR     GET
|]

instance Yesod MyApp

getAllR b       = defaultLayout ...
getDetailsR pid = defaultLayout ...
getBuyR pid uid = defaultLayout ...

main = warp 3000 MyApp

There are some other features (there's a PathPieceMulti, there are subsites, etc.), but all that's really happening is something similar to what we wrote earlier. Let's take a look at some of the strengths of this approach:

It's now impossible to mess up the inversion issue with routing/parsing. One specification is used to build both. This is the biggest advantage.
Adding new routes is easier. Notably, we've gone down from needed to change five places to needing to change two places.

Now, the detriments:

We don't have a route parsing function that we can use anymore. Not too bad, since you likely only use it for dispatching anyway.
Magic. Magic is seldom good in programming. Things should follow a consistent set of rules, and you shouldn't need to learn new rules when you use new libraries. You do have to learn new rules to use yesod routing. Notice how in the example above, we had to prefix all of the handler names with the word get (getDetailsR, etc.). This is because the dispatch function (generated by template haskell) expects functions with those names to be in scope. If we instead had defined AllR with /products/#Bool AllR POST, then we would have had to define postAllR instead of getAllR. But, if we had defined it as /products/#Bool AllR (with no method at the end), then we would need to define handleAllR. Fortunately, yesod has pretty good documentation in the form of a cookbook, so it's not like these rules are unlearnable, but they are something you have to remember.
When you're using a complicated quaiquoters, the normal syntactical rules for haskell go away. For example, if AllR was supposed to take a Maybe Int instead of a Bool, what would you do? You might try /products/#Maybe Int AllR GET, but that doesn't work. You have to use hyphenation to apply a type: /products/#Maybe-Int AllR GET.

Now, I've been really hard on yesod in the detriments section, so I figured I would clarify something. I use yesod (with the template haskell routing) all the time at work in about half a dozen projects. I'm a contributor to yesod. Even though the routing uses template haskell and has tricky rules, if you learn what's going on, it's fine for most things. I know that it "doesn't compose", but for the most part, that hasn't been a problem for me.

Another thing to keep in mind is that yesod was around when GHC 7.2 was released. So a lot of the fancier type level tricks weren't possible back then.

However, returning to my original "tough on crime" stance, using template haskell in the way that yesod does makes things hard to understand and hard to change. Auto-generated documentation like haddock doesn't help because it can't know what the splice is going to bring into scope. There are other ways to use template haskell that are more restricted in scope and easier to understand. Notably, lens and aeson make use of template haskell in a way that reduces broilerplate and remains easy to understand.

I'm going to break this up into multiple parts because this is taking longer than I thought it would. In future installments, expect to see:

Routing with Applicative Functors (which don't work, but I figured the mental exercise would be nice).
Boomerang (maybe)
Servant
Serpentine (my own attempt at dependently typed routing)