Haskell and Type-Safe Routing
This a reflection on my experiences with different approaches to type-safe routing in haskell. In each section, I will explain a technique and evaluate its merits and its detriments. Some of these (yesod
's routing, boomerang
, servant
) are real libraries and others are mere thought experiments. They are presented roughly in order of increasing complexity. Let us begin.
Conventions
In the explanation of each technique, we will demonstrate how the routing and dispatch for the following routes would be built:
AllR: /products/:Bool
DetailsR: /product/:ProductId/view
BuyR: /buy/:ProductId/as/:UserId
The routes have the following functions:
AllR
: Show all of the products, and if the boolean is true, only show recently added products.DetailsR
: View the details for a specific productBuyR
: Purchase a specific product as a specific user
Notice that route names end in an uppercase R
. This is not required by any of the techniques I am going to discuss, but it is a common practice in yesod
. I have found it to be helpful when working with other routing libraries, so it's a convention I'm going to use in this post.
Notice that the routes do not use Int
for the Id
s but instead use the following newtype wrappers:
newtype UserId = UserId Int
newtype ProductId = ProductId Int
This means that whenever we discuss a typeclass being used for path pieces, we will used GeneralizedNewtypeDeriving
to provide instances for these.
Algebraic Data Types as Routes
The most straightforward approach to type-safe routing is to build an ADT:
data MyRoute
= AllR Bool
| DetailsR ProductId
| BuyR ProductId UserId
Even though this isn't strictly necessary at this point, we're also going to create a typeclass for path pieces. This will help cut down on the noise introduced by the newtype wrappers:
class Piece a where
toPiece :: a -> String
fromPiece :: String -> Maybe a
instance Piece Bool where
toPiece True = "yes"
toPiece False = "no"
fromPiece "yes" = Just True
fromPiece "no" = Just False
fromPiece _ = Nothing
instance Piece Int where
toPiece = show
fromPiece = readMaybe
deriving instance Piece UserId
deriving instance Piece ProductId
Next, we need to be able to render a route. So, we would write:
renderMyRoutePieces :: MyRoute -> [String]
renderMyRoutePieces r = case r of
AllR b -> ["products",toPiece b]
DetailsR pid -> ["product",toPiece pid,"view"]
BuyR pid uid -> ["buy",toPiece pid,"as",toPiece uid]
Cool. And if you wanted to put the pieces together, you could build another function on top of it like this:
renderMyRoute :: MyRoute -> String
renderMyRoute = ('/' :) . intercalate "/" . renderMyRoutePieces
And now, in GHCi, we could type:
> renderMyRoute (DetailsR (ProductId 44))
/product/44/view
> renderMyRoute (AllR False)
/products/no
So far, there is nothing particularly painful about this approach. Let's see what happens when we try to handle parsing a route:
parsePieces :: [String] -> Maybe MyRoute
parsePieces ps = case ps of
["products",str1] -> AllR <$> fromPiece str1
["product",str1,"view"] -> DetailsR <$> fromPiece str1
["buy",str1,"as",str2] -> BuyR <$> fromPiece str1 <*> fromPiece str2
_ -> Nothing
And just like before, we could make a real parser that works on String
instead of String
. Instead of intersperse
, we would use splitOn
(from the split
package). However, I'm going to skip doing the trivial plumbing:
parse :: String -> Maybe MyRoute
parse = ...
Dispatching is dead easy after we have parse
. To keep it simple, I'm just going to have dispatch produce an IO ()
:
dispatch :: MyRoute -> IO ()
dispatch (BuyR pid uid) = putStrLn $
"Buying product " ++ toPiece pid ++ " as user " ++ toPiece uid
dispatch (AllR b) = ...
dispatch (DetailsR pid) = ...
And then we can fuse parsing and dispatching together with:
parseAndDispatch :: String -> IO ()
parseAndDispatch s = case parse s of
Just route -> dispatch route
Nothing -> putStrLn "Bad route"
Cool. Let's try it out:
> parseAndDispatch "/foo/bar"
Bad route
> parseAndDispatch "/buy/44/as/12"
Buying product 44 as user 12
And that's it. We have now created the most naive solution to type safe routing. This is going to be a sort of baseline for the rest of the discussion. Let's evaluate its merits:
- Type safety when creating routes
- Since the
Piece
typeclass was merely for convenience, it is possible to parse/render the same type of piece differently in different routes. This flexibility is seldom needed, but it is available with the approach.
And now the detriments:
- Adding a new route requires modifying code in at least four places (add data constructor, add
MyRoute
pattern matches torenderMyRoutePieces
anddispatch
, add list pattern match toparsePieces
). If the dispatch function delegates work to a specially-named "handler" function for each route (which is almost always needed), it's five places. - Logic for parsing/rendering is duplicated.
The last detriment is pretty bad. In fact, it's the main problem that most of haskell's routing libraries try to solve. Let's look into the problem a little more. In renderMyRoutePieces
, we have
BuyR pid uid -> ["buy",toPiece pid,"as",toPiece uid]
And in parsePieces
, we have:
["buy",str1,"as",str2] -> BuyR <$> fromPiece str1 <*> fromPiece str2
These two pattern matches are eerily similar to one another. In fact, if I showed you only one of them, you should be able to tell me exactly what the other is supposed to look like. That's because they should be inverses of one another. If they don't have this property, then the routes that we render won't be correct. In general, the rendering function and the parsing function should satisfy this property:
parsePieces (renderMyRoutePieces r)
=
Just r
But the way that this routing technique works, it's completely possible to get it wrong. What we want is to write a single specification that can be used to generate both the renderer and the parser. Let's bring out the hammer.
Yesod Routes
Yesod's approach to routing is to use template haskell. This is probably the least elegant but most immidiately understandable solution to the problem. Why? Because it does almost the same thing that we just did. The differences are as follows:
- The parse and dispatch step are unified. The route is parsed and then dispatch is called all in one step, without first recreating the
MyRoute
data type. This means that with yesod, there is neither an equivalent theparse
function nor thedispatch
function but only to theparseAndDispatch
function. - A template haskell splices generates (1) the
MyRoute
data type declaration, (2) the route rendering function, and (3) the parse/dispatch function. - There is a typeclass named PathPiece which is equivalent to our
Piece
typeclass above. Any value that you capture in a route have aPathPiece
instance.
That's basically it. Let's look at an example (we're going to use the same routes as before):
{-# LANGUAGE OverloadedStrings, QuasiQuotes #-}
{-# LANGUAGE TemplateHaskell, TypeFamilies #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
import Yesod
data MyApp = MyApp
newtype UserId = UserId Int
deriving (Show,Read,Eq,PathPiece)
newtype ProductId = ProductId Int
deriving (Show,Read,Eq,PathPiece)
mkYesod "MyApp" [parseRoutes|
/products/#Bool AllR GET
/product/#ProductId/view DetailsR GET
/buy/#ProductId/as/#UserId BuyR GET
|]
instance Yesod MyApp
getAllR b = defaultLayout ...
getDetailsR pid = defaultLayout ...
getBuyR pid uid = defaultLayout ...
main = warp 3000 MyApp
There are some other features (there's a PathPieceMulti
, there are subsites, etc.), but all that's really happening is something similar to what we wrote earlier. Let's take a look at some of the strengths of this approach:
- It's now impossible to mess up the inversion issue with routing/parsing. One specification is used to build both. This is the biggest advantage.
- Adding new routes is easier. Notably, we've gone down from needed to change five places to needing to change two places.
Now, the detriments:
- We don't have a route parsing function that we can use anymore. Not too bad, since you likely only use it for dispatching anyway.
- Magic. Magic is seldom good in programming. Things should follow a consistent set of rules, and you shouldn't need to learn new rules when you use new libraries. You do have to learn new rules to use yesod routing. Notice how in the example above, we had to prefix all of the handler names with the word
get
(getDetailsR
, etc.). This is because the dispatch function (generated by template haskell) expects functions with those names to be in scope. If we instead had definedAllR
with/products/#Bool AllR POST
, then we would have had to definepostAllR
instead ofgetAllR
. But, if we had defined it as/products/#Bool AllR
(with no method at the end), then we would need to definehandleAllR
. Fortunately, yesod has pretty good documentation in the form of a cookbook, so it's not like these rules are unlearnable, but they are something you have to remember. - When you're using a complicated quaiquoters, the normal syntactical rules for haskell go away. For example, if
AllR
was supposed to take aMaybe Int
instead of a Bool, what would you do? You might try/products/#Maybe Int AllR GET
, but that doesn't work. You have to use hyphenation to apply a type:/products/#Maybe-Int AllR GET
.
Now, I've been really hard on yesod in the detriments section, so I figured I would clarify something. I use yesod (with the template haskell routing) all the time at work in about half a dozen projects. I'm a contributor to yesod. Even though the routing uses template haskell and has tricky rules, if you learn what's going on, it's fine for most things. I know that it "doesn't compose", but for the most part, that hasn't been a problem for me.
Another thing to keep in mind is that yesod was around when GHC 7.2 was released. So a lot of the fancier type level tricks weren't possible back then.
However, returning to my original "tough on crime" stance, using template haskell in the way that yesod does makes things hard to understand and hard to change. Auto-generated documentation like haddock doesn't help because it can't know what the splice is going to bring into scope. There are other ways to use template haskell that are more restricted in scope and easier to understand. Notably, lens
and aeson
make use of template haskell in a way that reduces broilerplate and remains easy to understand.
I'm going to break this up into multiple parts because this is taking longer than I thought it would. In future installments, expect to see:
- Routing with Applicative Functors (which don't work, but I figured the mental exercise would be nice).
- Boomerang (maybe)
- Servant
- Serpentine (my own attempt at dependently typed routing)
No hay comentarios:
Publicar un comentario