Toronto Library New Holdable DVDs With Haskell

May 19, 2013 / Mad Coding, Haskell

The Toronto library doesn’t let you place a hold on a newer DVD until several months have passed. A new batch of DVDs becomes holdable on the 15th of every month, and they show up on this list. This weekend I was reminded of an idea I had: write a program that automatically grabs the list and checks Rotten Tomatoes ratings to decide which movies to put on hold. I started writing the program in Haskell, and my progress so far is on my GitHub. What’s still missing is actually logging in and placing a hold.

Use HTTP.Conduit to fetch web pages

I used http-conduit to grab the HTML source from the Toronto library website. It’s pretty straightforward: install it with cabal install http-conduit, then use the simpleHttp function.

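import qualified Data.ByteString.Lazy as L
import Network.HTTP.Conduit (simpleHttp)

main :: IO ()
main = do
    -- newMoviesURL is a String holding the address of the library's list (defined elsewhere)
    content <- simpleHttp newMoviesURL
    -- content is a lazy ByteString containing the page source
    L.putStr content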
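One thing to keep in mind is that simpleHttp throws an exception when the URL is malformed or the request fails, which matters for a program meant to run unattended. Here is a minimal sketch of one way to handle that, assuming the HttpException type exported by Network.HTTP.Conduit. The name fetchPage is made up for illustration and is not part of the library.

```haskell
import Control.Exception (try)
import qualified Data.ByteString.Lazy as L
import Network.HTTP.Conduit (HttpException, simpleHttp)

-- Hypothetical helper: fetch a page, returning Left instead of throwing
-- when the URL is malformed or the request fails.
fetchPage :: String -> IO (Either HttpException L.ByteString)
fetchPage url = try (simpleHttp url)
```

With this, the caller can pattern match on the result and decide whether to retry, log the error, or give up.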

Use regex-tdfa for regular expressions

Whenever I need to extract data from HTML source with a regular expression, I use regex-tdfa’s =~ function.


import qualified Data.ByteString.Lazy as L
import Text.Regex.TDFA ((=~))

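-- Extract the date from the "Updated ..." heading; empty if the heading is absent.
updated :: L.ByteString -> L.ByteString
updated s = case matches of
    -- Each match is the whole match followed by its capture groups.
    ((_ : date : _) : _) -> date
    _                    -> L.empty
  where
    matches = s =~ "<h3[^>]*>Updated (.*)</h3>" :: [[L.ByteString]]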
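The result of =~ depends on the type you ask for, which is easy to miss at first. The sketch below shows two of the target types on plain Strings. The sample HTML and the function names are made up for illustration, and I return a Maybe here instead of an empty string.

```haskell
import Text.Regex.TDFA ((=~))

-- Hypothetical sample input, shaped like the heading the pattern targets.
sampleHtml :: String
sampleHtml = "<h3 class=\"title\">Updated May 15, 2013</h3>"

-- With a Bool target, =~ only reports whether the pattern matches.
hasUpdated :: String -> Bool
hasUpdated s = s =~ "<h3[^>]*>Updated (.*)</h3>"

-- With a [[String]] target, each match is a list: the whole match
-- followed by its capture groups.
updatedDate :: String -> Maybe String
updatedDate s = case s =~ "<h3[^>]*>Updated (.*)</h3>" :: [[String]] of
    ((_ : date : _) : _) -> Just date
    _                    -> Nothing
```

Returning Maybe makes the "heading not found" case explicit, so the caller cannot confuse it with a heading that has an empty date.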

Use Data.Aeson for parsing JSON

To parse the Rotten Tomatoes JSON API data, I used the aeson package. Install it with cabal install aeson. Below is how I mapped the JSON result to the fields I need.

First the declarations:

{-# LANGUAGE DeriveGeneric #-}

import Data.Aeson (FromJSON, ToJSON, decode, encode)
import GHC.Generics (Generic)

data RTCast = RTCast {
      name :: L.ByteString
    } deriving (Show, Generic)

data RTMovie = RTMovie {
      year :: Int
    , ratings :: RTRatings
    , abridged_cast :: [RTCast]
    } deriving (Show, Generic)

data RTInfo = RTInfo {
      movies :: [RTMovie]
    } deriving (Show, Generic)

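-- The Generic-derived instances map each record field to the JSON key of the same name.
instance FromJSON RTInfo
instance FromJSON RTMovie
instance FromJSON RTCast
-- RTRatings is a record deriving Generic like the ones above (its declaration is not shown here).
instance FromJSON RTRatings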

Then, to actually decode:

let rt = decode content :: Maybe RTInfo
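To see the whole thing work end to end, here is a self-contained sketch that decodes a small, made-up response body. Several parts are assumptions: the RTRatings fields (critics_score, audience_score) are illustrative, the sample JSON is not real API output, and criticsScores is a hypothetical helper. I also use Text for the cast name because, as far as I know, recent aeson versions don't ship a FromJSON instance for ByteString.

```haskell
{-# LANGUAGE DeriveGeneric #-}

import Data.Aeson (FromJSON, decode)
import qualified Data.ByteString.Lazy.Char8 as LC
import Data.Text (Text)
import GHC.Generics (Generic)

-- Assumed shape of the ratings object; these field names are illustrative.
data RTRatings = RTRatings
    { critics_score  :: Int
    , audience_score :: Int
    } deriving (Show, Generic)

data RTCast = RTCast { name :: Text } deriving (Show, Generic)

data RTMovie = RTMovie
    { year          :: Int
    , ratings       :: RTRatings
    , abridged_cast :: [RTCast]
    } deriving (Show, Generic)

data RTInfo = RTInfo { movies :: [RTMovie] } deriving (Show, Generic)

instance FromJSON RTRatings
instance FromJSON RTCast
instance FromJSON RTMovie
instance FromJSON RTInfo

-- Made-up response body, not real API output. The "title" key has no
-- matching record field and is ignored by the derived parser.
sampleJson :: LC.ByteString
sampleJson = LC.pack $ concat
    [ "{\"movies\":[{\"year\":2012,\"title\":\"Example\","
    , "\"ratings\":{\"critics_score\":90,\"audience_score\":85},"
    , "\"abridged_cast\":[{\"name\":\"Jane Doe\"}]}]}"
    ]

-- Critics scores of every movie, or an empty list if decoding fails.
criticsScores :: LC.ByteString -> [Int]
criticsScores body = case decode body :: Maybe RTInfo of
    Just info -> map (critics_score . ratings) (movies info)
    Nothing   -> []
```

Because decode returns a Maybe, a response that is missing one of the declared fields comes back as Nothing rather than a partial record, so it is worth handling that case explicitly.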