Generic Programming
Generic programming is a way of automatically deriving useful typeclass instances for custom datatypes. This is often a pleasant alternative to using macros, which are not very Haskelly.
This means Haskell can automatically work out how to do things like:
- serialize data of your custom type
- parse it (from JSON, from command line args)
- memoize functions that take it as input
- sort and group it in O(n) time (possible for many types using variations of radix sort)
- write lenses for it
- generate examples of it for property tests
When writing a large codebase with many custom types, generic programming take remove a huge amount of "boilerplate" code. It also makes refactoring vastly easier, because substantial changes to a type's definition do not require any rewriting of manual instance definitions.
Warning
Errors messages that arise from a mistake involving generic programming constructs can often be very hard to interpret without knowledge of implementation details. This is the main downside to their use.
The following gives an example of all these abilities. Here are the imports and extensions that are needed:
Imports¶
-- some extensions that are needed
{-# LANGUAGE BlockArguments, DerivingVia, DerivingStrategies, TypeFamilies, LambdaCase, DeriveAnyClass, DataKinds, OverloadedStrings #-}
import Data.Set (Set)
import GHC.Generics (Generic)
import Data.Generics.Sum.Any
import Data.Generics.Product.Any
import Data.Discrimination
import Data.MemoTrie
import Test.QuickCheck.Arbitrary.Generic
import Test.QuickCheck.Arbitrary
import Test.QuickCheck
import Generic.Data
import Data.Serialize (Serialize, encode, decode)
import Control.Lens ((^.), (^?), (^..))
import Data.ByteString (ByteString)
import Data.Constraint
import Data.Data (Proxy(..))
import Data.Functor.Compose (Compose)
import Options.Generic (ParseRecord, getRecord)
import qualified Data.List as List
An example using a custom type¶
Suppose we have some custom types like Position
and UserInput
:
We might declare a type like this to represent user input. The idea of generic programming is to avoid having to write "boilerplate" code to
Standard deriving¶
A familiar feature of Haskell in general is that we can automatically derive instances for certain special classes like Eq
, Ord
and Show
(see more here).
Note
Here, the deriving
expressions are standalone statements. One can also place them as part of a definition, as in:
Generic serialization¶
Generic programming extends deriving to many more typeclasses. For example, we can derive functions to (un)serialize our type:
deriving instance Generic UserInput -- (1)!
deriving instance Serialize UserInput
serializedData :: ByteString
serializedData = encode (Keys ['a', 'b'])
unserializedData :: Either String UserInput
unserializedData = decode serializedData
- This is the instance which allows other instances, like
Serialize
to be defined in an automatic way.
This is like pickle
in Python, but for arbitrary types, not just special pre-specified ones. We can also serialize to readable JSON, using the aeson
library.
Generic parsing¶
Next, we can automatically create a command line parser for our type:
instance ParseRecord UserInput
commandLineParser :: IO ()
commandLineParser = do
inp <- getRecord "Test program"
print (inp :: UserInput)
When commandLineParser
is called in the main
of the executable, you can now run with command line arguments:
> cabal run executableName -- mouse --down
Mouse {down = True, moving = False}
> cabal run executableName -- keys "ab"
Keys "ab"
Generic memoization¶
Next, we can memoize (using entirely pure code!) costly functions that take this type as input. For example, suppose we have a function:
costlyFunction :: UserInput -> Int
costlyFunction = \case
Keys ['a','b'] -> head $ reverse [1..100000000]
_ -> 0
instance HasTrie UserInput where
newtype (UserInput :->: b) = UserInputTrie { unUserInputTrie :: Reg UserInput :->: b }
trie = trieGeneric UserInputTrie
untrie = untrieGeneric unUserInputTrie
enumerate = enumerateGeneric unUserInputTrie
example = do
let memoizedCostlyFunction = memo costlyFunction
print (memoizedCostlyFunction (Keys ['a','b']))
print (memoizedCostlyFunction (Keys []))
-- no need to recompute
print (memoizedCostlyFunction (Keys ['a','b']))
Here, we did write an explicit instance for HasTrie
(the class required in order to use the memoization function memo
), but the definitions in the instance, like untrieGeneric
, are all given via the Generic
instance.
Generic sorting and grouping¶
Next, we can efficiently sort or group a list of UserInput
s. For many types, the normal bound of \(O(n \log n)\) can be reduced to \(O(n)\) by clever variations of radix sort, and remarkably, Haskell can automatically derive these for you.
deriving instance Grouping UserInput
deriving instance Sorting UserInput
unique :: [UserInput]
unique = nub [Mouse True True, Keys ['a','b'], Mouse True True]
Generic lenses¶
The Generic
instance also makes it possible to automatically derive lenses:
value1 = Keys ['a', 'b']
value2 = Mouse True False
keys :: [[Char]]
keys = value1 ^.. _As @"Keys" -- (1)!
-- > [['a','b']] (3)
isMoving :: [Bool]
isMoving = value1 ^.. _As @"Mouse" . the @2
-- > []
isMoving2 :: [Bool]
isMoving2 = value2 ^.. _As @"Mouse" . the @2 -- (2)!
-- > False
- You need to use
^..
instead of^.
because there might be no subpart of aUserInput
value of the formKeys _
, as in the valueMouse True True
. the @2
selects the second value inMouse True False
.- The repl will print this as ["ab"], because "ab" is a way of writing ['a','b'] in Haskell.
Generic property testing¶
Finally, we can generate examples of UserInput
to use for property tests. For instance, we could generate a random list of UserInput
s, and check that no matter which one we generate, both the \(O(n)\) sort and the standard \(O(n\log n)\) sort agree:
deriving via GenericArbitrary UserInput instance Arbitrary UserInput -- (1)!
main :: IO ()
main = do
quickCheck (\ls -> (sort @[UserInput] ls == List.sort ls))
-- define a property of "idempotency", which means: doing an operation twice is the same as doing it once
let idempotent f x = f x == f (f x)
quickCheck (idempotent (sort @[UserInput]))
quickCheck (idempotent (nub @[UserInput]))
- This
via
syntax uses theDerivingVia
extension. See more here.