Tobin Yehle | Dominion AI

One of the projects in Matt Flatt’s functional programming studio class was to make an AI for the deck building game Dominion. We were each tasked with building an executable that would read what other players were doing on standard in, and communicate moves on standard out. The server that controlled all of the state of the game, and specified the protocol by which the processes would communicate was fixed. I chose to implement my client in Haskell.

There are three distinct pieces of the program following an architecture similar to the standard model-view-controller architecture. In this case there is a model responsible for representing the world state in the program, a parser similar to the standard view, but in this case it is responsible for I/O from the server instead of a user, and a two part controller. The controller is split into a driver side that contains all the code for dealing with I/O, and an agent side that is responsible for deciding how to play.

All source code is publicly available on github

Model

In order for the AI to make decisions it needs to know what the state of the game is. Every turn the server would send all of the known information about the game state to the client, so a client could be a simple reflex agent. This removes the difficulties of passing information along to the AI in the future in a purely functional way. I chose to represent the information received from the server as follows.

The most basic item in the game is probably the cards. I chose to represent them as a single union type. To simplify things we only dealt with the 10 suggested starting cards.

data Card = Cellar | Moat | ... | Gold | Province deriving (Eq, Show)

This representation makes no distinction between the victory, treasure, and kingdom cards, but such a system can get very complex for not much benefit.

I set up the Action type to represent any valid action a player can perform on their turn. They can Add treasures to their pool, Clean their hand, ending their turn, possibly showing one of any remaining cards in their hand, Buy a card from the supply, or Act by playing a kingdom card.

data Action = Add Card
            | Clean (Maybe Card)
            | Buy Card
            | Act Card [Card] deriving (Eq)

With the basic set of cards, a player can only be attacked by a Militia, and the only options they have are to discard, or to reveal a Moat.

data Defense = Reveal Card | Discard [Card] deriving (Eq)

Finally there is a GameState record that holds all the information about the current state of the game, and a Notification type. A Notification represents anything the server could send to the client. The server can Request an action from the client, Update the client about the actions of other players, Ask the client to defend itself if Attacked, or update the client that another player Defended them self.

data Notification = Request GameState
                  | Update String Action
                  | Attacked Action String GameState
                  | Defended String Defense deriving (Eq, Show)

This internal representation is the target of the parser that reads from standard in.

Parser and Output Format

The client uses a parser to read data from standard in, and Show instances defined on Action and Defense to build responses to put on standard out.

The show instances simply put the data into an s-expression to be written to standard out.

-- Wraps a list of strings in parens, and lower cases all of them
wrap :: [String] -> String
wrap ss = "(" ++ intercalate " " (map (map toLower) ss) ++ ")"

instance Show Action where
    show (Add c) = wrap ["add", show c]
    show (Clean (Just c)) = wrap ["clean", show c]
    show (Clean Nothing) = "(clean)"
    show (Buy c) = wrap ["buy", show c]
    show (Act c cs) = wrap $ "act" : show c : map show cs

instance Show Defense where
    show (Reveal c) = wrap [show c]
    show (Discard cs) = wrap $ "discard":(map show cs)

The parser module is responsible for taking valid input from standard in, and transforming it into a Notification. The interface it presents to the rest of the program is the function parseNotification which takes an input string, parses a notification and returns the pair of the parsed notification and any remaining input.

The details of the parser are long and boring, and all I will say here is that I quite like parser combinators. They make building a parser for a protocol like this much easier. My favorite explanation of what they are is here.

parseNotification :: String -> (Notification, String)
parseNotification input
    | Left e       <- parsed = error . show $ e
    | Right result <- parsed = result
    where
        parsed = parse (withRemaining notification) "stdin" input

withRemaining :: Parser a -> Parser (a, String)
withRemaining p = do
    result <- p
    rest <- getInput
    return (result, rest)

The first iterations of the server did not put any spaces between the notifications. It just wrote a single s-expression to the client, so in order to parse correctly it was necessary to read the input character by character, instead of using more common line reading functions. This created an interesting problem because if the client attempted to read a line the whole system would hang. The server would write the request to the client as a single s-expression "(move (...))", and then wait for the client to respond. Since there was no new line at the end of the s-expression the client, who had attempted to read a line, would be waiting for more input.

There were two ways to solve this problem. In an eager language the natural solution is to read standard in character by character, instead of waiting for a line. The reader would keep a counter of parentheses to determine where the end of the s-expression was. This solution is certainly possible in Haskell, but the lazy evaluation of strings presents a much more elegant solution.

In Haskell it is possible to get a lazy pointer to an entire buffer. This operation consumes the buffer, and in the case of standard in results in a string. This string, representing all of standard in over the lifetime of the program, is the string passed to parseNotification. The parser will remove all the characters containing the notification from the beginning of the string, and then return the parsed structure and any remaining input. In this case the remaining input is a lazy pointer to any remaining characters from standard in. Attempting to read any more characters from the string at this point would cause the program to hang because there aren’t more characters yet. The server hasn’t written them because it is waiting for a response from the client. Instead of reading the remainder of the string the parser simply returns it to be used later.

Driver

The driver code is responsible for all I/O and connects the parser to the agent. For now it is enough to know that there is a typeclass Agent, and it defines the function respond :: (Agent a) => a -> Notification -> String. I wanted to test my code out on different agents, so most of the code that would normally go in main got its own function drive. drive takes an agent, and runs the program, leaving the only code in main the choice of agent.

main :: IO ()
main = drive Miner49er

drive :: (Agent a) => a -> IO ()
drive agent = do
    hSetBuffering stdin NoBuffering
    hSetBuffering stdout NoBuffering
    hGetContents stdin >>= runClient agent

drive ensures there is no buffering on the I/O channels, and then gets a lazy pointer to all of standard in with hGetContents and hands that off to the runClient function which deals with looping responses from the client.

runClient :: (Agent a) => a -> String -> IO ()
runClient agent input
    | null trimmed = return ()
    | otherwise    = (putStr . respond agent) notification
                     >> runClient agent moreInput
    where
        trimmed = dropWhile isSpace input
        (notification, moreInput) = parseNotification trimmed

runClient will return if the server has closed the input stream, otherwise it prints the agents response, and loops on the remaining input from the server. Trimming the input is necessary because eventually the server was changed to emit new lines between each request, making it much easier to interface with, but by that time I had already built my parser, so I just remove all whitespace between the server’s messages.

Agent

The agent is responsible for deciding how to play dominion. The strategy I used is called big money smithy, and details about it can be found here, and how to beat it here. The basic strategy is to buy some small number of smithies, and treasure cards until you can buy provinces to end the game. It is surprisingly effective.

There are three distinct phases of a turn in dominion, during the first a player uses their actions to play kingdom cards, then they play their treasure cards, and then they can buy new cards from the supply. When their turn is over they discard any remaining cards in their hand, revealing one of them to the other players because it goes face up on their discard pile.

I wanted to make it easy to quickly define a new strategy, so I tried to generalize all of the decisions an agent would have to make. I came up with the following typeclass that I designed to be general enough for any strategy.

class Agent a where
    tryAction :: a -> GameState -> Either Action GameState
    tryAdd :: a -> GameState -> Either Action GameState
    tryBuy :: a -> GameState -> Either Action GameState
    tryDefend :: a -> GameState -> Either Defense GameState
    discardTo :: a -> GameState -> Int -> Defense

This requires an agent to define how to discard if necessary, and how to play actions, treasures, buys, and defenses using Either as a monad with “failure” being a valid response. This use of Either allows the implementation of playing actions to be very simple.

act :: (Agent a) => a -> GameState -> Action
act a state = case tryAction a state >>= tryAdd a >>= tryBuy a of
    Left action  -> action
    Right _      -> Clean $ find (const True) (hand state)

If none of tryAction, tryAdd, or tryBuy “fail” with a valid response the turn is over and the discard message is returned.

Playing Kingdom Cards

The simplest of the kingdom cards do not affect any other cards. There are more complex cards, like the mine or the cellar that require the player to do something with their hand. This function cannot deal with cards like that. For my strategy this is good enough because the only action card I will play is the smithy.

trySimplePlay :: Card -> GameState -> Either Action GameState
trySimplePlay card state
    | canAct state card = Left $ Act card []
    | otherwise = Right state

inHand :: GameState -> Card -> Bool
inHand state card = card `elem` hand state

canAct :: GameState -> Card -> Bool
canAct state card = actions state > 0 && inHand state card

Playing Treasures

I chose to always play all the treasure cards in my hand, which made this phase very easy

playAllTreasures :: GameState -> Either Action GameState
playAllTreasures state = case find isTreasure (hand state) of
    Just t -> Left $ Add t
    _ -> Right state

Buying

This is arguably where most of the strategy of the game is, and is also where my code is the ugliest. I decided to use a system of card priority with a function to determine if a card should be bought based on the game state.

buyPriority :: (GameState -> Card -> Bool) -> [Card] -> GameState -> Either Action GameState
buyPriority should priority state
    | buys state > 0 = case find (\c -> canBuy c && should state c) priority of
        Just c -> Left (Buy c)
        Nothing -> Right state
    | otherwise = Right state
    where
        canBuy c = cost c <= coins state && c `elem` supply state

If the agent has no buys left then it just passes on to the next phase, otherwise it searches through the prioritized list of cards and buys the first one that it can buy and should buy. The should buy function is provided by the agent instance, and allows it to control when to buy certain cards. I have found the ability to pass a function to a function like this to define part of its behavior an extremely useful pattern that allows for much more flexible implementations.

Discarding

The last interesting question to answer is how to discard cards if the player is attacked, and they cannot defend with a moat. I settled on a similar strategy as buying, allowing the agent instance to specify a priority of cards to discard.

discardPriority :: [Card] -> GameState -> Int -> Defense
discardPriority priority state n = Discard (findDiscards (length (hand state) - n)
                                                         (hand state)
                                                         priority)

findDiscards :: Int -> [Card] -> [Card] -> [Card]
findDiscards n hand toTry
    | n <= 0     = []
    | null toTry = take n hand
    | otherwise  = (take n toRemove) ++ findDiscards (n - length toRemove) toKeep (tail toTry)
    where
        (toRemove, toKeep) = partition (== (head toTry)) hand

This is a function that recursively removes cards in the order of the list toTry if it runs out of cards to try it takes from the beginning of the hand.

Defining Agent Instances

With all of these default implementations in place my smithy bot now only took a few lines to define. My goal was to build an agent that would:

always buy a province if it can
only buy smithies if less than 5% of its cards are smithies
avoid buying gold and silver in the late game
only buy duchies and estates at the end of the game
discard its least valuable cards first

data SmithyMoney = SmithyMoney

instance Agent SmithyMoney where
    tryAction _ = trySimplePlay Smithy

    tryAdd _ = playAllTreasures

    tryBuy _ = buyPriority shouldBuy priority
        where
            priority = [Province, Gold, Smithy, Duchy, Silver, Estate]
            shouldBuy _ Province = True
            shouldBuy state Gold = numInSupply state Province > 4
            shouldBuy state Smithy = probDraw (allMyCards state) Smithy < 0.05
            shouldBuy state Duchy = numInSupply state Province < 6
            shouldBuy state Silver = numInSupply state Province > 4
            shouldBuy state Estate = numInSupply state Province < 4

    tryDefend _ = revealMoat

    discardTo _ state = discardPriority ([Province, Duchy, Estate] ++ valueSorted) state
        where
            valueSorted = sortOn value [Copper, Silver, Gold, Smithy]
            value Smithy = expectedDrawValue state 3
            value t = treasureWorth t

And thats it.

I also made a different agent that uses some more of the kingdom cards just to see how it would do. I won’t show the whole definition here, but I think it is interesting to see how actions can be chained together even in the agent instance using bind.

    tryAction _ state = trySimplePlay Village state >>= trySimplePlay Militia >>= trySimplePlay Smithy >>= tryMine

To fine tune my smithy bot I ran it in batches of 50 games against this agent, and got it to the point where it would win about 95% of the time, and lose about 1% of the time.

Testing

I used HUnit and Tasty as a testing framework, which all fits in very nicely with stack. I wrote many tests, and they were invaluable in helping me write correct code, but they are also not very interesting. I found it quite annoying that I could not access a function for testing if it was not exposed by a module, and this lead to me exposing nearly every piece of code I wrote. I feel like there must be a better way.

Edit: The Internal/External module pattern seems to be the way to do this. This answer has a great explanation of how that should work.

Further Thoughts

After defining my agent instances it seems that most of the strategy of the game is not in how to play cards from a hand, but knowing what deck composition to aim for. A possible improvement to the project would be to write an agent interface that requires only an early game, mid game, and late game deck composition and attempts to satisfy it. This would make the internals of the agent much more complected, but would make tuning and testing a strategy much easier.