This year I participated in the Advent of Code challenge, where once a day between the 1st and 25th of December a pair of new code challenges is released. Usually the second problem requires that the solution to the first be refactored or optimized in order to solve it. I solved most of the problems this year in Haskell, one of my favorite programming languages. While I probably won’t be publishing all of my solutions 1, I wanted to write up and share some of the more interesting pieces of my solutions.
The infamous Monad is well known for its confusing name and countless useless tutorials. I won’t attempt to explain what it is here, but rather to explain why it is useful to have it available as an abstraction. One common objection to Monads is that, while they do seem to capture a shared interface and abstraction, it’s hard to see what benefit is derived from using Monads. In Haskell, it’s easy enough to say that the do-notation syntactic sugar requires the Monad interface and no more, but why are some people so intent on getting Monads working in Rust, OCaml, and even Java and Python?
To see why, let’s look at one of the refactors in Advent of Code’s day 14. Part 1 of this challenge introduces the operation of “masking” an integer. It gives us, as an example,
```
value:  000000000000000000000000000000001011  (decimal 11)
mask:   XXXXXXXXXXXXXXXXXXXXXXXXXXXXX1XXXX0X
result: 000000000000000000000000000001001001  (decimal 73)
```
We can see that an X indicates that the bit should remain unchanged, a 1 indicates that the bit at that position should be overwritten with a 1, and a 0 indicates that the bit at that position should be overwritten with a 0. Using the `Data.Bits` module, this is fairly simple to implement in Haskell.
```haskell
import Data.Bits
import Data.Foldable

maskBit :: Bits a => a -> (Int, Char) -> a
maskBit x (_, 'X') = x
maskBit x (i, '0') = clearBit x i
maskBit x (i, '1') = setBit x i

mask :: Bits a => String -> a -> a
mask maskSpec original = foldl maskBit original (zip [0..] (reverse maskSpec))
```
In the above code, we have a function which sets, clears, or ignores a specified bit in some value `x`. Then, in the second function, we use a left fold. The three arguments to `foldl` are the `maskBit` function, the `original` value, and a list of the characters in the `maskSpec` string paired with their indexes. (But reversed, so that a spec like `"1X0X"` becomes `[(0,'X'), (1,'0'), (2,'X'), (3,'1')]`.) The left fold will call `maskBit`, passing in the original value and the first element in the list, then pass the result of that back into `maskBit` with the second element in the list, and so on until the list is exhausted.
Take a moment to look at the code and really understand it. Note a few things:
- This code is generic over what type we are setting the bits of. In my full solution for day 14 and in the example, it’s always an integer, but it could be anything with a specified bit structure. Hopefully I’ll get around to writing up why I wrote it this way in a future post.
- The direction of the fold doesn’t really matter here. I could have used a right fold if I rewrote `maskBit` to have its arguments swapped. However, in my actual code, I used `foldl'`, which for operations like this performs much better than `foldl`.
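As a quick sanity check, the worked example from the problem statement can be run through `mask` (the definitions from above are repeated here so the snippet stands alone):

```haskell
import Data.Bits (Bits, clearBit, setBit)
import Data.Foldable (foldl')

-- Same maskBit/mask as above, using foldl' as in my actual solution.
maskBit :: Bits a => a -> (Int, Char) -> a
maskBit x (_, 'X') = x
maskBit x (i, '0') = clearBit x i
maskBit x (i, '1') = setBit x i

mask :: Bits a => String -> a -> a
mask maskSpec original = foldl' maskBit original (zip [0..] (reverse maskSpec))

main :: IO ()
main = print (mask "XXXXXXXXXXXXXXXXXXXXXXXXXXXXX1XXXX0X" (11 :: Int))
-- prints 73, matching the example: clear bit 1 of 1011, then set bit 6
```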
Now let’s see how this operation changes in part 2 of day 14. The meaning of our mask has indeed changed:
If the bitmask bit is 0, the corresponding bit is unchanged.
If the bitmask bit is 1, the corresponding bit is overwritten with 1.
If the bitmask bit is X, the corresponding bit is floating.
A floating bit is not connected to anything and instead fluctuates unpredictably. In practice, this means the floating bits will take on all possible values.
In the example, we see that this means that we no longer apply the mask and get a single value out, but instead we get many values out:
```
value:  000000000000000000000000000000101010  (decimal 42)
mask:   000000000000000000000000000000X1001X
result: 000000000000000000000000000000X1101X
```
Or, written another way:
```
results: 000000000000000000000000000000011010  (decimal 26)
         000000000000000000000000000000011011  (decimal 27)
         000000000000000000000000000000111010  (decimal 58)
         000000000000000000000000000000111011  (decimal 59)
```
Now, how do we need to refactor our code from part 1 to do part 2? Refactoring `maskBit` is fairly easy. Instead of returning one result, we return a list of the possible results.
```haskell
import Data.Bits
import Data.Foldable

maskBit2 :: Bits a => a -> (Int, Char) -> [a]
maskBit2 x (i, 'X') = [setBit x i, clearBit x i]
maskBit2 x (_, '0') = [x]
maskBit2 x (i, '1') = [setBit x i]
```
`mask` is slightly more of a pain. We have to stop using `foldl`, because our `maskBit2` function no longer returns the same type that it takes. Using recursion, we can do the following:
```haskell
import Control.Monad (join)

mask2 :: Bits a => String -> a -> [a]
mask2 maskSpec original = go (length maskSpec - 1) original
  where
    reversed = reverse maskSpec
    go (-1) value = [value]
    go i value =
      let values = maskBit2 value (i, reversed !! i)
      in join (map (go (i - 1)) values)
```
This works just fine, especially since this is Haskell and recursion is sort of expected. Some notes about this implementation:
- `go` is a Haskell convention for helper functions that are declared within the scope of a top-level function. They are often recursive, but need different arguments from the top-level function.
- `join` is equivalent to flatten in many other languages. While it has a highly abstract type, here it specializes to `join :: [[a]] -> [a]`.
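As a check that the recursive version reproduces the worked example, here it is run on the part-2 input (with `maskBit2` repeated so the snippet stands alone; the result order depends on the branch order inside `maskBit2`, so we sort before printing):

```haskell
import Control.Monad (join)
import Data.Bits (Bits, clearBit, setBit)
import Data.List (sort)

maskBit2 :: Bits a => a -> (Int, Char) -> [a]
maskBit2 x (i, 'X') = [setBit x i, clearBit x i]
maskBit2 x (_, '0') = [x]
maskBit2 x (i, '1') = [setBit x i]

-- Recursive mask2: branch at each 'X', flattening with join at every step.
mask2 :: Bits a => String -> a -> [a]
mask2 maskSpec original = go (length maskSpec - 1) original
  where
    reversed = reverse maskSpec
    go (-1) value = [value]
    go i value =
      let values = maskBit2 value (i, reversed !! i)
      in join (map (go (i - 1)) values)

main :: IO ()
main = print (sort (mask2 "000000000000000000000000000000X1001X" (42 :: Int)))
-- prints [26,27,58,59], the four results from the example
```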
However, there’s actually a second definition, which both saves lines of code and time when you’re competing in a coding competition.
```haskell
mask2 :: Bits a => String -> a -> [a]
mask2 maskSpec original = foldlM maskBit2 original (zip [0..] (reverse maskSpec))
```
This is identical to `mask` except for the type, and with `foldl` swapped out for `foldlM`. What is `foldlM` and what does it do?
Well, just like `foldl`, it folds over a list and accumulates a single value. However, unlike `foldl`, its first parameter is a function of type `Monad m => a -> b -> m a` instead of `a -> b -> a`, and it returns `m a` instead of `a`. If you look at Monads from an “effectful computation” point of view, it’s a fold that can perform effects while it folds. For example, you could use this to concatenate the contents of a list of files.
```haskell
concatenated :: [String] -> IO String
concatenated filenames =
  foldlM
    (\string filename -> do
        s <- readFile filename
        return (string ++ s))
    ""
    filenames
```
However, the function we’re passing into `foldlM` here is of type `Bits a => a -> (Int, Char) -> [a]`, which means we’re using the list Monad. For effectful computations, like IO or State, it makes sense to fold over a list while performing some effects. But what does it mean to fold over a list while performing the effects of a list?
Here we have to turn to the fact that lists do represent an effectful computation. Specifically, they represent a branching or forking computation. To demonstrate this, let’s first talk about Maybe’s Monad effects.
Maybe (and Either) represent fallible computations. In the sense of a data structure, they contain a single value or they contain nothing (or an error message). In the computational sense, they are fallible, and using that sense, we can infer what `foldlM` does when used with them. It’s a fold that can fail at any point. So for instance, if we wrote:
```haskell
divideBy :: Int -> [Int] -> Maybe Int
divideBy original divisors = foldlM divOpt original divisors
  where
    divOpt x d =
      if x `mod` d == 0
        then Just (x `div` d)
        else Nothing
```
We can see that this divides the original value by each divisor in turn, but only if each divisor divides the value cleanly. Otherwise, it fails, returning `Nothing`.
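A couple of illustrative runs (the inputs are my own, picked to show both outcomes):

```haskell
import Data.Foldable (foldlM)

divideBy :: Int -> [Int] -> Maybe Int
divideBy original divisors = foldlM divOpt original divisors
  where
    divOpt x d =
      if x `mod` d == 0
        then Just (x `div` d)
        else Nothing

main :: IO ()
main = do
  print (divideBy 12 [2, 3])  -- every division is clean: Just 2
  print (divideBy 12 [5])     -- 5 does not divide 12, the fold fails: Nothing
```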
From this perspective, we can draw parallels to lists. A computation that returns the empty list is just like a computation that returns `Nothing`. No value of type `a` is available, even though the empty list is of type `[a]`, just as `Nothing` is of type `Maybe a`. In the case of the value `Just x`, we have a single value of type `a` available, just like with the singleton list `[x]`. So lists can still model fallible computations. But lists can also have multiple values. What does a computation that yields `[x, y]` mean? It’s a computation that, when completed, has two possible results.
One way of looking at this is the traditional notion of a non-deterministic Turing machine. Unlike a standard Turing machine, a non-deterministic Turing machine can choose to take more than one possible action at a time, branching into many copies of itself and finding all of the results from each possible action. This is actually where the complexity class NP comes from: NP stands for nondeterministic polynomial time, the class of problems that can be solved in polynomial time by such a branching, self-duplicating machine. This perspective gives us another viewpoint if we only want one answer (or only care whether the computation failed or succeeded): we can view a non-deterministic Turing machine as a Turing machine which chooses randomly but is infinitely lucky; it always chooses the right direction to go when given a choice.
Obviously, Haskell programs are usually run on deterministic (a.k.a. normal) Turing machines, so we don’t get the ability to solve NP-complete problems in polynomial time with the list Monad. The program still has to explore every possible choice all the way to the end on a single non-duplicating computer, so we end up taking exponential time to solve NP problems.
But for problems where we have a set of choices, and we want to explore the solutions we get for all possible combinations of choices, the list monad works very well.
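A tiny illustration of that style (a well-known trick, not part of the day 14 solution): `filterM` from `Control.Monad`, given a “predicate” that nondeterministically answers both `True` and `False` for every element, explores every combination of keep/drop decisions, enumerating the powerset.

```haskell
import Control.Monad (filterM)

-- For each element, branch on both keeping it and dropping it.
powerset :: [a] -> [[a]]
powerset = filterM (const [True, False])

main :: IO ()
main = print (powerset [1, 2, 3 :: Int])
-- prints [[1,2,3],[1,2],[1,3],[1],[2,3],[2],[3],[]]
```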
When we use do-notation in Haskell, this is what we’re telling the program:
```haskell
sums = do
  x <- [1, 2, 3, 4]
  y <- [-1, 1, 0]
  return (x + y)
```
We’re saying “choose all possible values for `x`, and all possible values for `y`, and then give me back all possible values for `x + y`.” This can also be interpreted as a pseudo-for loop. For each value `x` in one list, and for each value `y` in another list, compute `x + y`. The only difference between an imperative for loop and this for loop is that this for loop has the idea of returning a value, instead of using side effects to construct another data structure. We can even use the Monadic combinator `guard :: Alternative f => Bool -> f ()`, which is a function that takes a Boolean value and either fails or does nothing, as a `break` equivalent. 2 You might even notice that in the type of `guard`, there’s a typeclass `Alternative`, which is a typeclass for Monads (actually, any Applicative) that can represent fallible nondeterminism. 3
```haskell
inBothLists xs ys = do    -- def inBothLists(xs, ys):
                          --     both = []
  x <- xs                 --     for x in xs:
  guard (x `elem` ys)     --         if x not in ys:
                          --             break
  return x                --         both.append(x)
                          --     return both
```
This behavior is mechanically derivable from the definition of `>>=` for lists, but it’s unenlightening to stare at the proof. Instead, this behavior is best understood as a fusion of the standard imperative for loop and the forking computation model described above.
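To see the loop-plus-guard fusion in action, here is a classic example (my own illustration, not from day 14): enumerating Pythagorean triples by choosing all combinations and pruning the branches that fail the guard.

```haskell
import Control.Monad (guard)

-- All triples a <= b <= c up to n with a^2 + b^2 == c^2.
triples :: Int -> [(Int, Int, Int)]
triples n = do
  a <- [1 .. n]
  b <- [a .. n]
  c <- [b .. n]
  guard (a * a + b * b == c * c)  -- prune every branch that isn't a triple
  return (a, b, c)

main :: IO ()
main = print (triples 15)
-- prints [(3,4,5),(5,12,13),(6,8,10),(9,12,15)]
```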
List is also not the only nondeterministic Monad. Any data structure that contains multiple values is somehow nondeterministic. For example, the non-empty list (`data NonEmpty a = Cons a (NonEmpty a) | End a`) represents a non-fallible nondeterministic monad. The infinite stream of values (`data Stream a = Cons a (Stream a)`) represents a nondeterministic monad where each choice has to have infinite possibilities. The many-branched tree (`data Tree a = Leaf a | Branch [Tree a]`) is a non-fallible nondeterministic monad that keeps track of how it got to each solution. And so on. It would probably be possible to write a Monad which creates a new thread on each choice, giving parallelism to these computations.
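As a sketch of that last idea, the many-branched tree really can be given a Monad instance; binding maps the continuation over every leaf, so the branch structure records each choice point. (This instance is my own illustration, not from the day 14 code.)

```haskell
import Control.Monad (ap)

data Tree a = Leaf a | Branch [Tree a]
  deriving (Show, Eq)

instance Functor Tree where
  fmap f (Leaf a)    = Leaf (f a)
  fmap f (Branch ts) = Branch (map (fmap f) ts)

instance Applicative Tree where
  pure  = Leaf
  (<*>) = ap

instance Monad Tree where
  Leaf a    >>= f = f a                     -- continue from a single result
  Branch ts >>= f = Branch (map (>>= f) ts) -- keep the branch structure

-- A single binary choice, as a two-way branch.
coin :: Tree Int
coin = Branch [Leaf 0, Leaf 1]

main :: IO ()
main = print $ do
  x <- coin
  y <- coin
  return (x + y)
-- prints Branch [Branch [Leaf 0,Leaf 1],Branch [Leaf 1,Leaf 2]]
-- the nesting remembers which choice led to which sum
```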
But the very fact that we understand lists as a Monad gives us so much power. We can use lists as nondeterminism during folds, maps, while looping and recursing, and even manually with `>>=`.
Compare this with other languages. Sure, Rust has Iterators and Java has Streams and Python has generators. But in those languages it’s very rare to see a `foldlM`, probably because it would be used so rarely that it’s better to just write a custom recursive function if you need it. But if you can take the Async and Maybe and List and Exception and State and IO (and …) Monads and abstract them away to one single function, then it might be worth writing. And in some contexts, for some specific Monads, it still appears in some codebases. There’s a library of promise-related functions in the codebase where I work, which reimplements `foldM` in Java, specialized to Iterables (instead of any Foldable) and Promises (instead of any Monad), but with the exact same functionality.
There are plenty already published on the Advent of Code subreddit.↩︎
`()` is the zero-element tuple, also known as `Unit`, which has exactly one value, and therefore is the least interesting type. Compare to the imperative `void`, used for functions which don’t return anything.↩︎
The function we care about, which allows us to fail, is `empty :: Alternative f => f ()`. For lists, `empty = []`, and for `Maybe`, `empty = Nothing`.↩︎