Free constructions #1 - Commutative Monoids & Multisets
Let’s take a binary operation that is associative and commutative, like addition on numbers.
In case you forgot, addition is associative because you can swap brackets around or leave them out altoghether without changing the result, like . And it is commutative because you can swap the order of the numbers too, like .
Let’s also remember that these operations have a special value called the identity element or unit, that doesn’t change the result when you plug it into your operation. For addition, the identity element is , because , and for multiplication, the identity element is , because .
We call this — a binary, associative, commutative, and unital operation like addition, on some set like numbers — a commutative monoid by the way. Addition on numbers is a commutative monoid for example, multiplication on numbers is another example. Or, so is , the commutative monoid of boolean values with the logical AND operation. Try it!
Now, say that instead of just calculating the result, we would like to keep track of what numbers were added together, so that we can reconstruct the result later.
For associative and commutative operations, the “smallest” possible datatype that can store that information is a multiset, which is like a regular set, but allows for multiple occurrences of the same element.
For example, the addition expression could be represented, in (almost) typescript, like so:
type Multiset<T> = Map<T, number>;
const multiset: Multiset<number> = new Map([
[1, 3],
[2, 1],
[3, 1],
[4, 2],
]);
That is, three ones, one two, one three, and two fours.
And we can then reconstruct the addition expression by iterating over the multiset and adding the numbers together, like so:
type BinaryOperation<T> = (a: T, b: T) => T;
const add: BinaryOperation<number> = (a, b) => a + b;
const evaluate = <T>(
multiset: Multiset<T>,
operation: BinaryOperation<T>,
initial: T
): T => {
let result = initial;
for (const [value, count] of multiset.entries()) {
for (let i = 0; i < count; i++) {
result = operation(result, value);
}
}
return result;
};
const result = evaluate(multiset, add, 0); // 16
I’ve also introduced a type BinaryOperation<T>
and
abstracted the addition away, since technically the multiset
doesn’t store any information about the operation we use to
evaluate it. The multiset for the expression
would
look exactly the same, and we could use the same evaluate
function and pass (a, b) => a * b
as the operation to get the
result (96).
That makes the multiset rather special! Like, given just the result of the addition (16), there would be no way to reconstruct the result of the multiplication (96) from it, because solving the addition problem destroyed information about the numbers that went in, in a way that the multiset does not.
And in a way the multiset doesn’t evaluate anything, it just keeps track of what is supposed to happen.
Not only that, we can also define a binary operation on multisets themselves that is both associative and commutative. Look at this:
const merge = <T>(a: Multiset<T>, b: Multiset<T>): Multiset<T> => {
const result = new Map<T, number>(a);
for (const [value, count] of b.entries()) {
result.set(value, (result.get(value) ?? 0) + count);
}
return result;
};
And ok, I will only do this once, but lets take our original addition problem again and move it into the realm of multisets:
// helper function to not go insane
const lift = <T>(value: T): Multiset<T> => new Map([[value, 1]]);
const expression = merge(
merge(
merge(merge(lift(1), lift(1)), merge(lift(1), lift(2))),
merge(lift(3), merge(lift(4), lift(4)))
)
);
const result = evaluate(expression, add); // 16
I’ve nested the merge calls rather arbitrarily, but that’s okay since our operation is associative, meaning bracketing doesn’t matter!
Exercises
-
Go ahead an convince yourself that
expression
and our original addition problem are equivalent. To do that, just replace each call tomerge
with+
, andlift
with nothing I guess. -
Convince yourself that the
merge
operation is associative. Do that by taking some multisetsa
,b
, andc
, and showing thatmerge(merge(a, b), c)
is the same asmerge(a, merge(b, c))
. -
Convince yourself that the
merge
operation is commutative. Do that by taking some multisetsa
andb
, and showing thatmerge(a, b)
is the same asmerge(b, a)
.
Final thoughts
There’s three things I’d like you to take away from this:
- We have found a way to decouple data from evaluation. We’ve
stated the addition problem in a way that just stores all the
data, but does not compute anything until we call
evaluate
. - The multiset is like the mother-of-all-commutative monoids.
Given a multiset, we can “move” into the addition commutative
monoid just by calling
evaluate
with the addition operation. And we can move into the multiplication commutative monoid by callingevaluate
with the multiplication operation. We can’t move from the addition commutative monoid to the multiplication commutative monoid this way, i.e. we can’t reconstruct 96 from 16.
And most importantly.
Not all types have well-known commutative monoids, like
numbers have addition or multiplication. You can’t add two
objects of type DatabaseRepository
or DogApple
, generally
speaking. But, given any type T
, we always know there is at
least one commutative monoid: Multiset<T>
.
This is why we call Multiset<T>
the free commutative
monoid on type T
.
We get it for free so to speak, it just always exists.
(This blog post is continued in part 2 - Monoids & Lists.)