Understanding a catamorphism (by writing a simple one)
A lot of the time, people haven't heard of catamorphisms, or misunderstand what it is. Let's make it simple, and better yet, let's write one on our own!
In preparation for another post, I showed a draft of some TypeScript code to a colleague, and they were surprised that we could write a boolean a and b
as if a then b else false
, which sparked the idea for this post. If we didn't have a boolean
type in TypeScript, can we create our own? How would we then define operators like and
, or
, and not
? In this post, we'll explore a few things, ending up with a catamorphism
of a boolean
type.
If you've never heard of a catamorphism
before, don't worry, we explain that soon enough.
The easiest place to start with is a discriminated union. A discriminated union, also called a sum type, is a type which has a value of one of its several variant types. We use a field called a discriminant to know which variant we're dealing with. We can think of our Bool
type as a discriminated union, which has a value of one of its 2 types (variants): True
and False
. We can use a simple string as a discriminant.
NOTE: It's highly encouraged to follow along by typing the code into an editor, or via TS Playground
Types and constructors
So True
and False
are types, the union of which is another type Bool
. To create the types True
and False
, Let's create a simple type called Tagged
which has a specific literal string as a tag
inside an object. We can use this tag
as a discriminant. This is one of the ways to do that in TypeScript:
We have now mentioned that a Tagged
type has a generic type argument called Tag
which should be a string
- in our case it's a specific literal string. And then we have created a simple tagged
function that's generic on the Tag
to create a Tagged<Tag>
value.
We can use our tagged implementation to create 2 different tagged types, one for each variant of our discriminated union. We then create a union type for those:
We also create 2 values of type Bool
: a True
and a False
. Note that these are not typed as True
and False
, but as Bool
. This is because for a consumer of our public API we don't want there to be a differentiation between True
and False
, both of them are simply of type Bool
. All further functions we write will only work on Bool
. True
and False
are internal types, simply for us to create the union. To put another way, we could've defined Bool
without the explicit True
and False
types:
We use as Bool
because if we don't, TypeScript will infer the types to be more constrained for True
and False
, which we want to avoid. They're both Bool
values, for all intents and purposes.
Operators
We are all commonly used to seeing some boolean operators like and
, or
, and not
, so it only makes sense we're able to create them. To know what variant of a Bool
we're handling, we can use the discriminant tag
. Let's implement some operators.
And
We can say that an and
is a binary function that takes 2 Bool
s and returns a Bool
:
NOTE: This is a curried function. A curried function is a multivariate (having multiple variables) function that takes one argument at a time, returning other such functions until there are no more arguments left to take, at which point it returns the result. Most of the other functions I discuss in this post are also curried. This enhances reusability, composability, and expressiveness. If you have difficulties reading curried functions, please refer to my post on why I write curried functions.
When a
is true
, we can just return b
(which is a Bool
), and when a
is false
, we can return False
(which is also a Bool
). Note that we can also return a
instead of False
given a
is False
in the second case. This is the case where a and b
becomes if a then b else false
which I alluded to above. and
is also commutative (which means the ordering of arguments doesn't matter), so we can also swap the a
and the b
to write it as:
Or
Let's carry on, making an or
:
That should be simple to follow, now we say that when a
is true
, we just return True
(which is a Bool
) and otherwise, we return b
(which is a Bool
) - so we always return a Bool
, satisfying the or
logic. We also swap the order of arguments (or
is also commutative), but it works the same. Similar to the and
case where we could return a
when it's false
, we can also return a
when it's true
for the or
case. Instead, we return the value.
Not
Moving on to not
:
This is simple: we just swap the boolean values. Okay, we now have boolean constructors (values), and some operators, but how to use them?
Transformations
For our Bool
to be useful, it needs to be converted into other types. For example, let's say depending on whether it is True
or False
we need to make an action (execute a function). The function we call needs to be different based on whether it's true or false. In another scenario, we may want to display to the user the state that a switch is in, using our Bool
type: to say whether it's ON or OFF. This is a case where we want to convert our Bool
value into a string
, while the previous case warranted a conversion to a function. So in order to use our Bool
we need to be able to transform it to other types. And a catamorphism will allow us to do exactly that.
Catamorphism
There's several definitions of a catamorphism, but I'm going to stay away from theory. A lot of people think catamorphisms are like folds for linear/recursive structures, but it's more of a general concept. A catamorphism is a function that allows you to transform a type structurally into another type using consistent rules for the transformation. That should give us enough to go on. To recap, a catamorphism:
- allows transformation from a type to any other type,
- depends on the structure of the input type, and
- has consistent rules for the transformation.
The way the rules are consistent are important, but are more applicable for recursive structures. For now let's just say these are critical, and ensure we create a catamorphism that reflects the above.
Given that our catamorphism should allow a transformation to any other type, it has to be a generic function, with the resulting type as a type argument. What are the arguments to this function? There are 2 variants for which the resulting value in the specified type needs to be provided, so they become obvious arguments to the function. Then there's the case of the Bool
value itself that needs to be transformed. That's 3 arguments. What will be the ideal ordering of function arguments? We can apply the rule discussed earlier (in the linked post) to arrive at the ideal order.
For this we need to consider how the catamorphism will be used. Obviously the Bool
value is the last - since it's the data structure under manipulation, and also because it's possible that the same transformation will be applied multiple times to different Bool
values. What about the values for the 2 variants? Which one will come first? If it's not clear, I'd recommend going through the referenced post on the ordering of function arguments, and giving it a try. For this post, I'll be proceeding with the definition with the ideal order:
We first take the false
value, and capture its type. The true
value that comes next has to be of the same type. Then we finally take Bool
value and decide which of the 2 values to return, based on the variant type. If it's false we return the first, and if it's true we return the second value of the type.
Is our cata
a catamorphism?
We made some statements about a catamorphism before we started, so it's a good time to check whether our cata
implementation holds:
- allows transformation from a type to any other type - yes, because we made a generic function
- depends on the structure of the input type - yes, it depends on the structure of the
Bool
type - has consistent rules for transformation - yes, depending on the structure (true/false) there's deterministic and consistent output.
Great, so we've built ourselves a catamorphism! In order to see how the cata
function we defined earlier is useful, let's see a few sample transformations.
Converting to a status string
Earlier, we mentioned the case of converting a Bool
value that represents a switch, to a status string, and here's an implementation using cata
:
We may also want to share a different message when true
:
That's one case where the false
value being first helps in partial application:
Rewriting operators
A catamorphism is the simplest function needed for any type, as virtually any other function can be (but need not be) written on top of it. As an example, we can rewrite our and
using cata
:
While that looks surprisingly terse, that's exactly what our and
did earlier, now defined in terms of our cata
. Notice how the cata
can transform a Bool
into any other type, including Bool
itself! Recall cata
has been called with False
which fixes the Type
as Bool
, so the next argument that's expected is the true
value which has to be of type Bool
, and then it would take the Bool
to transform. Therefore, the signature of and
is the same as before:
Transforming into a boolean
Given we have a native boolean
type in TypeScript, and our cata
definition, it's extremely simple to write a transformation into a boolean:
Exercise
It's time to exercise. Can you write:
- a
fromBoolean
which converts aboolean
value into aBool
? - a
toBit
which converts aBool
intotype Bit = 0 | 1
? or
withcata
?not
withcata
?
Conclusion
You might've noticed that our cata
looks very similar to an if/then/else or a ternary (?:
) operator, with the operands reversed. I'd suggest to you that it is by no means an accident. The ternary operator is one of the best ways to make use of a boolean
type. No wonder our cata
closely reflects that in terms of usefulness. Can you think of why cata
is also a building block for something like an if
?
Given a boolean
primitive type already exists in TypeScript, what's the point of writing all this in the first place? The code here is merely an exercise to understand creating our own types and allowing transformations using catamorphisms. I primarily wanted to talk about a few things: that we can define our own boolean type in TypeScript, and that a catamorphism exists for such a boolean type, and that it is sufficient to create any further transformation, and that there are more than one catamorphisms for even types as simple as a boolean.
In a later post, I'd like to talk about a boolean type where members of the type are just functions (instead of values like above). I hope you're as excited as I am!