Monads are becoming increasingly popular in the everyday programming practice. As a design pattern, (computational) monads are supported by (more or less pure) functional languages such as Haskell as well as by mainstream languages such us Scala.
Looking at monads in programming, it is almost inevitable to start talking about category theory (CT, for short), often leading to confusion and misunderstanding.
Of course monads in (functional) programming are in general different from monads in CT. Nevertheless, beginners and students in Computer Science tend to underestimate this fact and still use ideas and constructions coming from functional programming without having a proper awareness of the theoretical assumptions behind them.
In this post I want to focus on the confusion one usually finds when dealing with the mathematical counterparts of the so-called bind and join combinators. In fact, working in categories modelling specific programming languages, sometimes one tends (at least at first) to identify Haskell-like notions of bind as Kleisli extensions. This is the case in some familiar categories, like the category of sets and functions. However, ‘Haskell bind’ and Kleisli extensions are, in general, different concepts. This is easy to observe. Nevertheless, I found few references explicitly mentioned this fact. Most importantly, it is rarely touched the issue of what is a notion of bind from an appropriate mathematical perspective.
In this post I try to address such issue, and I do so using (very) basic notions of enriched category theory.
The language of enriched category theory provides a useful vocabulary to define a proper notion of bind and to stress the difference with the notion of Klesili extension. Moreover, enriched category theory is a rich and powerful framework for programming language theory, and it is becoming increasingly popular in the theoretical community. Therefore, it is a good idea to start getting familiar with it.
The intended audience for this post is made of students and beginners with some knowledge of basic category theory, but not so much experience in the field.
Monads in Category Theory and Programming
Let us start by looking at monads in Haskell. In Haskell a monad is defined as a type constructor together with the two combinators:
where reads as ‘bind’ (here I have slightly modified the type of which is often given as ). We follow the convention of writing in place of . Moreover, these two combinators are assumed to satisfy suitable equational laws (due to space constraints I will often omit equational laws/coherence conditions from definitions).
Equivalently, a monad can be given as a type constructors together with constructors
subject to suitable equational laws. In Haskell these two presentations are equivalent, meaning that given and (and their equational laws) it is possible define the combinators , , and . Note that we take the combinator (which is supposed to encode some notion of functoriality) as part of the definition of monad.
What is the relation between the above notion of monad and the usual notion of monad in CT (if any)?
Recall that given a category , a monad on is defined as a triple , where is an endofunctor on , (the unit) is a natural transformation with components , and (the multiplication) is a natural transformation with components . Moreover, and are required to satisfy suitable coherence conditions. The (informal) similarity with the combinators and is evident. Moreover, being an endufunctor, we find a `justification’ for including the combinator in the above list of combinators.
Monads can also be defined in a more ‘combinatorial’ way as Kleisli triples. A Kleisli triple is a triple consisting of:
- A map from objects of to objects of .
- For each object of , a morphism in .
- For each arrow in , an arrow in .
The arrow is called the Kleisli extension (or lifting) of , and thus we refer to the mapping as Kleisli extension. Note that Kleisli extensions are not not functions. These are maps from collections of arrows to collection of arrows, and these collections are not sets. We can follow Borceaux’s Handbook of Categorical Algebra and build CT on classes, thus assuming a ‘meta-level’ of classes to define maps from and to collections of objects and arrows.
Coming back to Kleisli triples, the analogy with the combinators and is evident. Following those analogies, the pairwise equivalences between definitions of monad, and the imprecise (and sometimes improper) terminology one is tempted to believe that every monad has a notion of bind. That is, it is taken as valid the following correspondence tables:
This is the case for instance in , the category of sets and functions, where it is possible to define (using, among other principles, extensionality) in terms of and viceversa (define as ).
However, having binds (although we still have to formally define them) is in general much stronger than having Kleisli extensions. In a slogan, a bind is an internalised Kleisli extension.
Monads and Enriched Category Theory
Enough with slogans: let us do some math! We know that the claimed categorical model behind Haskell, namely the ‘category’ Hask, is not a category (Andrej Bauer wrote a really nice post on that). So we will not work with Hask (that is simply not possible). Instead, we work with an arbitrary category requiring some minimal desiderata.
We use the vocabulary of enriched category theory since it perfectly suits our purposes.
Enriched CT can be motivated by the observation that in many situations the collection of arrows between two objects carries an interesting structure. For instance, in categories of domains (e.g. dcpos) the set of continuous functions from two domains and is usually a domain itself. It thus makes sense to move from categories in which hom sets are just sets to categories that have hom objects. That is, categories in which the ‘collection’ of arrows from two objects is not a set (or a class) but an object of a given category.
The resulting notion is that of a category enriched over an (ordinary) category . Formally, such underlying category must have enough structure to guarantee a suitable notion of composition and an identity arrow. For that it is enough to require to be a monoidal category.
In the following for a category we denote by and its underlying collections of objects and arrows, respectively.
Definition. Let be a monoidal category with tensor and unit . A category enriched over , or a -cat, consists of the following data:
- A collection of objects, denoted by .
- For each pair of objects in , an object in .
- For each object in , an arrow in .
- For each triple of objects in , an arrow in .
Again, we require suitable coherence conditions between the arrows and and the monoidal structure of , so to ensure that indeed and behave as the ‘identity arrow’ and the ‘composition operator’. Note that much in the same way objects need not be sets in ordinary CT (and thus we cannot speak of ‘elements of an object’), hom objects need not be sets in enriched CT (and thus we cannot speak of ‘elements of an hom object’–which, morally, would be nothing but arrows). Nevertheless, the monoidal structure of allows us to bypass this problem, since we can simulate a desired arrow in as a generalized element in , i.e. an arrow in .
Here are some examples of enriched categories:
- The category of locally small categories is enriched over .
- Any poset can be viewed as a category enriched over the truth value boolean algebra (regarded as a monoidal category with tensor product given by conjunction).
- A generalized metric space (i.e. a set equipped with a function satisfying the equality and the usual triangle inequality) can be viewed as a category enriched over the positive extended real line with addition as tensor product.
\item The last two examples were studied by Lawvere in his seminal paper Metric Spaces, Generalized Logic, and Closed Categories and led to the study of categories enriched over a quantale, which itself led to the field of monoidal topology.
- The category of sets and functions is enriched over itself.
The grammar of enriched CT shows us that cartesian closed categories are just categories enriched over themselves. This is of particular importance for our purposes. Let us say we want to model a programming language as a (syntactic) category . The objects of are the types of , whereas an arrow from to is just a program (or an equivalence class of programs) that takes an input of type and returns an output of type . The language is higher-order if programs are data, meaning that we have a type whose inhabitants are de facto arrows from to . That is, the category internalises its hom set (which, remember, need not be a set! That depends on whether the objects of are sets or not). Formally, is enriched over itself.
Example. As an example of a category enriched over itself let us consider the category of posets and monotone functions. It is easy to see that the set of monotone functions from to is a poset according to the pointwise order (i.e. iff for all , holds).The interesting part is that we can internalise composition. Define the function by , where denotes the product of posets. It is easy to prove that is monotone, meaning that composition can be internalised as an arrow in .
Although not strictly necessary, it is convenient to introduce the notion of enriched functor, so to acquire some confidence with `enriched reasoning’.
Definition. Let be categories enriched over . An enriched functor from to consists of a mapping from objects of to objects of together with an arrow in for any pair of objects in , satisfying suitable coherence conditions.
Here comes a crucial point. Let be enriched over itself, and consider an endofunctor on it. Then we have arrows in itself. The usual definition of functor for ordinary categories gives a mapping from arrows in to arrows in at the ‘meta-level’ (such a mapping is a ‘function’ in some universe of classes). If is enriched over itself, then provides its own ‘meta-level’, so to say. This is exactly what happens with Haskell’s (or actually what would happen if Haskell were a category).
Example. Let us consider an endofuctor on . We have already seen that is enriched over itself, so that for two posets and , is a poset and thus an object in . Now, is not an enriched functor (over ), in general. That would require the mapping to be an arrow in , and thus to be monotone. What we know is that for a monotone function , is monotone too, but that does not allow us to conclude that if in , then in .
With one last effort we can now come up with an enriched version of the notion of Kleisli triple.
Definition. Let be enriched over . An enriched Kleisli triple on consists of:
- A maping from objects of to objects of .
- For any object in , an arrow in .
- For all objects in , an arrow in .
As usual, we require these maps to obey suitable coherence conditions. Finally, we have our bind combinators. It’s just the enriched counterpart of Kleisli lifiting. That is, the bind operator is nothing but the Kleisli extension operator when is enriched over itself and is an enriched monad.
Note that having a bind operator is a much stronger condition than having Kleisli extensions. For instance, let us consider the category of -cpos with Scott continuous functions (i.e. functions preserving -chains). It is easy to see that is enriched over itself, so that is an -cppo, for -cppos. Let be a monad on . For any continuous function , we have a continuous function , meaning that the equality
holds. Requiring to be enriched would amount (among others) to require to be itself continuous, i.e.
Concluding, if you look at `categorical notions’ in functional programming (especially in the Haskell world) remember that usually there is an underlying (strong) assumption of self-enrichment. It is that assumption that allows improper analogies between Kleisli liftings and the bind combinators.
Of course what I wrote is just a way to see things. One could also be more `practical’ and simply observe that for a monad on a category , Kleisli lifting internalises (thus giving a bind arrow) whenever is strong and is cartesian closed. I think that working in programming language theory, the enriched perspective is quite useful and informative, and thus I decided to follow that approach.
I would like to thank Saverio Giallorenzo and Valeria Vignudelli for their useful suggestions.