A Short Note on Monads in Programming and Enriched Category Theory

Monads are becoming increasingly popular in the everyday programming practice.  As a design pattern, (computational) monads are supported by (more or less pure) functional languages such as Haskell as well as by mainstream languages such us Scala.

Looking at monads in programming, it is almost inevitable to start talking about category theory (CT, for short), often leading to confusion and misunderstanding.
Of course monads in (functional) programming are in general different from monads in CT. Nevertheless, beginners and students in Computer Science tend to underestimate this fact and still use ideas and constructions coming from functional programming without having a proper awareness of the theoretical assumptions behind them.

In this post I want to focus on the confusion one usually finds when dealing with the mathematical counterparts of the so-called bind and join combinators. In fact, working in categories modelling specific programming languages, sometimes one tends (at least at first) to identify Haskell-like notions of bind as Kleisli extensions. This is the case in some familiar categories, like the category of sets and functions. However, ‘Haskell bind’ and Kleisli extensions are, in general, different concepts. This is easy to observe. Nevertheless, I found few references explicitly mentioned this fact. Most importantly, it is rarely touched the issue of what is a notion of bind from an appropriate mathematical perspective.

In this post I try to address such issue, and I do so using (very) basic notions of enriched category theory.
The language of enriched category theory provides a useful vocabulary to define a proper notion of bind and to stress the difference with the notion of Klesili extension. Moreover, enriched category theory is a rich and powerful framework for programming language theory, and it is becoming increasingly popular in the theoretical community. Therefore, it is a good idea to start getting familiar with it.

Audience

The intended audience for this post is made of students and beginners with some knowledge of basic category theory, but not so much experience in the field.

Monads in Category Theory and Programming

Let us start by looking at monads in Haskell. In Haskell a monad is defined as a type constructor \hask{m} together with the two combinators:

    \begin{align*} \mathsf{return}\ & \mathsf{:: a \to m\ a} \\ \bind\ & \mathsf{:: (a \to m\ b) \to (m\ a \to m\ b)} \end{align*}

where \bind reads as ‘bind’ (here I have slightly modified the type of \bind which is often given as \hask{m\ a \to (a\ \to m\ b) \to m\ b}). We follow the convention of writing u \bind f in place of \bind(f)(u). Moreover, these two combinators are assumed to satisfy suitable equational laws (due to space constraints I will often omit equational laws/coherence conditions from definitions).
Equivalently, a monad can be given as a type constructors \hask{m} together with constructors

    \begin{align*} \hask{fmap}\ &\hask{:: (a \to b) \to (m\ a \to m\ b)} \\ \hask{return}\ &\hask{:: a \to m\ a} \\ \hask{join}\ &\hask{:: m\ (m\ a) \to m\ a} \end{align*}

subject to suitable equational laws. In Haskell these two presentations are equivalent, meaning that given \bind and \hask{return} (and their equational laws) it is possible define the combinators \hask{fmap}\hask{return}, and \hask{join}. Note that we take the combinator \hask{fmap} (which is supposed to encode some notion of functoriality) as part of the definition of monad.

What is the relation between the above notion of monad and the usual notion of monad in CT (if any)?

Recall that given a category \catone, a monad on \catone is defined as a triple \bm{T} = \langle \monad, \eta, \mu \rangle, where \monad is an endofunctor on \catone, \eta (the unit) is a natural transformation with components \eta_X: X \to \monad X, and \mu (the multiplication) is a natural transformation with components \mu_X : T(T X) \to TX. Moreover, \eta and \mu are required to satisfy suitable coherence conditions. The (informal) similarity with the combinators \hask{return} and \hask{join} is evident. Moreover, T being an endufunctor, we find a `justification’ for including the combinator \hask{fmap} in the above list of combinators.
Monads can also be defined in a more ‘combinatorial’ way as Kleisli triples. A Kleisli triple is a triple \langle T, \eta, \kleisli{(\cdot)}\rangle consisting  of:

  • A map T from objects of \catone to objects of \catone.
  • For each object X of \catone, a morphism \eta_X: X \to TX in \catone.
  • For each arrow f : X \to TY in \catone, an arrow \kleisli{f}: TX \to TY in \catone.

The arrow \kleisli{f} is called the Kleisli extension (or lifting) of f, and thus we refer to the mapping \kleisli{(\cdot)} as Kleisli extension. Note that Kleisli extensions are not not functions. These are maps from collections of arrows to collection of arrows, and these collections are not sets. We can follow Borceaux’s Handbook of Categorical Algebra and build CT on classes, thus assuming a ‘meta-level’ of classes to define maps from and to collections of objects and arrows.
Coming back to Kleisli triples, the analogy with the combinators \hask{return} and \bind is evident. Following those analogies, the pairwise equivalences between definitions of monad, and the imprecise (and sometimes improper) terminology one is tempted to believe that every monad has a notion of bind. That is, it is taken as valid the following correspondence tables:

This is the case for instance in \set, the category of sets and functions, where it is possible to define (using, among other principles, extensionality) \bind in terms of \kleisli{(\cdot)} and viceversa (define u \bind f as \kleisli{f}(u)).
However, having binds (although we still have to formally define them) is in general much stronger than having Kleisli extensions. In a slogan, a bind is an internalised Kleisli extension.

Monads and Enriched Category Theory

Enough with slogans: let us do some math! We know that the claimed categorical model behind Haskell, namely the ‘category’ Hask, is not a category (Andrej Bauer wrote a really nice post on that). So we will not work with Hask (that is simply not possible). Instead, we work with an arbitrary category \catone requiring some minimal desiderata.
We use the vocabulary of enriched category theory since it perfectly suits our purposes.

Enriched  CT can be motivated by the observation that in many situations the collection of arrows between two objects carries an interesting structure. For instance, in categories of domains (e.g. dcpos) the set of continuous functions from two domains D and E is usually a domain itself. It thus makes sense to  move from categories in which hom sets are just sets to categories that have hom objects. That is, categories in which the ‘collection’ of arrows from two objects is not a set (or a class) but an object of a given category.
The resulting notion is that of a category enriched over an (ordinary) category \quantale. Formally, such underlying category \quantale must have enough structure to guarantee a suitable notion of composition and an identity arrow. For that it is enough to require \quantale to be a monoidal category.
In the following for a category \catone we denote by \catone_0 and \catone_1 its underlying collections of objects and arrows, respectively.

Definition. Let \quantale be a monoidal category with tensor \tensor and unit I. A category \catone enriched over \quantale, or a \quantale-cat, consists of the following data:

  • A collection \catone_0 of objects, denoted by X, Y, Z, \hh.
  •  For each pair of objects X,Y in \catone_0, an object \catone(X,Y) in  \quantale_0.
  • For each object X in \catone_0, an arrow  J_X : I \to \catone(X,X) in \quantale_1.
  • For each triple of objects X,Y,Z in \catone_0, an arrow  M_{X,Y,Z}: \catone(Y,Z) \tensor \catone(X,Y) \to \catone(X,Z) in \quantale_1.

Again, we require suitable coherence conditions between the arrows M and J and the monoidal structure of \quantale, so to ensure that indeed J and M behave as the ‘identity arrow’ and the ‘composition operator’. Note that much in the same way objects need not be sets in ordinary CT (and thus we cannot speak of ‘elements of an object’), hom objects need not be sets in enriched CT  (and thus we cannot speak of ‘elements of an hom object’–which, morally, would be nothing but arrows). Nevertheless, the monoidal structure of \quantale allows us to bypass this problem, since we can simulate a desired arrow f in \catone(X,Y) as a generalized element in \quantale, i.e. an arrow f: I \to \catone(X,Y) in \quantale_1.

Here are some examples of enriched categories:

  • The category \Cat of locally small categories is enriched over \set.
  • Any poset P can be viewed as a category enriched over the truth value boolean algebra  \two (regarded as a monoidal category with tensor product given by conjunction).
  •  A generalized metric space (i.e. a set X equipped with a function d: X \times X \to [0,\infty]  satisfying the equality d(x,x) = 0 and the usual triangle inequality) can be viewed as a category enriched  over the positive extended real line [0,\infty] with addition as tensor product.
    The last two examples were studied by Lawvere in his seminal paper Metric Spaces, Generalized Logic, and Closed Categories and led to the study of categories enriched  over a quantale, which itself led to the field of monoidal topology.
  • The category \set of sets and functions is enriched over itself.

The grammar of enriched CT shows us that cartesian closed categories are just categories enriched over themselves. This is of particular importance for our purposes. Let us say we want to model a programming language \mathcal{L} as a (syntactic) category \catone. The objects of \catone are the types of \mathcal{L}, whereas an arrow from X to Y is just a program (or an equivalence class of programs) that takes an input of type X and returns an output of type Y. The language \mathcal{L} is higher-order if programs are data, meaning that we have a type \catone(X,Y) whose inhabitants are de facto arrows from X to Y. That is, the category \catone internalises its hom set (which, remember, need not be a set! That depends on whether the objects of \catone are sets or not). Formally, \catone is enriched over itself.

Example. As an example of a category enriched over itself let us consider the category \pos of posets and monotone functions. It is easy to see that the set \pos(P,Q) of monotone functions from P to Q is a poset according to the pointwise order (i.e. f \leq g iff for all p \in P, f(p) \leq g(p) holds).The interesting part is that we can internalise composition. Define the function M_{P, Q, R}: \pos(Q,R), \times \pos(P,Q) \to \pos(P,R) by M(f,g) = g \circ f, where \times denotes the product of posets. It is easy to prove that M_{P,Q,R} is monotone, meaning that composition can be internalised as an arrow in \pos.

Although not strictly necessary, it is convenient to introduce the notion of enriched functor, so to acquire some confidence with `enriched reasoning’.

Definition. Let \catone, \cattwo be categories enriched over \quantale. An enriched functor  from \catone to \cattwo consists of a mapping F from objects of \catone to  objects of \cattwo together with an arrow F_{X,Y} : \catone(X,Y) \to \cattwo(FX, FY)  in \quantale_1 for any pair of objects X,Y in \catone_0, satisfying suitable coherence  conditions.

Here comes a crucial point. Let \catone be enriched over itself, and consider an endofunctor F on it. Then we have arrows F_{X,Y}: \catone(X,Y) \to \catone(FX, FY) in \catone itself. The usual definition of functor for ordinary categories gives a mapping from arrows in \catone(X,Y) to arrows in \catone(FX, FY) at the ‘meta-level’ (such a mapping is a ‘function’ in some universe of classes). If \catone is enriched over itself, then \catone  provides its own ‘meta-level’, so to say. This is exactly what happens with Haskell’s \hask{fmap} (or actually what would happen if Haskell were a category).

Example. Let us consider an endofuctor F on \pos. We have already seen that \pos is enriched over itself, so that for two posets P and Q, \pos(P,Q) is a poset and thus an object in \pos_0. Now, F is not an enriched functor (over \pos), in general. That would require the mapping F_{P,Q}: \pos(P,Q) \to \pos(F P, F Q) to be an arrow in \pos_1, and thus to be monotone. What we know is that for a monotone function f : P \to Q, F f: F P \to F Q is monotone too, but that does not allow us to conclude that if f \leq g in \pos(P,Q), then Ff \leq Fg in \pos(FP, FQ).

With one last effort we can now come up with an enriched version of the notion of Kleisli triple.

Definition. Let \catone be enriched over \quantale. An enriched Kleisli triple on \catone consists of:

  • A maping T from objects of \catone to objects of \catone.
  • For any object X in \catone_0, an arrow \eta_X: I \to \catone(X, TX) in \quantale_1.
  • For all objects X,Y in \catone_0, an arrow \kleisli{(\cdot)} : \catone(X, TY) \to \catone(TX, TY) in \quantale_1.

As usual, we require these maps to obey suitable coherence conditions. Finally, we have our bind combinators. It’s just the enriched counterpart of Kleisli lifiting. That is, the bind operator is nothing but the Kleisli extension operator when \catone is enriched over itself and T is an enriched monad.

Note that having a bind operator is a much stronger condition than having Kleisli extensions. For instance, let us consider the category \ocppo of \omega-cpos with Scott continuous functions (i.e. functions preserving \omega-chains). It is easy to see that \ocppo is enriched over itself, so that \ocppo(X,Y) is an \omega-cppo, for X,Y \omega-cppos. Let T be a monad on \ocppo. For any continuous function f \in \ocppo(X,TY), we have a continuous function \kleisli{f} \in \ocppo(TX, TX), meaning that the equality 

    \[\kleisli{f}(\lub_{n < \omega} u_n) = \lub_{n < \omega} \kleisli{f}(u_n)\]

holds. Requiring T to be \ocppo enriched would amount (among others) to require \kleisli{(\cdot)} to be itself continuous, i.e. 

    \[\kleisli{(\lub_{n < \omega} f_n)} = \lub_{n < \omega} \kleisli{f_n}\]

to hold.

Conclusions

Concluding, if you look at `categorical notions’ in functional programming (especially in the Haskell world) remember that usually there is an underlying (strong) assumption of self-enrichment. It is that assumption that allows improper analogies between Kleisli liftings and the bind combinators.
Of course what I wrote is just a way to see things. One could also be more `practical’ and simply observe that for a monad T on a category \catone, Kleisli lifting internalises (thus giving a bind arrow) whenever T is strong and \catone is cartesian closed. I think that working in programming language theory, the enriched perspective is quite useful and informative, and thus I decided to follow that approach.

Acknowledgements

I would like to thank Saverio Giallorenzo and Valeria Vignudelli for their useful suggestions.

    \[\text{ \LARGE  \Coffeecup \Coffeecup \Coffeecup }\]

Leave a Reply