An Elegant Stack Machine

Introduction
Lately I’ve been fascinated with simple models of computation.
It’s astounding to me that systems like the lambda calculus and SKI combinators, defined in just a few lines, can express any computable function.
But there’s one thing in particular that has really surprised me: how much you can achieve with categorical products.
In this article, we’ll explore how to build a Joy-like stack machine from simple primitives.
A Quick Note
The code in this article is written in Haskell, but it should be easy to follow even if you’re new to the language.
Here’s a primer:
Syntax For | Python | JavaScript | Haskell |
Function application | f(x, y) | f(x, y); | f x y |
Function composition | h = lambda x: g(f(x)) | const h = x => g(f(x)); | h = g . f |
For code examples, I’ll use λ>
to indicate REPL input, and show the printed output below it:
λ> "Hello, " ++ "World!"
"Hello, World!"
You can try Haskell here if you’re curious.
Products
We’ll start by defining a data type for the terminal object, Unit
:
data Unit = Unit
instance Show Unit where
show Unit = "()"
Which shows as follows:
λ> Unit
()
And a Product
type to represent the categorical product:
data Product a b = Product a b
instance (Show a, Show b) => Show (Product a b) where
show (Product a b) = "(" ++ show a ++ " >< " ++ show b ++ ")"
infixr 5 ><
(><) :: a -> b -> Product a b
(><) = Product
The last part defines a right-associative operator (><
), so we can use math-like syntax for \(a \times b\):
λ> 'a' >< 'b'
('a' >< 'b')
λ> 'a' >< 'b' >< 'c'
('a' >< ('b' >< 'c'))
Those of you familiar with Lisp may recognise this structure as an S-expression.
As McCarthy said in his seminal paper:
An S-expression is then simply an ordered pair…
We can can represent a list of arbitrary length in terms of S-expressions as follows.
The list
(m1, m2, ··· , mn)
is represented by the S-expression(m1 · (m2 · (···(mn · NIL)···)))
Here NIL is an atomic symbol used to terminate lists.
In the same spirit, we can think of our nested binary products as heterogeneous lists, with Unit
playing the role of NIL
.
The head of the list is the first element of the pair, and the tail is the second.
And if we can represent a list, we can represent a stack.
Projections
Categorical products come equipped with projections \(\pi_{1}\) and \(\pi_{2}\), which extract the first and second components of a pair respectively:
Source: Wikimedia Commons
Since we’re primarily thinking about lists, I’m going to use the names head
and tail
instead:
head :: Product h t -> h
head (Product h _) = h
tail :: Product h t -> t
tail (Product _ t) = t
λ> head ('a' >< 'b' >< 'c')
'a'
λ> tail ('a' >< 'b' >< 'c')
('b' >< 'c')
A well-formed list should terminate with Unit
.
We can still extract a head
and a tail
from a singleton list:
λ> head ('a' >< Unit)
'a'
λ> tail ('a' >< Unit)
()
For Fork’s Sake
I’m going to bastardise the product diagram so you can see where we’re up to:
We can represent a list as a product of a head and a tail.
We can extract the head of a list with the head
function, and the tail of a list with the tail
function.
What we haven’t touched upon yet is the magic triangle in the middle, which we call fork
.
For each object \(A\) and each pair of morphisms \(f : A \to H\) and \(g : A \to T\), there’s a unique morphism \(f \vartriangle g : A \to H \times T\).
I hope the naming convention makes it clear: fork
, in this context, represents list-forming functions.
Any list-forming function is a fork
.
So how do we express this in code?
fork :: (a -> h) -> (a -> t) -> a -> Product h t
fork f g a = f a >< g a
fork
is a higher-order function that takes as arguments:
A function
f
, which maps ana
to a headh
.A function
g
, which maps ana
to a tailt
.And an
a
.
Now you’re probably wondering what exactly a
could be.
In a simple case, we could pass two functions from Int
to Int
, and an integer a
:
λ> fork (+ 10) (* 3) 10
(20 >< 30)
And get back a Product
\(Int \times Int\).
But things gets really interesting when a
is itself a Product
.
fork
is a combinator that can take a list as input, and produce a list as output.
We can use it to construct everything else we need for our stack machine.
But before we do, let’s define an infix operator for fork
:
(&&&) = fork
Which I think reads wonderfully:
list = head &&& tail
A list is a head and a tail.
Push and Pop
Let’s define the two essential stack operations: push
and pop
.
pop
removes the top element of the stack, so it’s synonymous with tail
:
pop = tail
λ> pop ('a' >< 'b' >< Unit)
('b' >< ())
For push
, we could partially apply the ><
operator:
push a = (a ><)
But I did say everything else could be defined in terms of fork
, so let’s whet our appetite.
First of all, it’s helpful to think of stack functions as being polymorphic over the tail of the stack.
push
takes any a
, and any stack z
, and returns a >< z
.
We know that the fork
combinator gives us back a product h >< t
, so the head must be the value we’re pushing, and the tail must be the input stack.
push a = const a &&& id
You can read push
as forming a list where the head is the constant a
, and the tail is identical to the original list.
The const
function in Haskell relates to the classical K combinator: K x y = x
.
It ignores the second argument and returns the first.
In this case, it ignores the stack and returns the value a
.
λ> const 'a' ('b' >< Unit)
'a'
id
is the identity function (the classical I combinator), which is used here to preserve the input stack.
We can use push
as follows:
λ> push 'a' ('b' >< Unit)
('a' >< ('b' >< ()))
And of course, pop
after push
should leave the stack unchanged:
λ> pop . push 'a' $ ('b' >< Unit)
('b' >< ())
The Stack Combinator
A list is a head and a tail.
We will often want to transform it as such; applying one function to the head, and another to the tail.
The product map (***
) does exactly that:
(***) :: (a -> c) -> (b -> d) -> Product a b -> Product c d
(***) f g = (f . head) &&& (g . tail)
Notice that we haven’t introduced anything new - ***
is defined in terms of fork
.
A stack is a special type of list that can only be modified at one end.
While a list admits a pair of functions that may transform both the head and the tail, a stack function can only operate on the head of the stack.
To encode this property, we define a stack
combinator:
stack :: (a -> b) -> Product a z -> Product b z
stack f = f *** id
Which applies a function f
to the head of the stack, and can only preserve the tail.
The type system guarantees that the tail cannot change.
We can now apply unary functions with ease:
λ> stack (+ 10) (1 >< 2 >< Unit)
(11 >< (2 >< ()))
Binary Operations
Binary operations are a little bit trickier.
Traditionally, to add two numbers on a stack, you pop
two values and push
their sum.
In languages like Forth, you’ll often find an instruction called pop2
, which we can easily replicate:
pop2 = pop . pop
But I find it easier to think about this as taking the head of a stack, and the head of its tail.
I’ll use the following function names to make things more concise:
-- head of head, head of tail, ...
hoh = head . head
hot = head . tail
tot = tail . tail
toh = tail . head
To extract a pair of values from the stack, we need a function \(pair : (a \times (b \times c)) \to a \times b\).
Which we can define as:
pair = head &&& hot
λ> pair ('a' >< 'b' >< 'c' >< Unit)
('a' >< 'b')
How do we apply a binary operation to a pair?
To avoid having to define our own operators, let’s create a higher-order function that takes a curried Haskell function, and returns one that operates on an uncurried Product
:
uncurry :: (a -> b -> c) -> Product a b -> c
uncurry f (Product a b) = f a b
λ> uncurry (+) (1 >< 2)
3
The final step is to be able to apply a function to a pair on a stack, while preserving everything beneath it.
Remembering that pair
takes the head and the head of the tail:
pair = head &&& hot
We know the part that needs to be preserved is the tail of the tail.
Piecing it together, we have a disassociator function \((a \times (b \times c)) \to ((a \times b) \times c)\):
disassoc = pair &&& tot
And can now write a higher-order function called bop
:
bop f = stack (uncurry f) . disassoc
Which pairs the top two elements of the stack, and then applies an uncurried binary operation to that pair.
λ> bop (+) (1 >< 2 >< Unit)
(3 >< ())
Perfect!
Dup and Swap
Every self-respecting stack language has dup
and swap
:
$$\begin{align} dup &: a \to a \times a \\ swap &: a \times b \to b \times a \end{align}$$
The former duplicates the top element, and the latter swaps the order of the top two elements.
dup
is quite interesting because, like push
, it preserves the entire input stack.
We can define it as:
dup = head &&& id
λ> dup ('a' >< 'b' >< Unit)
('a' >< ('a' >< ('b' >< ())))
swap
on a pair could be defined as:
swap = tail &&& head
But on a stack it’s slightly more complicated.
We want to pair up the top two elements, swap them, and then unpair or flatten the stack again.
For this we need an associator function \(((a \times b) \times c) \to (a \times (b \times c))\), which is the inverse of disassoc
.
pair = head &&& hot
unpair = toh &&& tail
assoc = hoh &&& unpair
disassoc = pair &&& tot
Naturally, assoc
after disassoc
should leave the stack unchanged:
λ> assoc . disassoc $ ('a' >< 'b' >< 'c' >< Unit)
('a' >< ('b' >< ('c' >< ())))
We can now define swap
as:
swap = assoc . stack (tail &&& head) . disassoc
λ> swap ('a' >< 'b' >< Unit)
('b' >< ('a' >< ()))
Credits & Further Reading
The ideas in this article are inspired by the following works:
Calculating Functional Programs, by Jeremy Gibbons
Compiling to Categories and Calculating Compilers Categorically, by Conal Elliott
The Theory of Concatenative Combinators, by Brent Kerby
The Mathematical Foundations of Joy and The Algebra of Joy, by Manfred Von Thun
The core ideas pop up in several places under different names.
If you’re interested to see some other applications, have a look at the Bifunctor library and writings on Arrows.
Subscribe to my newsletter
Read articles from Alexander Codes directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by

Alexander Codes
Alexander Codes
Data Engineer ⌨️ | Pythonista 🐍 | Blogger 📖 You can support my work at https://app.nf.domains/name/alexandercodes.algo