import marimo as mo
import polars as pl
from curryparty import L, o
Curryparty: a pinch of Lambda calculus
You're tired of Turing Machines ? Your functional programming friends make fun of you because you don't know what a lambda is ?

What is this ?

This is a personal project I created from scratch, to fulfil the very specific mission of teaching lambda-calculus in an interactive, visual way. The source code of the project is freely available on github, feel free to give me a star: https://github.com/rambip/curryparty

Come with me, we're going on an adventure !

Prerequisites
To appreciate this journey into the world of pure functions, you need to have some prior knowledge of what a function is, or at least what it can do. I don't have a great resource for that. If you don't know what a function is, go learn some python (or any programming language of your choice that is at least 5 characters long; that's a good rule of thumb to avoid imperative and overcomplex languages: C, C++, Java, Rust). I also recommend:
- reading the Church-Turing thesis article in the Stanford Encyclopedia of Philosophy
- watching this video on The boundaries of computation
Now that you're introduced to the subject, let's go !
A world of functions
Let's start with some rules:
- No loop, no if / then / else
- You can only use "0" and "1", and the only thing you can do on strings is concatenation: "a" + "b" to get "ab"
- No lists, no numbers, no operations (addition, subtraction, multiplication and so on)
- You cannot communicate with the external world inside the functions. You can only use "print" at the very end of your code.
- You can use functions.
"0" + "1"
# Not very interesting ...
'01'
Challenge 0
Print 0011 without using + more than 2 times.
Solution
def p0(x):
# add "0" before
return "0" + x
def p1(x):
# add "1" before
return "1" + x
p0(p0(p1(p1(""))))
This trick can seem strange, but you can do quite powerful things with it.
Here is a new trick: passing functions as arguments. Here is one example to show what it allows you to do:
def p0(x):
return "0" + x
def p1(x):
return "1" + x
# second trick: pass functions instead of values
def four_times(f):
return f(f(f(f(""))))
print(four_times(p0))
print(four_times(p1))
0000
1111
Challenge 1
Write a function boo that returns either 0 or 1, depending on the argument. Remember, no "if" and no arithmetic !
Solution
def true(a, b):
# left argument
return a
def false(a, b):
# right argument
return b
def boo(side):
    return side("1", "0")

boo(false)  # '0'
boo(true)   # '1'

We could have called these functions left and right. So why did I call them true and false ?

There is a good reason for that. Look at this mind-bending magic:

def true(a, b):
    return a

def false(a, b):
    return b

def boo(side):
    return side("1", "0")

def logical_not(x):
    # swap the roles: true becomes false and vice versa
    return x(false, true)

print(boo(logical_not(false)))
print(boo(logical_not(true)))

1
0

Challenge 2

Write a logical_or and a logical_and using the same structure as above.

Solution

def logical_or(a, b):
    return a(a, b)

def logical_and(a, b):
    return b(a, b)

print(
    [
        boo(logical_or(false, false)),
        boo(logical_or(false, true)),
        boo(logical_or(true, false)),
        boo(logical_or(true, true)),
    ]
)

['0', '1', '1', '1']

print(
    [
        boo(logical_and(false, false)),
        boo(logical_and(false, true)),
        boo(logical_and(true, false)),
        boo(logical_and(true, true)),
    ]
)

['0', '0', '0', '1']
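As a variation on the same structure (my own addition, not part of the tutorial): a logical_xor. I define a fresh `negate` helper here so the sketch is self-contained.

```python
def true(a, b):
    # selects the left argument
    return a

def false(a, b):
    # selects the right argument
    return b

def negate(x):
    # swap the two selectors: true becomes false and vice versa
    return x(false, true)

def logical_xor(a, b):
    # if a is true, the answer is "not b"; otherwise it is b
    return a(negate(b), b)

logical_xor(true, false) is true  # the result is the `true` function itself
```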
Challenge 3
The "church" functions are defined in this way:
def one(f, x):
return f(x)
def two(f, x):
return f(f(x))
def three(f, x):
return f(f(f(x)))
...
Write a function product(a, b) that takes as input two "church number functions" and returns as many ones as the product of a and b.
Hint: you can use this function:
def n_ones(n):
def result(x):
return n(p1, x)
return result
Solution
def three(f, x):
return f(f(f(x)))
def four(f, x):
return f(f(f(f(x))))
def n_ones(n):
def result(x):
return n(p1, x)
return result
def product(a, b):
    return a(n_ones(b), "")

product(three, four)
# result: '111111111111'
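With the same ingredients, addition is a nice follow-up exercise: start from the ones produced by b, then let a prepend its own. This is my own sketch (the name `add` is not from the challenge):

```python
def p1(x):
    # prepend a "1"
    return "1" + x

def three(f, x):
    return f(f(f(x)))

def four(f, x):
    return f(f(f(f(x))))

def add(a, b):
    # b writes its ones first, then a keeps going from there
    return a(p1, b(p1, ""))

print(add(three, four))  # '1111111' (3 + 4 = 7 ones)
```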
If you're not used to functional programming, you may have noticed something strange in the "n_ones" function:

def n_ones(n):
    def result(x):
        return n(p1, x)
    return result

We created a function on the fly, used some of the current context inside it (in this case, the variable n), and returned it immediately. We gave it a name, but this name is completely arbitrary. We could have written it like this:

def n_ones(n):
    return lambda x: n(p1, x)

What is this "lambda" keyword doing here ? Well, it allows us to do exactly what we wanted: to create a function on the fly and to return it immediately. That's the core idea of Lambda-calculus.

I hope you start to have an intuition of what lambda functions are, and what they allow you to do. The next step is to define more formally what these "lambdas" are.

Personally, the first time I learned about lambda calculus was in this video from Computerphile. Highly recommend it.
Terms, lambdas and applications
What we call a "Lambda-function", or \(\lambda\)-function, is a mathematical description of the functions we played with in the last part.
A lambda-function is "something that takes variables" and "returns something".
Let's create one:
L("x").o("x")
Here is how to read it:

- the blue block means "take a value"
- the gray line means "pass the value"
- the red block means "return the value"

This machine takes x and returns x. In python, this would be:

def _lambda(x):
    return x

But a lambda can have multiple variables:

return_first = L("x", "y", "z").o("x")
return_second = L("x", "y", "z").o("y")
return_third = L("x", "y", "z").o("z")
mo.hstack([return_first, return_second, return_third])
Let's focus on the first diagram. It has multiple blue blocks, which means we take multiple things. To track them, let's give them names:

- the first blue block (from the top) takes x
- the second blue block takes y
- the third blue block takes z

From top to bottom, these functions are:

lambda x, y, z: x
lambda x, y, z: y
lambda x, y, z: z

But there is another way to think about the machines we created. Let's focus on the last one. It's like a machine that takes x, and returns a new machine. This machine takes y, and returns yet another machine. This last machine takes z, and returns z.

You may find it confusing to think about the lambda-functions we created in this way, but it's worth it. You can see the exact same phenomenon in python:

def add(a, b):
    return a + b

def add_curried(a):
    def add_to_a(b):
        return a + b
    return add_to_a

# or equivalently
def add_curried(a):
    return lambda b: a + b

add_curried(3)(4)

This process of transforming a function with 2 arguments into a function that returns a new function has a name: currying (in honor of Haskell Curry, another big name in the world of lambda-calculus).

So we can take stuff and return stuff. We can't do much yet. We're missing something: applying (or calling) functions. Let's see that in action:
L("x", "f").o("f", "x")
Let's unpack it.
This lambda function takes a value x (first blue block), then a function f (second blue block), uses the function f (left gray line) and passes the value x (right gray line) to it (black horizontal line).
For readability, the function that is called is under a black dot, with a yellow border.
That's it ! With these 3 ingredients (blocks that take arguments, blocks that return arguments, blocks that apply functions), we can make any function we want. Before we go on, some terminology:
- a "Lambda", or \(\lambda\), is a blue block. It corresponds to a single variable.
- a "Variable" is a red block. It is "bound" to a specific lambda, indicated by a gray line.
- an "Application" is a horizontal black line. The function is the stuff inside the yellow border and the argument is at its right.
- a "Term" is any combination of the above. If the thing at the top is a Lambda, it's called a lambda-function.

Let's define the \(\lambda\)-terms we already saw in the python challenges:
do_nothing = L("x").o("x")
do_nothing # it's also called the "identity" function
l_false = L("a", "b").o("a") # takes the left (first) argument
l_true = L("a", "b").o("b") # takes the right (second) argument
l_false
l_true
Remember the "Church" functions from the python challenges ? The concept comes from \(\lambda\)-calculus; they are often called "Church numerals" (Church was the main creator of \(\lambda\)-calculus).
As a reminder:
Look at the graphical representation below (use arrows to navigate). It's the same structure each time:
zero = lambda f, x: x
one = lambda f, x: f(x)
two = lambda f, x: f(f(x))
...
- take f (blue block)
- take x (blue block)
- pass x through the function f a given number of times
zero = L("f", "x").o("x")
one = L("f", "x").o("f", "x")
two = L("f", "x").o("f", o("f", "x"))
three = L("f", "x").o("f", o("f", o("f", "x")))
four = L("f", "x").o("f", o("f", o("f", o("f", "x"))))
mo.hstack([zero, one, two, three])
Let's try something harder.
s_myst = L("n", "m", "f", "x").o("n", o("m", "f"), "x")
s_myst
This starts to look a bit abstract, but with a bit of practice, these diagrams can be read pretty easily. As an exercise, let's write the corresponding python function, just by looking at the diagram.

The first step is to create 4 lambdas:

mysterious_function = lambda x1: lambda x2: lambda x3: lambda x4: ...

Then, the rule of thumb is: read bottom to top. By looking at the 2 red squares at the right of the last line, we see that x2 is called with the argument x3, and then x1 is called on the result. Let's call this expression \(\alpha = x_1(x_2(x_3))\):

mysterious_function = lambda x1: lambda x2: lambda x3: lambda x4: x1((x2(x3))) ...

The last step is to call \(\alpha\) with the argument x4. That is what the yellow border without a background means.

mysterious_function = lambda x1: lambda x2: lambda x3: lambda x4: x1((x2(x3)))(x4)

In a textbook, this function would be written as:

\[
\lambda x_1. \lambda x_2. \lambda x_3. \lambda x_4. \quad x_1 (x_2 \; x_3) x_4
\]

Exact same idea, except we place parentheses differently for readability.

Below is an illustration. The "Application" nodes are represented with the \(\circ\) symbol.
Aside: composing and forwarding

If we take a step back, we see that there are 2 ways to assemble a sequence of variables: composing and forwarding. In python, you can think about them this way:

def compose(a, b, c, d, e, x):
    return a(b(c(d(e(x)))))

def forward(a, b, c, d, e, x):
    return a(b, c, d, e, x)
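A minimal runnable sketch of the two patterns, shortened to three functions (the helper names `double`, `inc` and `pick_and_apply` are mine):

```python
def compose(f, g, h, x):
    # thread x through each function in turn
    return f(g(h(x)))

def forward(f, g, h, x):
    # hand g, h and x directly to f as plain arguments
    return f(g, h, x)

double = lambda n: 2 * n
inc = lambda n: n + 1

print(compose(inc, double, inc, 3))  # inc(double(inc(3))) = 9

def pick_and_apply(g, h, x):
    return g(h(x))

print(forward(pick_and_apply, double, inc, 3))  # double(inc(3)) = 8
```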
Composition
Forwarding
Now in our lambda calculus block representation:
compose = L("f", "g", "h", "x").o("f", o("g", o("h", "x")))
forward = L("f", "g", "h", "x").o("f", "g", "h", "x")
compose
forward
Cool right ?
The mysterious function

Let's go back to our mysterious function:

s_myst
It turns out that this function implements the multiplication of church numbers.
Let's try it out:
s_myst(three)(four)
Wait, what is this monstrosity ?

Well, it's a term. It does not have a \(\lambda\) at the top, so it's clearly not a lambda function. It's still a term, a term that bundles together s_myst, three and four.

But this is not what we want. We want to know the result of this operation. Right now, we can't: we did not specify how this term is supposed to be transformed. We need a way to run it, like in python: a semantics.

Don't worry, I have implemented it. Let's jump directly to the result:

s_myst(three)(four).reduce()

And indeed, we get 12.
But you may wonder (and you should): how did we go from the monstrosity to the nice "12" result ?
I spent weeks animating this stuff, so I hope you will enjoy it (even if you don't fully understand it, at least for now).
Click on the left and right arrow to navigate, click on the figure to animate.
mo.carousel([x.show_beta() or x for x in s_myst(three)(four).reduction_chain()])
click to animate, move away and back to reset
The beta reduction
As one dives deeper into the formalism and the internals of lambda-calculus, it's easy to get lost.
To make it easier, it's fundamental to hold on to what we already know. And what do we already know ? How python functions work ! Or at least, we can have a pretty good idea.
So let's take an example, and think about how python would calculate it.
Here is the term we will analyse:
Here is the term we will analyse:

\[
\lambda n. \lambda f. \lambda x. \; f (n \; f \; x)
\]

If you want, click below to know what this function computes. If you prefer trying to guess it as we detail each step of the calculation, don't.

Spoiler: what does this function compute ?

This function is the successor function. It takes a church numeral (e.g. one) and returns the next one (two).

s = L("n", "f", "x").o("f", o("n", "f", "x"))
stuff_to_compute = s(one)
stuff_to_compute
and here is the python version:
s_py = lambda n: lambda f: lambda x: f(n(f)(x))
one_py = lambda f: lambda x: f(x)
# TODO: compute `s_py(one_py)`
Challenge 4
Try to find what the first step is for python to compute s_py(one_py).
Solution
To compute s_py(one_py), the key is to replace the variable n in s_py.
Since we pass one_py as the argument, we must replace n by one_py.
Let's do it:
# before
lambda n: lambda f: lambda x: f(n(f)(x))
# after
lambda f: lambda x: f(( lambda f: lambda x: f(x) )(f)(x))
stuff_to_compute.show_beta()
click to animate, move away and back to reset
Now, we have a new expression. Let's rewrite it here:
\[
\lambda f \lambda x. f\big((\lambda f \lambda x. f(x)) \; f \; x\big)
\]

If you prefer the python version:
There is something spooky going on here: the expression contains lambda f twice and lambda x twice. We can apply a trick: renaming.

lambda f: lambda x: f((lambda f: lambda x: f(x))(f)(x))

becomes

lambda f: lambda x: f((lambda f1: lambda x1: f1(x1))(f)(x))

It's easy to get lost and to rename the wrong thing. Happily, our graphical representation does not have this problem. Instead of having to "rename", we have to "reconnect" the variables to the right block.

Aside: renaming

Let's consider this expression:

_a = L("f").o("f", "f", "f")
_b = L("x", "y").o("x")
_a(_b)

\[
\big(\lambda f. f(f(f))\big)(\blue{\lambda x \lambda y .x})
\]

When we apply the first operation, we get this (quite long) expression:

\[
(\blue{\lambda x \lambda y .x})\big((\blue{\lambda x \lambda y .x})(\blue{\lambda x \lambda y .x})\big)
\]

But \(x\) and \(y\) don't have the same role in the first group, in the second group and in the third group. So we could rename them, just like in the previous example:

\[
(\blue{\lambda x_1 \lambda y_1 .x_1})\big((\blue{\lambda x_2 \lambda y_2 .x_2})(\blue{\lambda x_3 \lambda y_3 .x_3})\big)
\]

Now compare this with the graphical representation: we don't have to rename anything, because the blue blocks (corresponding to \(\lambda x \lambda y\)) are duplicated, and the new red blocks (the variables) are connected to the right blue blocks.
Challenge 5
Let's go back to our expression. After the first step, we arrived at:

lambda f: lambda x: f((lambda f1: lambda x1: f1(x1))(f)(x))

# it's a bit hard to read, so let's rewrite it:
def lambda_(f):
    def lambda_(x):
        inside_block = lambda f1: lambda x1: f1(x1)
        return f(inside_block(f)(x))
    return lambda_

Keep reducing: what is the final form of this expression ?
Solution:
lambda f: lambda x: f(f(x))
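We can sanity-check this result in python by comparing behaviours: apply both sides to a concrete function and argument (the helper names are mine):

```python
s_py = lambda n: lambda f: lambda x: f(n(f)(x))
one_py = lambda f: lambda x: f(x)
two_py = lambda f: lambda x: f(f(x))

inc = lambda k: k + 1

result = s_py(one_py)
# both apply `inc` twice to 0
print(result(inc)(0))  # 2
print(two_py(inc)(0))  # 2
```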
mo.carousel([x.show_beta() or x for x in stuff_to_compute.beta().reduction_chain()])
click to animate, move away and back to reset
Congratulations ! You just understood how lambda calculus works.
Let's give a name to the operation used to go from one term to the next: beta-reduction.
We note it like this:
\[
\big(\green{\lambda n} \lambda f \lambda x. f(\green{n} f x) \big)(\blue{\lambda f \lambda x .f(x)})
\rightarrow_{\beta}
\lambda f \lambda x. f\big((\blue{\lambda f \lambda x. f(x)}) \; f \; x\big)
\]

To discuss beta-reduction even more precisely, we need to introduce a little more terminology.
Let's take this expression:
\[
\big(\green{\lambda n} \lambda f \lambda x. f(\green{n} f x) \big)(\blue{\lambda f \lambda x .f(x)})
\]

The entire formula is called a redex (the stuff we want to reduce). The \(\green{\lambda n}\) is the root-lambda of the redex, and \(\blue{\lambda f \lambda x. f(x)}\) is its argument.
Each \(\green{n}\) appearing in the term is an occurrence "bound to the root-lambda".
To recap, the algorithm is:
- find a redex of the form \((\green{\lambda x}. E)(\blue{A})\)
- replace every occurrence of \(\green{x}\) in \(E\) that is bound to the root lambda with a copy of \(\blue{A}\)
- each time you replace, rename all the variables that you need to rename (*)
- remove \(\green{\lambda x}\) and \(\blue{A}\)
(*) Note: I did not detail all the cases for the renaming step. It's subtle: you have to rename a variable only if it's bound to the root lambda of the argument \(\blue{A}\), because you don't want to change variables that appear higher in the tree.

There is an important thing to note here: I said find a redex and not take the redex. We have already seen this in action, but you may not have paid attention. Look back at:
stuff_to_compute.beta().beta()
In the graphical representation, the redexes appear in blue with a yellow border. (this makes sense: it's a lambda (blue) with an argument (yellow)).
Here, the redex is not at the top of the expression, it's somewhere inside.
But it can be even worse, for example:
_e = L("y").o(L("x").o("x"), "y")
L("f").o("f", _e, _e)
Here, we have 2 potential redexes to choose from for our beta-reduction. Which one do we choose ?
It does not matter. Well, it does matter, but ... it does not really matter.
Let me introduce: termination and the Church-Rosser theorem.
Termination and the Church-Rosser theorem
All the terms I used from the beginning were really nice: they simplified (more precisely, beta-reduced) until they reached a very simple form: a form with no potential redex. Is it always the case ? It might surprise you, but no. There is even a simple expression that never stops reducing. Here it is:

Challenge 6: Try to reduce this expression, and see what happens.
Solution
Now for the animated version:
omega = L("f").o("f", "f")
omega(omega).show_beta()
click to animate, move away and back to reset
And it's not just something funny in our rules: the exact same thing happens in python (click on the button to run):
if bomb_button.value:
(lambda f: f(f))(lambda g: g(g))
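Python evaluates arguments eagerly, so this self-application does not loop forever politely: it blows the call stack. A small sketch to see it safely:

```python
omega = lambda f: f(f)

try:
    omega(omega)
except RecursionError:
    # python hit its recursion limit: the "reduction" never terminates
    print("no normal form here")
```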
Time for a recap:

- A lambda-term can have zero, one or multiple "redexes" of the form \((\green{\lambda x}. E)(\blue{A})\)
  - if it has zero redexes, we can't reduce it: we're done. Let's call it the "final form"
- We can "beta-reduce" a term:
  - start by choosing a redex if there are multiple
  - apply the procedure to transform the redex. You get a new term
- Some terms are "nice". A "nice" term:
  - has a single redex
  - reduces to a term that has a single redex, and we can keep reducing ...
  - reaches a "final form" if we reduce enough times
- This means that:
  - some terms are not nice because they have multiple redexes
  - some terms are not nice because they never reduce to a final form.
Church-Rosser theorem
The Church-Rosser theorem states that if you take a term and start applying beta-reductions in two different ways, you can't end up with completely different results. More precisely, if Alice applies some number of reductions until she reaches A, and Bob applies some other number of reductions until he reaches B, there is a way to keep applying reductions to A and B until they become exactly the same.

It DOES NOT mean that you can pick the beta-reductions at random. It's possible that Alice is smart and finds a way to reduce to a final form, while Bob is dumb and keeps reducing forever, because he chose the wrong redex each time. But at each step, if Bob becomes smart, he can still reach a final form.

It DOES mean that there is only one possible final form. We call it the normal form.

Below is an example where the choice of redex matters.

Challenge 7
Find a strategy to reduce to a normal form, and a strategy to keep reducing forever.

l_true(omega(omega))(one)
Solution
If you start by reducing the outer redex two times, you will drop the big expression in the middle.

The leftmost-outermost reduction is normalizing

This looks scary, but it is simpler than the previous theorem. It states that the smart way to reduce is to take the first redex ("leftmost-outermost" is a fancy way to say "the first lambda you see"). If you use this reduction each time, you are guaranteed to reach the final form IF IT EXISTS. This is the reduction I implemented in the graphical animations on this page. For example, we can see that the reduction of the previous expression (challenge 7) reaches the final form:

mo.carousel(
[x.show_beta() or x for x in l_true(omega(omega))(one).reduction_chain()]
)
click to animate, move away and back to reset
Turing-completeness and the halting problem
Now for the really cool stuff. How do we know if a term will reach its normal form (in this case, we say the reduction "terminates") ? Well, we can't ! The only way is to keep applying the leftmost-outermost reduction and see if it ever stops: this is the halting problem in disguise.

Bonus content
This part is for the lambda-calculus hipsters.

pred = L("n", "f", "x").o(
"n",
L("g", "h").o("h", o("g", "f")),
L("u").o("x"),
L("u").o("u"),
)
pred
pred(two).reduce()
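For comparison, here is (to the best of my reading of the term above) the same pred trick written with curried python lambdas; the numerals and the to_int helper are my own test scaffolding:

```python
# church numerals, curried
zero = lambda f: lambda x: x
one = lambda f: lambda x: f(x)
two = lambda f: lambda x: f(f(x))

# same shape as the term above: n (λg λh. h (g f)) (λu. x) (λu. u)
pred = lambda n: lambda f: lambda x: n(lambda g: lambda h: h(g(f)))(lambda u: x)(lambda u: u)

inc = lambda k: k + 1
to_int = lambda n: n(inc)(0)

print(to_int(pred(two)))   # 1
print(to_int(pred(zero)))  # 0 (pred of zero stays zero)
```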
pair = L("a", "b", "x").o("x", "a", "b")
first = L("x").o("x", l_false)
second = L("x").o("x", l_true)
pair
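Here is a python sketch of the same pairs, with explicit selector helpers of my own (so the sketch does not depend on the true/false naming above):

```python
pair = lambda a: lambda b: lambda x: x(a)(b)

select_left = lambda a: lambda b: a
select_right = lambda a: lambda b: b

first = lambda p: p(select_left)
second = lambda p: p(select_right)

p = pair("hello")("world")
print(first(p))   # hello
print(second(p))  # world
```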
sum = L("n", "m", "f", "x").o(o("n", "f"), o("m", "f", "x"))
sum(four)(three).reduce()
fib_pair = (
L("p").o(pair, o(sum, o(first, "p"), o(second, "p")), o(first, "p"))
).reduce()
fib_pair
five = succ(four)
five(fib_pair)(pair(one)(zero)).reduce()
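The fib_pair trick can be sketched in python; I cheat and use python integers instead of church numerals, to keep the focus on the pair shuffling:

```python
pair = lambda a: lambda b: lambda x: x(a)(b)
fst = lambda p: p(lambda a: lambda b: a)
snd = lambda p: p(lambda a: lambda b: b)

# one fibonacci step: (a, b) -> (a + b, a)
fib_step = lambda p: pair(fst(p) + snd(p))(fst(p))

p = pair(1)(0)
for _ in range(5):
    p = fib_step(p)
print(fst(p), snd(p))  # 8 5
```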
fact = L("n", "f").o(
"n",
L("f", "n").o("n", o("f", L("f", "x").o("n", "f", o("f", "x")))),
L("x").o("f"),
L("x").o("x"),
)
fact
fact(four).reduce()
# the Y combinator
y = L("f").o(
L("g").o("f", o("g", "g")),
L("g").o("f", o("g", "g")),
)
y
mo.carousel([x for x in fact(three).reduction_chain()])
div2 = (
L("n")
.o(
"n",
L("p").o(
o(second, "p"),
o(pair, o(first, "p"), l_true),
o(pair, o(succ, o(first, "p")), l_false),
),
o(pair, zero, l_false),
)
.reduce()
)
div2
# divide three by 2 in 51 steps
mo.carousel([x for x in div2(three).reduction_chain()])
pack = L("f", "p").o("f", o(first, "p"), o(second, "p")).reduce()
unpack = L("f", "a", "b").o("f", o(pair, "a", "b"))
pack
len(list((div2(succ(succ((four))).reduce()).reduction_chain())))
154
div2(four).reduce()
mult = L("n", "m", "f", "x").o("n", o("m", "f"), "x")
succ = L("n", "f", "x").o("f", o("n", "f", "x"))
is_not_zero = L("n").o("n", L("f").o(l_true), l_false).reduce()
syracuse = y(
L("F", "n").o(
o(
L("half_and_reminder").o(
o(is_not_zero, o(first, "half_and_reminder")),
one,
o(
pair,
"n",
o(
"F",
o(
o(second, "half_and_reminder"),
# even: n / 2
o(first, "half_and_reminder"),
# odd: 3 * n + 1
o(succ, o(mult, three, "n")),
),
),
),
),
o(div2, "n"),
)
)
)
syracuse
# run at your own risk ...
# syracuse(five).reduce()