Day 1 of PL
Sun, Feb 25 2024 17:53:57 -0500

The very basics of lambda calculus

This was originally written some time in late December 2023, but edited February 25, 2024.

There are two main things you can think about when people talk about the overlap between linguistics and computer science. The first is unfortunately the much more popular field of large language models (I say this half-jokingly).

The second is programming language theory, full of stuff with esoteric names like context-free grammars, regular expressions, and abstract syntax trees. Also, honorable mention goes to the disturbingly-named pumping lemma.

Unfortunately, I’ve been having a bit of trouble finding a standardized college syllabus for programming languages. Each school seems to teach it in its own way, sometimes introducing a specific programming language as a backbone (like OCaml at the University of Illinois Urbana-Champaign) or keeping things more conceptual before introducing several different languages (like my school does).

So, to write these posts, I’m pulling material from all over different schools and different textbooks. I will of course be crediting them with some detail at the end of each post. When I’m done, I will probably compile a bibliography that contains every source plus extra reading material.

Disclaimer: you won’t need a math background for this. If you know what a function is, that’s basically enough.

…Ok, a function is a mapping from a set 1 to a set 2 where every element of set 1 is sent to exactly one element of set 2. A mapping like this can have two extra properties: every element of set 2 gets mapped to (nothing in set 2 goes untouched, so to speak), and every element of set 2 gets mapped to at most once (nothing is touched more than once). These two traits are formally called surjectivity and injectivity, respectively.
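To make this concrete, here is a tiny sketch in Python. A finite function can be modeled as a dict, and the two properties can be checked directly (the sets and names here are my own illustration):

```python
# A finite function as a dict: keys are the domain, values land in the codomain.
f = {1: "a", 2: "b", 3: "a"}  # maps the set {1, 2, 3} into the set {"a", "b"}

domain = set(f.keys())
codomain = {"a", "b"}
image = set(f.values())  # the outputs that actually get hit

# Surjective: every element of the codomain is hit at least once.
is_surjective = image == codomain

# Injective: no element of the codomain is hit more than once.
is_injective = len(image) == len(domain)

print(is_surjective)  # True: both "a" and "b" are outputs
print(is_injective)   # False: 1 and 3 both map to "a"
```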

What even is lambda-calculus?

Despite its name, it has nothing to do with differential/integral calculus.1 Its most idiosyncratic trait is the lambda notation, which is basically another way to write out a mathematical function, like y = x + 1.

We define a function to be pretty much what you learn in high school, where:

  1. Some input is mapped to some output
  2. A given input can only have one output

Careful, though: these aren’t the surjectivity and injectivity from earlier. Rule 2 is just what the vertical line test checks: each input gets at most one output. Pretty tame so far.

A function you might write as f(x) = x + y or f : x -> x + y is written in lambda calculus as f = λx.x+y. (Note that y here is free: the λ only binds x.)

We can have multi-variate functions as well, like f(x,y) = x+y. This is written in lambda calculus as f = λx.λy.x+y but for brevity, it is almost always abbreviated to f = λxy.x+y.
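That nesting trick, turning a two-argument function into a function that returns a function, is called currying, and Python lambdas can express it directly (the names `add_pair`, `add`, and `add2` are my own):

```python
# Uncurried: one function of two arguments, like f(x, y) = x + y
add_pair = lambda x, y: x + y

# Curried: λx.λy.x+y, a function that returns a function
add = lambda x: lambda y: x + y

print(add_pair(2, 3))  # 5
print(add(2)(3))       # 5, applying one argument at a time
add2 = add(2)          # partial application: this is λy.2+y
print(add2(40))        # 42
```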

As we explore this new type of notation, we should keep a few rules in mind.

  1. Left associativity: application groups from left to right, e.g. if we have operands a, b, c and operator ⨁, the statement a ⨁ b ⨁ c will be evaluated as if it were written (a ⨁ b) ⨁ c. In particular, function application f a b means (f a) b.
  2. Parentheses can be removed if they only contain one “thing”
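Rule 1 is exactly what makes the abbreviated curried form work: f a b groups as (f a) b. A quick sketch in Python (the function `f` here is my own illustration):

```python
# A curried function of two arguments: λx.λy. 10x + y
f = lambda x: lambda y: x * 10 + y

# "f a b" in lambda calculus groups as "(f a) b":
explicit = (f(1))(2)
implicit = f(1)(2)  # Python's call syntax is left-associative too
print(explicit, implicit)  # 12 12
```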

You might be thinking, what’s the point of lambda notation when the f(x) notation is way cleaner and more intuitive? To which my answer is: is it really, truly cleaner?

Lambda notation’s power lies not in writing functions that just take a variable as an input. It lies in writing higher-order functions. That is, functions that take in other functions as input. It’s going to be a lot simpler than trying to parse f(g(h(i(j(x))))) on the fly.
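Function composition is the classic higher-order function: it takes two functions and returns a third. A minimal sketch in Python (the names `compose`, `inc`, and `double` are my own):

```python
# Composition as a higher-order function: compose(f, g) = λx. f(g(x))
compose = lambda f, g: lambda x: f(g(x))

inc = lambda x: x + 1
double = lambda x: x * 2

# Instead of spelling out double(inc(inc(x))) inline, build the pipeline once:
pipeline = compose(double, compose(inc, inc))
print(pipeline(5))  # double(inc(inc(5))) = double(7) = 14
```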

The beautiful core of functional languages arises from the notion that functions are just like any other value. We don’t treat them as special the way an imperative language like C does (C does let you pass function pointers around, but you can’t conjure up new functions at runtime or have them capture surrounding variables). It’s like an unconventional Swiss army knife. Don’t have for/while loops? No problem. With higher-order functions (and consequently, recursion), we won’t need ’em where we’re going.
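To see the “no loops needed” claim in action, here is summing a list two ways, once with recursion and once with the higher-order function `reduce` (a fold); the names `total` and `total2` are my own:

```python
from functools import reduce

# Summing a list without a for/while loop.

# With recursion: sum of [] is 0, otherwise head + sum of tail.
def total(xs):
    return 0 if not xs else xs[0] + total(xs[1:])

# With a fold: reduce threads an accumulator through the list.
total2 = lambda xs: reduce(lambda acc, x: acc + x, xs, 0)

print(total([1, 2, 3, 4]))   # 10
print(total2([1, 2, 3, 4]))  # 10
```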

Important reductions

  1. alpha-reduction
    • Sometimes called “alpha-conversion” or “alpha-renaming”. This is basically the notion of renaming bound variables so that when we evaluate abstractions, none of the variables clash.
  2. beta-reduction
    • Reflects the fact that every lambda-expression is, at its core, a search-and-replace command. That is, we take the inputs and replace the dummy variables in the body of the function with the values of the inputs.
  3. eta-reduction
    • A notion of equality between functions. Pretty intuitive: a function that does nothing but pass its argument along to f, written λx.(f x), is the same function as f. This captures extensionality: two functions are equal if they give the same output for every input.
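All three reductions can be sketched with Python lambdas (the names `f`, `g`, and `h` are my own illustrations):

```python
# Beta-reduction: substitute the argument for the bound variable.
#   (λx. x + 1) 2   →   2 + 1   →   3
print((lambda x: x + 1)(2))  # 3

# Alpha: the bound variable's name doesn't matter.
f = lambda x: x + 1
g = lambda z: z + 1  # same function after renaming x to z
print(f(10) == g(10))  # True

# Eta: λx.(f x) behaves identically to f itself.
h = lambda x: f(x)
print(all(f(n) == h(n) for n in range(5)))  # True
```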


  1. There are so many false friends like this in math. Like how a random variable in probability is neither random nor a variable… ugh. Don’t get me started on that.↩︎