> "A function is a black box that turns an input into exactly one output. Everything else is detail — but the detail is where the power is."
Prerequisites
- 8
Learning Objectives
- Define a function formally as a special relation, and state its domain, codomain, and range precisely.
- Compute the image of a set and the preimage of a set under a function, distinguishing preimage from inverse.
- Classify a function as injective, surjective, or bijective, and prove or disprove each property.
- Compose functions correctly, and determine when a function is invertible by proving it is a bijection.
- Apply the standard functions floor, ceiling, mod, exponential, and log, and connect functions to hashing, pure functions, and type signatures in code.
In This Chapter
- Overview
- Learning Paths
- 9.1 Functions as relations with a rule
- 9.2 Domain, codomain, and range
- 9.3 Injective, surjective, bijective
- 9.4 Composition and inverses
- 9.5 Important functions: floor, ceiling, mod, exponential, log
- 9.6 Functions in computer science: hashing, pure functions, type signatures
- Project Checkpoint: a finite-function module for the Toolkit
- Summary
- Spaced Review
- What's Next
Chapter 9: Functions
"A function is a black box that turns an input into exactly one output. Everything else is detail — but the detail is where the power is."
Overview
You have been writing functions since your very first program. def square(x): return x * x takes an
input and hands back an output, and you never wondered whether it was "well defined" — it just was.
That everyday confidence hides a precise mathematical object, and making it precise pays off
immediately. When you ask whether a hash function spreads keys evenly, whether a password-hashing
scheme can be reversed, whether two data formats carry the same information, or whether a type
signature can ever be satisfied, you are really asking questions about functions in the
mathematician's sense. This chapter gives you the vocabulary and the proof techniques to answer them.
A mathematical function is not a block of code; it is a rule of correspondence that assigns to each
input exactly one output. That single restriction — exactly one output per input — is what
separates a function from an arbitrary relation, and it is the reason dict lookups, array indexing,
and pure functions behave the way they do. Once we pin the definition down, three questions turn out to
matter again and again: does the function ever collide (map two inputs to one output)? does it reach
everything (hit every possible output)? and can it be undone? Those three questions are the properties
injective, surjective, and bijective, and they are the spine of this chapter.
The payoff is large and immediate. Bijections — perfect one-to-one pairings — are how we will compare the sizes of infinite sets in Chapter 10, and they are why some problems can have no program that solves them. Functions that cannot be inverted are the entire premise of cryptographic hashing. Functions that collide are why hash tables need collision handling at all. By the end of this chapter, "is this map injective?" will feel as concrete to you as "does this loop terminate?"
In this chapter, you will learn to:
- See a function as a special kind of relation — a set of ordered pairs obeying one rule — and pin down its domain, codomain, and range.
- Compute the image of a set and the preimage of a set, and avoid the classic confusion between a preimage and an inverse.
- Decide whether a function is injective, surjective, or bijective, and prove each claim rigorously.
- Compose functions in the right order, and prove that a function is invertible exactly when it is a bijection.
- Use floor, ceiling, mod, exponential, and logarithm fluently, and connect all of this to hash functions, pure functions, and the type signatures you read every day.
Learning Paths
🏃 Fast Track: If "injective / surjective / bijective" already trips off your tongue, skim 9.1–9.3 and spend your time on 9.4 (composition and invertibility — the one theorem most people get fuzzy on) and 9.6 (the CS payoff: hashing, pure functions, type signatures). Do the ⭐⭐ and ⭐⭐⭐ exercises and the Project Checkpoint.
📖 Standard Path: Read in order. Work every
🔄 Check Your Understandingbefore continuing, and try to state each property's definition from memory before you read ours. Prove the invertibility theorem yourself in 9.4 before reading the proof.🔬 Deep Dive: After the chapter, prove the finite "pigeonhole" theorem of 9.3 (for finite sets of equal size, injective $\Leftrightarrow$ surjective $\Leftrightarrow$ bijective) and then read ahead to see how Chapter 10 removes the equal-size hypothesis to compare infinite sets. The exercise set's Part E is built for this.
9.1 Functions as relations with a rule
Start with something you already trust: a lookup table.
| name | score |
|---|---|
"ana" |
91 |
"ben" |
88 |
"cy" |
91 |
This table assigns to each name exactly one score. Ask "what is "ben"'s score?" and there is one
answer, 88. Ask "what is "cy"'s score?" and there is one answer, 91 — even though "ana" also scores
91, which is allowed (two inputs, same output). What is not allowed is for "ben" to have two scores
at once; then the table would be useless as a lookup. That single discipline — one output per input —
is exactly what makes this object a function.
In Chapter 8 you learned that the Cartesian product $A \times B$ is the set of all ordered pairs $(a, b)$ with $a \in A$ and $b \in B$, and you will see in Chapter 12 that any subset of $A \times B$ is called a relation. A function is a relation with one extra promise.
🔗 Connection. This chapter sits directly on Chapter 8. The objects below — ordered pairs, Cartesian products, set membership, cardinality of finite sets — are all Chapter 8's. We do not redefine them; we use them. If "ordered pair" or "$A \times B$" feels shaky, reread §8.4 first.
Definition (function). Let $A$ and $B$ be sets. A function $f$ from $A$ to $B$, written $f\colon A \to B$, is a subset $f \subseteq A \times B$ such that for every $a \in A$ there is exactly one $b \in B$ with $(a, b) \in f$. We write $f(a) = b$ for that unique $b$, and call $b$ the image of $a$ (or "the value of $f$ at $a$").
Unpack the phrase "exactly one," because both halves do work:
- At least one (totality): every input in $A$ gets some output. There are no gaps — no $a \in A$ for which $f(a)$ is undefined.
- At most one (well-definedness / single-valuedness): no input gets two outputs. There is no $a$ with both $(a, b)$ and $(a, b')$ in $f$ for $b \neq b'$.
💡 Intuition: A function is a machine with a deterministic dial. Set the dial to an input, read the output. Same input, same output, every time (that is well-definedness). And the dial never jams — every legal input produces an output (that is totality). A pure function in your favorite language is exactly this machine: no hidden state, no surprises.
Functions are everywhere in code
On a finite domain, a function literally is a dictionary: the keys are the domain, the values are the outputs, and the "exactly one output per input" rule is enforced for free, because a dict cannot store two values under one key.
domain = [-2, -1, 0, 1, 2]
# Build the function as an explicit set of (input, output) pairs, then as a dict.
square_pairs = {(x, x * x) for x in domain}
square = {x: x * x for x in domain}
print(sorted(square_pairs))
print(square[2])
print(len(square) == len(domain)) # one output per input => sizes match
# Expected output:
# [(-2, 4), (-1, 1), (0, 0), (1, 1), (2, 4)]
# 4
# True
The set square_pairs is the function as a mathematician sees it — a set of ordered pairs. The dict
square is the same function as a programmer uses it. The last line is a sanity check that mirrors
the definition: a function on domain has exactly one pair per input, so the number of entries equals
the number of inputs. This is theme one of the book in miniature — discrete math is the language of
CS: a dict is not like a function, it is one.
⚠️ Common Pitfall: "A formula and a function are the same thing." They are not. The same formula $x \mapsto x^2$ gives different functions on different domains: on $\mathbb{R}$ it is one function, on $\{-2,\dots,2\}$ another, on the natural numbers another still. A function is the rule plus its domain and codomain. Two functions are equal only when they agree on all three. We make this official in 9.2.
🔄 Check Your Understanding 1. Which of these define a function from $A = \{1,2,3\}$ to $B = \mathbb{Z}$? (a) ${(1,5),(2,5), (3,9)}$; (b) ${(1,5),(2,7)}$; (c) ${(1,5),(1,6),(2,7),(3,8)}$. 2. In the table at the start of this section, two names map to 91. Does that violate the function rule? Why or why not?
Answers
- Only (a). (b) fails totality — input 3 has no output. (c) fails single-valuedness — input 1 has two outputs (5 and 6). 2. No. The rule forbids one input having two outputs; it says nothing about two inputs sharing an output. (Whether that is allowed is the question of injectivity, in 9.3.)
9.2 Domain, codomain, and range
A function comes with three sets, and beginners routinely blur the last two. Keeping them straight is the difference between a correct proof and a stuck one.
Definition (domain, codomain, range). For a function $f\colon A \to B$: - $A$ is the domain — the set of allowed inputs. - $B$ is the codomain — the declared set the outputs are drawn from. - The range (or image of $f$) is the set of outputs actually produced: $$\operatorname{range}(f) = \{\, f(a) \mid a \in A \,\} \subseteq B.$$
The codomain is a promise about the type of the output; the range is the truth about which values occur. They need not be equal. Take $f\colon \mathbb{R} \to \mathbb{R}$, $f(x) = x^2$. The codomain is all of $\mathbb{R}$ (we declared outputs are real numbers), but the range is only $[0, \infty)$ — no negative number is ever a square. The gap between codomain and range is precisely what "surjective" will measure in 9.3.
💡 Intuition: Think of a type signature. In
def f(x: int) -> int:, the domain isint, and the codomain is the return typeint. The range is whatever subset ofintthe body offcan actually return — which the type system does not track. The compiler checks the codomain; only a proof (or exhaustive testing) reveals the range.
Image of a set and preimage of a set
We will constantly push whole sets through a function, forward and backward. Two operations capture this. Both produce sets, and the second one is the source of a notorious confusion.
Definition (image of a set, preimage of a set). Let $f\colon A \to B$, let $S \subseteq A$, and let $T \subseteq B$. - The image of $S$ under $f$ is $f(S) = \{\, f(a) \mid a \in S \,\}$ — push $S$ forward. - The preimage (or inverse image) of $T$ under $f$ is $$f^{-1}(T) = \{\, a \in A \mid f(a) \in T \,\}$$ — pull $T$ back to every input that lands in it.
The notation $f^{-1}(T)$ is unfortunate, because it looks like it requires an inverse function. It does not. The preimage is defined for every function, invertible or not. It simply collects all inputs whose output falls in $T$.
⚠️ Common Pitfall: $f^{-1}(T)$ is a set, not an inverse. Writing $f^{-1}(\{4\})$ does not assume $f$ can be inverted. For $f(x) = x^2$ on the integers, $f^{-1}(\{4\}) = \{-2, 2\}$ — a set of two inputs, even though $f$ has no inverse function (it collides). Reserve the phrase "the inverse function $f^{-1}$" for the bijective case in 9.4; until then, $f^{-1}(T)$ always means "the set of inputs landing in $T$."
domain = [-2, -1, 0, 1, 2]
f = lambda x: x * x
range_f = {f(x) for x in domain}
S = {1, 2}
image_S = {f(x) for x in S}
T = {1, 4}
preimage_T = {x for x in domain if f(x) in T}
print(sorted(range_f))
print(sorted(image_S))
print(sorted(preimage_T))
# Expected output:
# [0, 1, 4]
# [1, 4]
# [-2, -1, 1, 2]
Read the outputs against the definitions. The range is $\{0,1,4\}$ — a strict subset of the codomain $\mathbb{Z}$, confirming $f$ is "not surjective onto $\mathbb{Z}$." The preimage of $\{1,4\}$ has four elements because $f$ collides ($1$ and $-1$ both square to $1$; $2$ and $-2$ both square to $4$). That the preimage of a small set can be large is the fingerprint of a non-injective function — and, foreshadowing 9.6, the fingerprint of a hash collision.
🔄 Check Your Understanding 1. For $f\colon \mathbb{R} \to \mathbb{R}$, $f(x) = 3x + 1$, what is the range? Is it equal to the codomain? 2. Let $f(x) = x \bmod 3$ on the domain $\{0,1,2,\dots,8\}$ with codomain $\{0,1,2\}$. Compute the preimage $f^{-1}(\{0\})$. 3. True or false: for any $f$ and any $T \subseteq B$, the preimage $f^{-1}(T)$ exists even if $f$ is not injective.
Answers
- The range is all of $\mathbb{R}$ (every real $y$ is hit by $x = (y-1)/3$), so here range = codomain.
- $f^{-1}(\{0\}) = \{0, 3, 6\}$ — the inputs divisible by 3. 3. True. The preimage is defined for every function; it is just the set of inputs whose output lies in $T$.
9.3 Injective, surjective, bijective
Now the three properties that organize everything. Each answers a yes/no question about a function's behavior, and each has a clean definition, a one-line intuition, and a direct test.
Definitions (the three properties). Let $f\colon A \to B$. - $f$ is injective (one-to-one) if distinct inputs always give distinct outputs: $$\forall a_1, a_2 \in A,\quad f(a_1) = f(a_2) \;\rightarrow\; a_1 = a_2.$$ Equivalently (contrapositive): $a_1 \neq a_2 \rightarrow f(a_1) \neq f(a_2)$. No collisions. - $f$ is surjective (onto) if every codomain element is hit: $$\forall b \in B,\ \exists a \in A,\quad f(a) = b.$$ Equivalently: $\operatorname{range}(f) = B$. Nothing is missed. - $f$ is bijective (a bijection, or one-to-one correspondence) if it is both injective and surjective. A perfect pairing.
💡 Intuition: Picture arrows from $A$ to $B$. - Injective: no element of $B$ has two or more arrows pointing at it (at most one arrow in). - Surjective: no element of $B$ has zero arrows pointing at it (at least one arrow in). - Bijective: every element of $B$ has exactly one arrow pointing at it. Inputs and outputs are matched up perfectly — which is exactly why a bijection can be run backward (9.4).
A note on language used everywhere in mathematics: an injection is an injective function; a surjection is a surjective function; a bijection is a bijective function. We will use the noun and adjective interchangeably.
Worked classification
Consider three functions and classify each. Notice that the same rule can change classification when the codomain changes — the codomain is part of the function.
- $f\colon \mathbb{Z} \to \mathbb{Z}$, $f(x) = 2x$. Injective? Yes: if $2a_1 = 2a_2$ then $a_1 = a_2$. **Surjective?** No: the odd number $3$ is never $2x$ for an integer $x$, so $3 \notin \operatorname{range}(f)$. Verdict: injective, not surjective.
- $g\colon \mathbb{Z} \to \mathbb{Z}$, $g(x) = \lfloor x/2 \rfloor$. Injective? No: $g(0) = 0 = g(1)$, two inputs colliding. **Surjective?** Yes: any integer $m$ is hit, e.g. $g(2m) = m$. Verdict: surjective, not injective.
- $h\colon \mathbb{Z} \to \mathbb{Z}$, $h(x) = x + 7$. Injective? Yes: $a_1 + 7 = a_2 + 7 \Rightarrow a_1 = a_2$. **Surjective?** Yes: for any $m$, $h(m-7) = m$. Verdict: bijective.
The strategy for proving injectivity. To prove $f$ is injective, assume two inputs share an output — assume $f(a_1) = f(a_2)$ — and derive that the inputs were equal, $a_1 = a_2$. This is a direct proof of the implication in the definition. (To disprove injectivity, you need just one collision: two distinct inputs with the same output — a single counterexample, in the sense of Chapter 2.)
Claim. $h\colon \mathbb{Z} \to \mathbb{Z}$, $h(x) = x + 7$, is injective.
Proof. Assume $h(a_1) = h(a_2)$ for some integers $a_1, a_2$. By definition of $h$, this means $a_1 + 7 = a_2 + 7$. Subtracting $7$ from both sides gives $a_1 = a_2$. Since $h(a_1) = h(a_2)$ forced $a_1 = a_2$, the function $h$ is injective. $\blacksquare$
The strategy for proving surjectivity. To prove $f$ is surjective, start from an arbitrary target $b \in B$ and exhibit an input $a \in A$ with $f(a) = b$. The creative work is "solving for $a$": treat $f(a) = b$ as an equation in $a$ and produce a witness. (To disprove surjectivity, name one codomain element that is never produced.)
Claim. $h(x) = x + 7$, as above, is surjective.
Proof. Let $b \in \mathbb{Z}$ be arbitrary. Choose $a = b - 7$, which is an integer because the integers are closed under subtraction. Then $h(a) = (b - 7) + 7 = b$. So every $b$ has a preimage, and $h$ is surjective. $\blacksquare$
Having proved both, $h$ is a bijection. We will use this fact in 9.4 to invert it.
def is_injective(f):
return len(set(f.values())) == len(f)
def is_surjective(f, codomain):
return set(f.values()) == set(codomain)
def is_bijective(f, codomain):
return is_injective(f) and is_surjective(f, codomain)
# g: {0,1,2} -> {0,1,2}, a relabeling (permutation) -> bijection.
g = {0: 1, 1: 2, 2: 0}
print(is_injective(g), is_surjective(g, [0, 1, 2]), is_bijective(g, [0, 1, 2]))
# h: {0,1,2} -> {0,1}, h(x) = x mod 2 -> surjective, not injective.
h = {0: 0, 1: 1, 2: 0}
print(is_injective(h), is_surjective(h, [0, 1]), is_bijective(h, [0, 1]))
# Expected output:
# True True True
# False True False
The tests mirror the definitions exactly. Injectivity becomes "the number of distinct outputs equals the number of inputs" — if any two inputs collided, the set of outputs would be smaller than the domain. Surjectivity becomes "the set of outputs equals the codomain." This is theme four — computation and proof are complementary: the code checks the properties on a specific finite function in microseconds, while the proofs above guarantee them for the infinite domain $\mathbb{Z}$, which no amount of testing could ever cover.
🔄 Check Your Understanding 1. Classify $f\colon \mathbb{N} \to \mathbb{N}$, $f(n) = n + 1$ (with $\mathbb{N}$ including 0). Injective? Surjective? Bijective? 2. Give a one-line disproof that $f(x) = x^2$ on $\mathbb{Z} \to \mathbb{Z}$ is injective. 3. Why does changing only the codomain of $f(x) = 2x$ from $\mathbb{Z}$ to the set of even integers change its classification?
Answers
- Injective (if $a+1 = b+1$ then $a=b$) but not surjective: $0$ is never an output, since the smallest output is $f(0)=1$. So not bijective. (This near-miss — injective but not onto, on an infinite set — is the seed of Chapter 10's surprises.) 2. $f(-1) = 1 = f(1)$, but $-1 \neq 1$: a collision. 3. Surjectivity is about the codomain. Onto $\mathbb{Z}$, the value $3$ is missed (not onto); onto the even integers, every target $2k$ has preimage $k$, so it becomes surjective — and now bijective.
🧩 Productive Struggle. Before reading on, decide each of these and prove or disprove: (a) Is $f\colon \mathbb{Z} \to \mathbb{Z}$, $f(x) = x^3$, injective? surjective? (b) Is $f\colon \mathbb{R} \to \mathbb{R}$, $f(x) = x^3$, surjective? (c) Can a function ${1,2,3} \to {1,2,3,4}$ ever be surjective? Write your answers down, then check them against the finite theorem below and Exercise A1.
A finite-set theorem you will reuse
Here is a fact special to finite sets that we will lean on for hashing (this chapter), cardinality (Chapter 10), and the pigeonhole principle (Chapter 17). On infinite sets it is false — and that failure is exactly what makes infinity interesting.
Theorem (finite injective/surjective equivalence). Let $A$ and $B$ be finite sets with $\lvert A \rvert = \lvert B \rvert$, and let $f\colon A \to B$. Then $f$ is injective **if and only if** $f$ is surjective (and hence either condition alone makes $f$ a bijection).
The strategy first. Both directions are really one counting idea: the range of $f$ can have at most $\lvert A \rvert$ elements (one output per input, fewer if there are collisions). We compare that count against $\lvert B \rvert$. When the two sets are the same size, "no collisions" and "no misses" become the same statement. We prove the direction we need most — injective $\Rightarrow$ surjective — directly; the converse is Exercise E2.
Proof (injective $\Rightarrow$ surjective, for $\lvert A \rvert = \lvert B \rvert$ finite). Suppose $f$ is injective. Then distinct inputs give distinct outputs, so the range $f(A) = {f(a) \mid a \in A}$ contains exactly $\lvert A \rvert$ distinct elements — one for each input, with none coinciding. Now $f(A) \subseteq B$, and we have just shown $\lvert f(A) \rvert = \lvert A \rvert = \lvert B \rvert$. A subset of a finite set $B$ that has the same number of elements as $B$ must be all of $B$ (a strictly smaller subset would have strictly fewer elements). Therefore $f(A) = B$, which says exactly that $f$ is surjective. $\blacksquare$
🔗 Connection. The hinge — "an injective map into a set no larger than its domain must collide" — read in reverse is the pigeonhole principle: if you place $\lvert A \rvert$ items into fewer than $\lvert A \rvert$ boxes, some box gets two. That is why hash collisions are unavoidable once you have more keys than buckets, the result we formalize in Chapter 17 and apply to hashing in Chapter 26. Hold onto this; it is also the engine of Chapter 10, where we deliberately drop the finiteness assumption to discover that some infinities are bigger than others.
9.4 Composition and inverses
Functions are most powerful when chained. Composition is how you build a pipeline; an inverse is how you run it backward. Both have exact definitions, and the central theorem of this chapter — which functions can be inverted — turns out to be answered entirely by 9.3's vocabulary.
Composition
Definition (composition). Let $f\colon A \to B$ and $g\colon B \to C$. Their composition is the function $g \circ f\colon A \to C$ defined by $$(g \circ f)(a) = g\big(f(a)\big).$$ Read $g \circ f$ as "$g$ after $f$": apply $f$ first, then feed the result to $g$.
The order is the part everyone fumbles. In the symbol $g \circ f$, the function written on the right ($f$) runs first. The reason is the nesting $g(f(a))$ — you cannot apply $g$ until $f$ has produced its output. For the composition to be defined at all, the codomain of $f$ must match the domain of $g$ (the "pipe fits the socket").
💡 Intuition: It is Unix pipes,
f | g, but written right-to-left in math: data flowsa -> f -> g -> result, and we write the last stage first. Function call syntax agrees with the math, not the pipe:g(f(a))putsgoutermost. Whenever you read $g \circ f$, mentally insert "do $f$, then $g$."
def compose(g, f):
"""Return g . f as a dict, for finite functions where range(f) <= domain(g)."""
return {x: g[f[x]] for x in f}
f = {0: 1, 1: 2, 2: 0}
g = {0: 0, 1: 0, 2: 1}
gf = compose(g, f) # apply f, then g
fg = compose(f, g) # apply g, then f
print(gf)
print(fg)
print(gf == fg) # composition is NOT commutative in general
# Expected output:
# {0: 0, 1: 1, 2: 0}
# {0: 1, 1: 1, 2: 2}
# False
The last line records a fact worth internalizing: composition is not commutative. Doing $f$ then $g$ is generally a different function from doing $g$ then $f$. (It is associative, though: $h \circ (g \circ f) = (h \circ g) \circ f$ — Exercise B3 — which is why we can write long pipelines without parentheses.)
A worked numeric trace makes the order concrete. Let $f\colon \mathbb{R} \to \mathbb{R}$ be $f(x) = 2x$ and $g\colon \mathbb{R} \to \mathbb{R}$ be $g(x) = x + 3$. Then $$(g \circ f)(x) = g(f(x)) = g(2x) = 2x + 3,\qquad (f \circ g)(x) = f(g(x)) = f(x+3) = 2(x+3) = 2x + 6.$$ At $x = 5$: $(g \circ f)(5) = 13$ but $(f \circ g)(5) = 16$ — the same two operations in the opposite order give different answers, because $f$ scales before $g$ shifts in one case and after in the other. This is exactly the bug you create when you normalize-then-clip versus clip-then-normalize a data pipeline in the wrong order: each stage is a function, and the composition order is the difference between correct and corrupt output.
Composition also preserves the properties of 9.3, and the proof is a clean, reusable template.
The strategy first. To show the composition of two injections is an injection, use the injectivity proof template — assume two inputs give the same final output — and then peel the composition apart one layer at a time, applying each function's injectivity in turn. The same "peel one layer" move handles surjectivity (Exercise A4).
Theorem. If $f\colon A \to B$ and $g\colon B \to C$ are both injective, then $g \circ f$ is injective.
Proof. Assume $(g \circ f)(a_1) = (g \circ f)(a_2)$ for some $a_1, a_2 \in A$. By the definition of composition, this says $g\big(f(a_1)\big) = g\big(f(a_2)\big)$. Since $g$ is injective, equal outputs of $g$ force equal inputs, so $f(a_1) = f(a_2)$. Since $f$ is injective, this in turn forces $a_1 = a_2$. Thus $(g \circ f)(a_1) = (g \circ f)(a_2)$ implies $a_1 = a_2$, so $g \circ f$ is injective. $\blacksquare$
Inverses
The identity function $\operatorname{id}_A\colon A \to A$ sends every element to itself, $\operatorname{id}_A(a) = a$. It is the "do nothing" function — the composition equivalent of multiplying by 1. With it we can say precisely what it means to undo a function.
Definition (inverse). A function $f\colon A \to B$ is invertible if there is a function $g\colon B \to A$ such that $$g \circ f = \operatorname{id}_A \quad\text{and}\quad f \circ g = \operatorname{id}_B.$$ Such a $g$ is unique (Exercise A5); we call it the inverse of $f$ and write it $f^{-1}$. It satisfies $f^{-1}(f(a)) = a$ for all $a \in A$ and $f(f^{-1}(b)) = b$ for all $b \in B$.
So $f^{-1}$ is the function that cancels $f$ from both sides. Two conditions are required, not one: $f^{-1}$ must undo $f$ (that handles inputs) and $f$ must undo $f^{-1}$ (that handles outputs). The next theorem is the heart of the chapter: it tells you exactly which functions have an inverse, in the language we already built.
🚪 Threshold Concept: invertibility = bijectivity. A function can be reversed if and only if it is a bijection. This single equivalence reframes inverses (an "undoing" idea) entirely in terms of collisions and coverage (counting ideas), and it is the reason bijections matter so much in the rest of the book. Reversible computation, lossless encodings, one-time pads, the change-of-variable trick in counting (Chapter 16), and the very definition of "same size" for infinite sets (Chapter 10) all rest on it. Internalize the slogan: to invert, bijize.
Theorem (invertibility). A function $f\colon A \to B$ is invertible if and only if $f$ is bijective.
The strategy first. This is an "if and only if," so two directions. ($\Rightarrow$) Assume an inverse exists and squeeze out both properties: injectivity will fall out of $f^{-1} \circ f = \operatorname{id}$, and surjectivity out of $f \circ f^{-1} = \operatorname{id}$. ($\Leftarrow$) Assume $f$ is bijective and build the inverse by hand: surjectivity guarantees each $b$ has a preimage (so we can define $g(b)$), and injectivity guarantees it is unique (so $g$ is well-defined). Watch how each half of "bijective" supplies exactly one half of "well-defined function."
Proof.
($\Rightarrow$) Suppose $f$ is invertible, with inverse $g$ satisfying $g \circ f = \operatorname{id}_A$ and $f \circ g = \operatorname{id}_B$.
Injective: assume $f(a_1) = f(a_2)$. Apply $g$ to both sides: $g(f(a_1)) = g(f(a_2))$. But $g(f(a)) = (g \circ f)(a) = \operatorname{id}_A(a) = a$, so the left side is $a_1$ and the right side is $a_2$. Hence $a_1 = a_2$, and $f$ is injective.
Surjective: let $b \in B$ be arbitrary. Set $a = g(b) \in A$. Then $f(a) = f(g(b)) = (f \circ g)(b) = \operatorname{id}_B(b) = b$. So $b$ has a preimage under $f$, and $f$ is surjective.
Therefore $f$ is bijective.
($\Leftarrow$) Suppose $f$ is bijective. Define a candidate inverse $g\colon B \to A$ as follows: for each $b \in B$, because $f$ is surjective there exists some $a \in A$ with $f(a) = b$, and because $f$ is injective that $a$ is the only such element. Define $g(b)$ to be that unique $a$. This $g$ is a genuine function: every $b \in B$ gets exactly one value (surjectivity gives "at least one," injectivity gives "at most one" — the two halves of the function definition from 9.1).
It remains to check $g$ inverts $f$ on both sides. For any $a \in A$, let $b = f(a)$; then $g(b)$ is the unique preimage of $b$, which is $a$, so $(g \circ f)(a) = g(f(a)) = a = \operatorname{id}_A(a)$. For any $b \in B$, $g(b)$ is by construction an element with $f(g(b)) = b$, so $(f \circ g)(b) = b = \operatorname{id}_B(b)$. Both identities hold, so $g = f^{-1}$ and $f$ is invertible. $\blacksquare$
Notice the elegant bookkeeping in the ($\Leftarrow$) direction: surjectivity supplied existence, injectivity supplied uniqueness, and "existence + uniqueness" is exactly the "exactly one" in the definition of a function. The theorem is really the function definition of 9.1, viewed in a mirror.
For a finite bijection, building $f^{-1}$ is trivial — swap each input/output pair:
def invert(f):
"""Inverse of a bijection f given as a dict: swap keys and values."""
return {v: k for k, v in f.items()}
p = {0: 1, 1: 2, 2: 0} # a permutation (bijection) of {0,1,2}
p_inv = invert(p)
print(p_inv)
# p . p_inv should be the identity on {0,1,2}
identity = {x: p[p_inv[x]] for x in p}
print(identity)
print(identity == {0: 0, 1: 1, 2: 2})
# Expected output:
# {1: 0, 2: 1, 0: 2}
# {0: 0, 1: 1, 2: 2}
# True
The swap only works because $p$ is a bijection. If $p$ collided (non-injective), the swapped dict
would lose a pair — two keys would want the same new key, and a dict keeps only the last. If $p$ missed
a codomain element (non-surjective), the swapped dict would have a gap in its domain. Both failures are
the theorem talking back to you in code. This is theme two — proofs guarantee correctness: the
one-line invert is correct precisely on the inputs the theorem certifies, and silently wrong
otherwise.
⚠️ Common Pitfall: Confusing the inverse function $f^{-1}$ with the preimage $f^{-1}(T)$ from 9.2. The notation collides, but the concepts do not. The preimage $f^{-1}(T)$ is a set and exists for every function; the inverse function $f^{-1}$ is a function and exists only when $f$ is bijective. When $f$ is bijective and $T = \{b\}$ is a single element, the two harmonize: the preimage $f^{-1}(\{b\})$ is the one-element set $\{f^{-1}(b)\}$ containing the inverse's value.
🔄 Check Your Understanding 1. For $f\colon \mathbb{R} \to \mathbb{R}$, $f(x) = 3x + 1$ (a bijection), find $f^{-1}$. 2. In the composition $g \circ f$, which function is applied to the input first? 3. The function $f(x) = x^2$ on $\mathbb{R} \to \mathbb{R}$ is not invertible. Which property fails, and how would restricting the domain and codomain fix it?
Answers
- Solve $y = 3x + 1$ for $x$: $x = (y-1)/3$, so $f^{-1}(y) = (y-1)/3$. (Check: $f^{-1}(f(x)) = (3x+1-1)/3 = x$.) 2. $f$ — the function on the right runs first. 3. *Both* fail on $\mathbb{R}$: it is not injective ($f(-1)=f(1)$) and not surjective (negatives are missed). Restricting to $f\colon [0,\infty) \to [0,\infty)$ makes it a bijection, with inverse $\sqrt{\cdot}$.
9.5 Important functions: floor, ceiling, mod, exponential, log
A handful of functions appear so often in computer science that they deserve their own muscle memory. You have already used several; here we pin down their definitions and the rules that trip people up.
Floor and ceiling
Definition (floor and ceiling). For a real number $x$: - the floor $\lfloor x \rfloor$ is the greatest integer less than or equal to $x$; - the ceiling $\lceil x \rceil$ is the least integer greater than or equal to $x$. Both are functions $\mathbb{R} \to \mathbb{Z}$.
The one rule to burn in: floor and ceiling round toward $-\infty$ and $+\infty$ respectively — not toward zero. For positives this matches intuition ($\lfloor 3.7 \rfloor = 3$, $\lceil 3.7 \rceil = 4$), but for negatives it surprises people: $\lfloor -3.2 \rfloor = -4$ (the greatest integer not exceeding $-3.2$ is $-4$, since $-3$ is bigger than $-3.2$), while $\lceil -3.2 \rceil = -3$.
⚠️ Common Pitfall: Floor is not "chop off the decimals." Truncation rounds toward zero; floor rounds toward $-\infty$. They agree on nonnegative numbers and disagree on negatives: truncating $-3.2$ gives $-3$, but $\lfloor -3.2 \rfloor = -4$. In Python,
math.floor(-3.2)is-4whileint(-3.2)is-3(truncation). Mixing them up is a classic off-by-one source in indexing, binning, and pagination.
Floor and ceiling are everywhere in algorithm analysis: a balanced binary tree on $n$ nodes has height $\lfloor \log_2 n \rfloor$; paginating $n$ items at $k$ per page needs $\lceil n/k \rceil$ pages; binary search probes index $\lfloor (\text{lo}+\text{hi})/2 \rfloor$.
The mod function and the exponential/logarithm pair
You met $a \bmod n$ informally already; as a function, $x \mapsto x \bmod n$ maps $\mathbb{Z}$ onto the finite set $\{0, 1, \dots, n-1\}$ of remainders. It is the workhorse of hashing (9.6) and the entire subject of Chapter 23. Crucially it is surjective but not injective onto $\{0,\dots,n-1\}$ from a domain larger than $n$ — every remainder is achieved (surjective), but infinitely many integers share each remainder (collisions). That single sentence is the mathematical reason hash tables exist.
The exponential $x \mapsto b^x$ (with base $b > 0$, $b \neq 1$) and the logarithm $x \mapsto \log_b x$ are inverse bijections of each other on their natural domains: $\log_b\colon (0, \infty) \to \mathbb{R}$ inverts $b^{(\cdot)}\colon \mathbb{R} \to (0, \infty)$. That is the content of the identities $\log_b(b^x) = x$ and $b^{\log_b x} = x$ — read them as "$\log$ and $\exp$ undo each other," a direct instance of the invertibility theorem from 9.4. Because they are inverse, the law $\log_b(xy) = \log_b x + \log_b y$ is the mirror of $b^{u+v} = b^u b^v$: the logarithm converts multiplication into addition, which is why $\log$ shows up whenever a divide-and-conquer algorithm halves its input (Chapter 19).
import math
print(math.floor(3.7), math.ceil(3.7)) # positive
print(math.floor(-3.2), math.ceil(-3.2)) # negative: floor goes DOWN
# Modular hash into m = 8 buckets; collisions are inevitable (pigeonhole).
def h(k, m=8):
return k % m
keys = [10, 18, 7, 23]
buckets = {}
for k in keys:
buckets.setdefault(h(k), []).append(k)
print(buckets)
# Expected output:
# 3 4
# -4 -3
# {2: [10, 18], 7: [7, 23]}
Look at the bucket output: keys $10$ and $18$ both land in bucket $2$ (both leave remainder $2$ mod $8$), and $7$ and $23$ both land in bucket $7$. Four keys, two collisions — exactly what 9.3's finite theorem and its pigeonhole reading predicted once the key space exceeds the number of buckets. We turn this into a real concern next.
🔄 Check Your Understanding 1. Compute $\lfloor -2.5 \rfloor$, $\lceil -2.5 \rceil$, and $\lfloor 2.5 \rfloor$. 2. How many pages hold $95$ items at $10$ per page? Write the answer with floor or ceiling. 3. Using "$\log$ inverts $\exp$," simplify $\log_2(2^{10})$ and $2^{\log_2 7}$.
Answers
- $\lfloor -2.5 \rfloor = -3$, $\lceil -2.5 \rceil = -2$, $\lfloor 2.5 \rfloor = 2$. 2. $\lceil 95/10 \rceil = \lceil 9.5 \rceil = 10$ pages. 3. $\log_2(2^{10}) = 10$ and $2^{\log_2 7} = 7$ — each function undoes the other.
9.6 Functions in computer science: hashing, pure functions, type signatures
Everything in this chapter cashes out in code. Three connections are worth making explicit, because each is a place where thinking "function" prevents a real bug.
Hash functions: deliberately non-injective
A hash function maps data of arbitrary size (strings, objects, files) to a fixed-size output (an index, or a fixed-length digest). By construction its domain is enormous and its codomain is small, so by the finite theorem of 9.3 it cannot be injective — there must be collisions, distinct inputs with the same hash. This is not a flaw to engineer away; it is a theorem. The art of hashing is making collisions rare and unpredictable, then handling the ones that remain.
🔗 Connection. We met the modular hash $h(k) = k \bmod m$ in 9.5 and watched it collide. Chapter 17 proves collisions unavoidable via the pigeonhole principle, and Chapter 26 builds real hash tables, designs hash functions that spread keys evenly, and introduces cryptographic hashes — functions deliberately made hard to invert. That last idea, a function easy to compute but infeasible to reverse, is a one-way function, and it is the seed of the password storage and digital signatures you will see in Part IV.
Two security-flavored consequences follow directly from this chapter's vocabulary, and both are worth holding in mind now:
- Password hashing relies on non-invertibility. A server stores $\text{hash}(\text{password})$, never the password. Because the hash is (designed to be) not efficiently invertible, an attacker who steals the table cannot read the passwords back out. "Not invertible" — a property from 9.4 — is doing the security work.
- Lossless compression must be injective. A compressor that mapped two different files to the same compressed blob could not be decompressed unambiguously — decompression is the inverse, and by 9.4 an inverse exists only if the map is injective. So every lossless codec is, on the files it accepts, an injection.
Pure functions: the function definition, enforced
A pure function in programming is one whose output depends only on its inputs, with no side effects and no hidden state. That is precisely the mathematical definition from 9.1 restated for code: same input, same output (well-definedness), and an output for every input in the domain (totality). When a function reads a global variable or the clock, it is no longer a function of its stated arguments — the "same input, same output" promise breaks, and with it your ability to reason, cache, parallelize, or test in isolation. Functional programming is, at its core, the discipline of keeping your code's functions actual functions.
💡 Intuition: "Referential transparency" — the property that you may replace a call
f(x)with its value without changing the program — is just well-definedness wearing programmer's clothes. If $f$ is a true function,f(3)is its result, interchangeably, forever. Memoization (caching results) is correct only for pure functions, and now you can say why: a true function has exactly one output per input, so a cached answer never goes stale. Theme three — abstraction is the master tool — is exactly this: recognizing "pure function" and "mathematical function" as the same object lets you import every guarantee from one world into the other.
Type signatures: domain and codomain in the source
When you read def f(x: int) -> str:, you are reading a function's domain (int) and codomain
(str). The type checker verifies the codomain — that f returns a str — but, as in 9.2, it does
not track the range, the subset of strings actually returned. A signature Callable[[A], B] is
literally the statement $f\colon A \to B$. Generics make this vivid: Callable[[T], T] is "a function
from some type to the same type," and the only thing you can write that works for every T without
inspecting the value is the identity function — the very $\operatorname{id}_A$ we needed to define
inverses in 9.4. Reading types as domains and codomains turns a wall of annotations into a precise
contract.
🔄 Check Your Understanding 1. Explain, using a property from this chapter, why a hash function from all 64-bit integers to a 16-bucket table must have collisions. 2. Why is memoizing a function that reads
time.time()a bug? 3. In a type signaturef: Callable[[int], int], what are the domain and codomain, and what does the signature not tell you?
Answers
- The domain ($2^{64}$ values) is far larger than the codomain ($16$ values), so by the finite injective/surjective theorem (a function into a strictly smaller set cannot be injective — pigeonhole) at least two inputs share a bucket. 2. Such a function is not pure — its output depends on the clock, not just its arguments — so "same input, same output" fails and a cached value is wrong on the next call. 3. Domain
int, codomainint; it does not tell you the range (which integers are actually returned) or whetherfis injective/surjective.
Project Checkpoint: a finite-function module for the Toolkit
Chapter 8 gave the Toolkit sets.py. Functions are the next fundamental object, so we add a small
functions.py that represents a finite function as a dict {input: output} and provides the four
operations this chapter is built on: classify (injective / surjective / bijective), compose, and invert.
Chapter 10 will reuse is_bijective as the literal test for "these two sets have the same size," and
later modules compose mappings, so keeping these signatures stable matters.
Create dmtoolkit/functions.py and add:
def is_injective(f):
"""True iff no two inputs map to the same output."""
return len(set(f.values())) == len(f)
def is_surjective(f, codomain):
"""True iff every element of codomain is an output of f."""
return set(f.values()) == set(codomain)
def is_bijective(f, codomain):
"""True iff f is both injective and surjective onto codomain."""
return is_injective(f) and is_surjective(f, codomain)
def compose(g, f):
"""Return g . f (apply f first): (g.f)(x) = g(f(x)). Needs range(f) <= dom(g)."""
return {x: g[f[x]] for x in f}
def inverse(f):
"""Inverse of a bijection f, as a dict. Raises if f is not injective."""
if not is_injective(f):
raise ValueError("only a bijection has an inverse")
return {v: k for k, v in f.items()}
f = {0: 1, 1: 2, 2: 0}
print(is_bijective(f, range(3)))
print(compose(inverse(f), f))
# Expected output:
# True
# {0: 0, 1: 1, 2: 2}
Two design choices echo the chapter's theorems directly. First, is_injective checks that the number
of distinct outputs equals the number of inputs — the operational form of "no collisions" from 9.3.
Second, inverse refuses to run on a non-injective function, raising instead of silently dropping a
pair; that guard is the invertibility theorem of 9.4 enforced in code ("only a bijection has an
inverse"). The demo composes a bijection with its inverse and gets back the identity on $\{0,1,2\}$ —
the defining property $f^{-1} \circ f = \operatorname{id}$, now executable.
Toolkit so far:
logic.py(Chapters 1–3),sets.py(Chapter 8), and the start ofrecurrences.py(Chapter 6), now joined byfunctions.py. Chapter 10 leans onis_bijectiveto reason about cardinality; by the capstone these pieces compose into one library you understand from the ground up.
Summary
A function $f\colon A \to B$ is a relation $f \subseteq A \times B$ assigning to each input exactly one output. Everything else in the chapter elaborates the consequences of that one rule.
| Concept | Definition | One-line test / fact |
|---|---|---|
| Function | each $a \in A$ has exactly one $f(a) \in B$ | totality + single-valuedness |
| Domain / codomain / range | inputs / declared output type / outputs actually hit | $\operatorname{range}(f) \subseteq B$ |
| Image of $S$ | $f(S) = \{f(a) \mid a \in S\}$ | push a set forward |
| Preimage of $T$ | $f^{-1}(T) = \{a \mid f(a) \in T\}$ | a set; exists for every $f$ |
| Injective | $f(a_1)=f(a_2) \rightarrow a_1=a_2$ | no collisions |
| Surjective | $\forall b\,\exists a,\ f(a)=b$ | $\operatorname{range}(f) = B$ |
| Bijective | injective and surjective | a perfect pairing |
| Composition | $(g \circ f)(a) = g(f(a))$ | apply $f$ first; not commutative; associative |
| Invertible | $\exists g$ with $g\circ f=\operatorname{id}_A,\ f\circ g=\operatorname{id}_B$ | iff bijective; then $g = f^{-1}$ unique |
Key results to carry forward:
- Invertibility theorem: $f$ is invertible iff $f$ is bijective. (Surjectivity $\to$ existence of preimages; injectivity $\to$ uniqueness; together $\to$ a well-defined inverse.)
- Composition preserves properties: the composition of two injections is an injection (same for surjections and bijections).
- Finite equivalence: for finite $A, B$ with $\lvert A\rvert = \lvert B\rvert$, injective $\Leftrightarrow$ surjective $\Leftrightarrow$ bijective. (The pigeonhole principle is this in reverse — Chapter 17.)
- Proof templates: injectivity — assume $f(a_1)=f(a_2)$, derive $a_1=a_2$. Surjectivity — take arbitrary $b$, construct an $a$ with $f(a)=b$.
Standard functions: $\lfloor x \rfloor$ / $\lceil x \rceil$ round toward $-\infty$ / $+\infty$ (not toward zero); $x \bmod n$ is surjective-but-not-injective onto $\{0,\dots,n-1\}$; $\log_b$ and $b^{(\cdot)}$ are inverse bijections, turning products into sums.
CS payoffs: hash functions are forced non-injective (collisions are a theorem); lossless
compression must be injective; password hashing relies on non-invertibility; a pure function is the
mathematical function definition enforced; a type signature Callable[[A], B] is $f\colon A \to B$
(checks the codomain, not the range).
Spaced Review
Retrieval keeps knowledge available. Answer from memory before checking.
- (Ch. 8) A function $f\colon A \to B$ is defined as a special subset of which set, and what is the name (from Chapter 8) of the elements of that set?
- (Ch. 8) If $\lvert A \rvert = 3$ and $\lvert B \rvert = 4$ (finite), how many ordered pairs are in $A \times B$, and can any function $f\colon A \to B$ be surjective? Justify with a count.
- (Ch. 6) The composition-of-injections theorem was proved directly. Sketch how you would instead prove "$f^{-1}(f(a)) = a$ for all $a$" for a recursively defined bijection on $\{0,1,\dots,n\}$ by induction — what would the base case and inductive step be?
- (Ch. 6) Recall the loop-invariant method. State an invariant you could use to prove that a loop building the inverse dict (iterating over
f.items()and assigninginv[v] = k) ends withinvequal to $f^{-1}$, assuming $f$ is a bijection.
Answers
- A subset of the Cartesian product $A \times B$; its elements are ordered pairs (Chapter 8).
- $\lvert A \times B \rvert = 3 \cdot 4 = 12$ ordered pairs. No surjection exists: a function from a 3-element domain produces at most 3 distinct outputs, which cannot cover a 4-element codomain (this is the finite count behind 9.3). 3. Base case: the claim holds at the smallest element (e.g. the recursion's non-recursive value). Inductive step: assuming $f^{-1}(f(k)) = k$, show it for $k+1$ using the recursive definition of $f$ and $f^{-1}$ — the recursive call is the inductive hypothesis, exactly as in Chapter 6's recursion–induction correspondence. 4. Invariant (at the top of the loop): "
invcontains exactly the swapped pairs for the items processed so far, i.e.inv[v] == kfor each processed pair(k, v), with no key overwritten." Because $f$ is injective the values $v$ are distinct, so no assignment clobbers an earlier one; at termination every pair has been swapped, soinv$= f^{-1}$.
What's Next
We built the bijection: a function that pairs two sets so perfectly that it can be run backward. So far we have used bijections to invert functions. Chapter 10 turns the idea into a measuring instrument. The radical move is to define "two sets have the same size" as "there is a bijection between them" — a definition that works even when the sets are infinite, where you cannot just count. That single shift leads somewhere astonishing: the integers and the rationals turn out to be the "same size," but the real numbers are strictly bigger, proven by Cantor's diagonal argument. And the consequence for us as programmers is profound — there are more problems than there are programs to solve them, so some problems have no algorithm at all. Functions, it turns out, are how we reach the edge of what computers can do.