# Supersingular Isogeny Elliptic-Curve Cryptography (Draft)

## 1. Algebraic Prerequisites

We start with a brief overview of the tools from abstract algebra we shall need for the study of cryptographic techniques. The exposition here is necessarily brief, and readers are encouraged to consult a standard textbook in abstract algebra for a detailed treatment.

Our overview of abstract algebra focuses on a few generalizations of number systems and their properties as abstract mathematical structures. In particular, we consider generalizations of real and complex numbers, of polynomials, and of integers and modular arithmetic. The section culminates in the complete classification of finite fields, a versatile tool in the study of cryptographic techniques.

• 1.1 introduces fields, an abstraction of the number-system concept.
• 1.2 introduces the notion of field characteristic, which plays an important role in the study of finite fields.
• 1.3 introduces polynomials with coefficients in a field and establishes various tools from elementary mathematics in this general setting.
• 1.4 introduces the notion of algebraic closure, a generalization of the fundamental theorem of algebra in the context of fields.
• 1.5 introduces Abelian groups, a generalization of integer arithmetic and modular arithmetic.
• 1.6 introduces quotient groups, which formalize the connection between integer arithmetic and modular arithmetic.
• 1.7 introduces cyclic groups, a particularly simple kind of Abelian group that serves as a building block for other Abelian groups.
• 1.8 introduces Cauchy’s theorem on Abelian groups, an existence theorem concerning elements of an Abelian group with a special structural property.
• 1.9 introduces the notion of direct sums of Abelian groups, a way to piece together Abelian groups to construct larger Abelian groups.
• 1.10 introduces quotients of polynomial rings, an adaptation of the quotient-group concept in the context of polynomial rings. This, in particular, allows us to construct new fields from existing ones.
• 1.11 collects together techniques introduced in 1.2, 1.3, 1.9, and 1.10 to classify finite fields completely.

### 1.1. Fields

A field is an abstraction of the number-system concept exemplified by the real numbers and the complex numbers. Every field $k$ is equipped with an addition operation and a multiplication operation that satisfy the usual arithmetic properties, i.e., commutativity, associativity, invertibility, and distributivity. In particular, a field always has an additive identity $0$ and a multiplicative identity $1$.

Formally, a field is a set $F$ equipped with an addition operation and a multiplication operation such that the following properties hold:

• addition is commutative, viz., $a+b=b+a$ for all $a,b \in F$;
• multiplication is commutative, viz., $ab=ba$ for all $a,b \in F$;
• addition is associative, viz., $a+(b+c) = (a+b)+c$ for all $a,b,c \in F$;
• multiplication is associative, viz., $a(bc) = (ab)c$ for all $a,b,c \in F$;
• addition and multiplication are distributive, viz., $a(b+c) = ab+ac$ and $(a+b)c = ac + bc$ for all $a,b,c \in F$;
• an additive identity $0_F$ exists, so that $a + 0_F = 0_F + a = a$ for all $a \in F$;
• a multiplicative identity $1_F$ exists, so that $a 1_F = 1_F a = a$ for all $a \in F$;
• addition is invertible, viz., each $a \in F$ admits $-a \in F$ such that $a + (-a) = (-a) + a = 0_F$;
• multiplication is invertible, viz., each $a \in F \smallsetminus \{0_F\}$ admits $a^{-1} \in F \smallsetminus \{0_F\}$ such that $aa^{-1} = a^{-1}a = 1_F$.

The set of real numbers $\mathbb{R}$ with usual addition and multiplication is a field. Similarly, the set of complex numbers $\mathbb{C}$ and the set of rational numbers $\mathbb{Q}$ with usual addition and multiplication are also fields.

A field $k_1$ is called a subfield of another field $k_2$ if $k_1 \subseteq k_2$ and if the arithmetic operations of $k_1$ and $k_2$ agree on $k_1$. For example, $\mathbb{Q}$ is a subfield of $\mathbb{R}$, and $\mathbb{R}$ is a subfield of $\mathbb{C}$.

We say that a field $k_2$ is a field extension of another field $k_1$ if $k_1$ is a subfield of $k_2$. For example, $\mathbb{C}$ is a field extension of both $\mathbb{R}$ and $\mathbb{Q}$.

We often speak of structure-preserving mappings in abstract algebra. A field homomorphism from a field $k_1$ to another field $k_2$ is a function $\varphi:k_1 \to k_2$ such that

• $\varphi(0) = 0$,
• $\varphi(1) = 1$,
• $\varphi(x+y) = \varphi(x) + \varphi(y)$ for all $x,y \in k_1$, and
• $\varphi(xy) = \varphi(x)\varphi(y)$ for all $x,y \in k_1$.

In other words, $\varphi$ is a field homomorphism if $\varphi$ preserves the algebraic structure of a field.

A field homomorphism $\varphi:k_1 \to k_2$ is said to be a field isomorphism if $\varphi$ is an invertible function. In this case, the inverse function $\varphi^{-1}:k_2 \to k_1$ is also a field isomorphism. Two fields $k_1$ and $k_2$ admitting a field isomorphism $\varphi:k_1 \to k_2$ are said to be isomorphic. Any field-theoretic property of a field is shared by its isomorphic fields, and so we often regard isomorphic fields as essentially the same.

### 1.2. Field characteristic

Given an element $a$ of a field $k$, we often write $n \cdot a$ as a shorthand for the $n$-fold addition $a + a + \cdots + a$. A field $k$ is of characteristic $p$ in case $p$ is the smallest positive integer such that $p \cdot 1 = 0$ in $k$. If no such positive integer exists, then we say that $k$ is of characteristic zero. $\mathbb{Q}$, $\mathbb{R}$, and $\mathbb{C}$ are fields of characteristic zero.

Let us now construct some fields of finite characteristic. Let $\mathbb{Z}/n\mathbb{Z}$ denote the set of integers equipped with addition and multiplication modulo $n$. Since

$$x + kn \equiv x \pmod{n}$$

for all choices of $k \in \mathbb{Z}$, we see that $\mathbb{Z}/n\mathbb{Z}$ only contains $n$ unique elements (identified modulo $n$).

Now, for each $x \in \mathbb{Z}/n\mathbb{Z}$, the Euclidean algorithm furnishes two integers $m_1$ and $m_2$ such that

$$m_1 x + m_2 n = \operatorname{gcd}(x,n),$$

the greatest common divisor of $x$ and $n$. If $n$ is a prime number and $x \not\equiv 0$, then $\operatorname{gcd}(x,n) = 1$, and so we see that

$$m_1 x = 1 - m_2 n \equiv 1$$

modulo $n$. It follows that $\mathbb{Z}/n\mathbb{Z}$ is a field if $n$ is prime.
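The argument above is effectively an algorithm: the extended Euclidean algorithm produces the coefficient $m_1$, which is the inverse of $x$ modulo $n$. The following Python sketch (an illustration, not part of the formal development) makes this concrete:

```python
# Inverses in Z/nZ via the extended Euclidean algorithm.
def extended_gcd(a, b):
    """Return (g, m1, m2) with m1*a + m2*b == g == gcd(a, b)."""
    if b == 0:
        return (a, 1, 0)
    g, x, y = extended_gcd(b, a % b)
    return (g, y, x - (a // b) * y)

def inverse_mod(x, n):
    """Multiplicative inverse of x modulo n; it exists iff gcd(x, n) == 1."""
    g, m1, _ = extended_gcd(x, n)
    if g != 1:
        raise ValueError(f"{x} has no inverse modulo {n}")
    return m1 % n

# In Z/7Z every nonzero element is invertible, so Z/7Z is a field:
assert all(x * inverse_mod(x, 7) % 7 == 1 for x in range(1, 7))
# In Z/6Z the element 2 has no inverse (gcd(2, 6) = 2), so Z/6Z is not a field.
```

Running the same loop with a composite modulus raises an error at the first non-invertible element, mirroring the proof below that $\mathbb{Z}/n\mathbb{Z}$ fails to be a field for composite $n$.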

We observe that $\mathbb{Z}/p\mathbb{Z}$ for a prime $p$ is of characteristic $p$. Indeed, $k \cdot 1 = k$, which does not equal $0$ modulo $p$ for $1 \leq k \leq p-1$. Moreover, $p \cdot 1 = 0$ modulo $p$.

Let us now show that $\mathbb{Z}/n\mathbb{Z}$ is a field only if $n$ is prime.

Suppose that $n$ is not prime. If $n = 1$, then $\mathbb{Z}/n\mathbb{Z}$ consists of one element and is thus not a field. If $n \geq 2$, then $n$ is composite, and so we can find a prime number $p$ strictly less than $n$ that divides $n$. We can then find an integer $m \geq 2$ such that $n = pm$.

If $\bar{p}\bar{k} = \bar{1}$ for some $k \in \mathbb{Z}$, then $\bar{p}\bar{k} - \bar{1} = \bar{0}$, and so

$$pk - 1 = nk'$$

for some $k' \in \mathbb{Z}$. Since $nk' = pmk'$, we see that

$$p(k - mk') = 1.$$

Since $p \geq 2$, it follows that $k - mk' = 1/p$, which contradicts the fact that $k-mk'$ is an integer. We thus conclude that $\bar{p}$ has no multiplicative inverse in $\mathbb{Z}/n\mathbb{Z}$, and $\mathbb{Z}/n\mathbb{Z}$ fails to be a field.

Generalizing this phenomenon, we now show that every field of positive characteristic must be of prime characteristic.

Let $n$ be the characteristic of a field $k$. We suppose for a contradiction that $n = ab$ for some integers $a,b \geq 2$. Since $a < n$ and $b < n$, the minimality of $n$ implies that $a \cdot 1 \neq 0$ and $b \cdot 1 \neq 0$ in $k$. We then have

$$(a \cdot 1)(b \cdot 1) = (ab) \cdot 1 = n \cdot 1 = 0.$$

Now, $k$ is a field, and so both $(a \cdot 1)$ and $(b \cdot 1)$ have multiplicative inverses. It follows that

$$1 = (a \cdot 1)^{-1}\bigl((a \cdot 1)(b \cdot 1)\bigr)(b \cdot 1)^{-1} = (a \cdot 1)^{-1} \cdot 0 \cdot (b \cdot 1)^{-1} = 0,$$

which is evidently absurd. It follows that $n$ must be prime.

We shall build on the above results to classify all finite fields in Section 1.11.

### 1.3. Polynomial rings over a field

The polynomial ring over a field $k$ is the set $k[X]$ of polynomials

$$P(X) = a_n X^n + a_{n-1} X^{n-1} + \cdots + a_1 X + a_0$$

with coefficients $a_n,a_{n-1},\ldots,a_0$ from $k$, equipped with the usual polynomial addition and multiplication. We remark that $k[X]$ cannot, in general, be a field. To see this, we suppose for a contradiction that the first-degree monomial $X$ has a multiplicative inverse $P(X)$, i.e.,

$$X P(X) = 1.$$

Substituting $X = 0$, we obtain $0 = 1$, which is absurd. It follows that $X$ does not have a multiplicative inverse.

Many properties of polynomials from elementary mathematics generalize readily to polynomials over a field. For one, it is possible to perform polynomial long division. In this context, dividing $P(X) \in k[X]$ by another polynomial $Q(X) \in k[X]$ with $\operatorname{deg} P \geq \operatorname{deg} Q$ results in the identity

$$P(X) = S(X) Q(X) + R(X),$$

where $\operatorname{deg} R < \operatorname{deg} Q$. This condition implies that $\operatorname{deg} S = \operatorname{deg} P - \operatorname{deg} Q$.

A consequence of polynomial long division is the polynomial Euclidean algorithm, which produces, for each pair of polynomials $P,Q \in k[X]$, a pair of polynomials $S,T \in k[X]$ such that

$$S(X) P(X) + T(X) Q(X) = \operatorname{gcd}(P,Q)(X).$$

Here $\operatorname{gcd}(P,Q)$ is the polynomial of highest degree among the polynomials with leading coefficient 1 that divide both $P$ and $Q$.
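For concreteness, here is a small Python sketch of polynomial long division and the resulting gcd computation over $\mathbb{F}_5$. The coefficient-list representation (lowest degree first) is chosen purely for illustration:

```python
# Polynomial long division and the Euclidean algorithm in F_5[X].
# A polynomial is a list of coefficients in F_5, lowest degree first.
P_MOD = 5

def trim(f):
    """Drop trailing zero coefficients so the leading coefficient is nonzero."""
    while f and f[-1] % P_MOD == 0:
        f = f[:-1]
    return f

def poly_divmod(f, g):
    """Return (q, r) with f = q*g + r and deg r < deg g."""
    f, g = trim([c % P_MOD for c in f]), trim([c % P_MOD for c in g])
    q = [0] * max(len(f) - len(g) + 1, 1)
    inv_lead = pow(g[-1], P_MOD - 2, P_MOD)  # invert leading coeff (Fermat)
    r = f[:]
    while len(r) >= len(g) and trim(r):
        shift = len(r) - len(g)
        coeff = r[-1] * inv_lead % P_MOD
        q[shift] = coeff
        for i, c in enumerate(g):
            r[shift + i] = (r[shift + i] - coeff * c) % P_MOD
        r = trim(r)
    return trim(q), r

def poly_gcd(f, g):
    """Monic gcd of f and g in F_5[X], by repeated division with remainder."""
    while trim(g):
        f, g = g, poly_divmod(f, g)[1]
    f = trim(f)
    inv = pow(f[-1], P_MOD - 2, P_MOD)
    return [c * inv % P_MOD for c in f]

# gcd(X^2 - 1, X - 1) = X - 1 in F_5[X]; note -1 = 4 in F_5.
assert poly_gcd([4, 0, 1], [4, 1]) == [4, 1]
```

The only field operation the algorithm needs is division by the leading coefficient, which is exactly the invertibility property established in Section 1.2.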

Analogous to polynomial factorization from elementary mathematics, a polynomial $P \in k[X]$ is said to be reducible if there are polynomials $Q,R \in k[X]$ such that $Q \not\in k$, $R \not\in k$, and $P(X) = Q(X) R(X)$. A polynomial is said to be irreducible if there are no such polynomials. For example,

$$X^2 - 2 = (X + \sqrt{2})(X - \sqrt{2}),$$

and so $X^2 - 2$ is reducible in $\mathbb{R}[X]$. Nevertheless, neither $X+\sqrt{2}$ nor $X-\sqrt{2}$ is an element of $\mathbb{Q}[X]$, and so $X^2 - 2$ is irreducible in $\mathbb{Q}[X]$.

We can also generalize the concept of the root of a polynomial from elementary mathematics. Here, a root of a polynomial $P(X) \in k[X]$ is an element $a$ of $k$ such that $P(a) = 0$ in $k$. We now divide $P(X)$ by $X-a$ to obtain

$$P(X) = (X-a)Q(X) + R(X),$$

where $\operatorname{deg} R < \operatorname{deg}(X-a) = 1$. It follows that $R(X)$ must be a constant. Since

$$0 = P(a) = (a-a)Q(a) + R(a) = R(a),$$

it follows that $R(X) = 0$, whence

$$P(X) = (X-a)Q(X).$$

A consequence of the above factor theorem is that $P$ cannot have more roots than $\deg P$.

### 1.4. Algebraic closure

Recall that a root of a polynomial $P(X) \in k[X]$ is an element $a$ of the base field $k$ such that $P(a) = 0$. We note that not every polynomial has a root in its base field. Indeed, the roots of $X^2 - 2 \in \mathbb{Q}[X]$ are $\sqrt{2}$ and $-\sqrt{2}$, neither of which is in $\mathbb{Q}$. If every non-constant polynomial in $k[X]$ has a root in $k$, then the base field $k$ is said to be algebraically closed. For example, the complex field $\mathbb{C}$ is algebraically closed by the fundamental theorem of algebra.

Recall now that we were able to obtain the roots of $X^2 - 2$ by considering a larger base field, say, $\mathbb{R}$. In fact, given a field $k$, it is always possible to construct a field extension $k^{\mathrm{al}}$ of $k$ that is algebraically closed. We say that $k^{\mathrm{al}}$ is an algebraic closure of $k$. For example, $\mathbb{C}$ is an algebraic closure of $\mathbb{R}$.

A polynomial $P(X) \in k[X]$ is said to be separable if its roots in $k^{\mathrm{al}}$ are distinct. For example, $X^2+1$, an element of $\mathbb{R}[X]$, is a separable polynomial, as its roots are $i$ and $-i$.

A field $k$ is said to be perfect if every irreducible polynomial in $k[X]$ is separable. Examples of perfect fields include fields of characteristic zero, finite fields, and algebraically closed fields. The perfect field condition often serves as a simplifying assumption in field theory.

### 1.5. Abelian groups

Sometimes, we do not need to consider both multiplication and addition. For example, studying the arithmetic of a 12-hour clock only requires hour addition (addition modulo 12). While multiplication modulo 12 makes perfect sense, it does not have any meaning in the context of clock arithmetic.

For such a case, we make use of an Abelian group, which comes equipped only with the addition operation. Formally, an Abelian group is a set $A$ with an addition operation $+$ such that

• $a+b = b+a$ for all $a,b \in A$,
• $a+(b+c) = (a+b)+c$ for all $a,b,c \in A$,
• there exists an additive identity $0$ in $A$ such that $a+0 = 0+a = a$ for all $a \in A$, and
• for each $a \in A$, there exists an additive inverse $-a$ in $A$ such that $a+(-a) = (-a) + a = 0$.

$\mathbb{Q}$, $\mathbb{R}$, and $\mathbb{C}$ with the usual addition operation are Abelian groups. In addition, the set of integers $\mathbb{Z}$ with the usual addition operation is an Abelian group.

We say that an Abelian group $A$ is a subgroup of another Abelian group $B$ if $A \subseteq B$ and if the addition operations of $A$ and $B$ agree on $A$. For example,

$$\mathbb{Z} \leq \mathbb{Q} \leq \mathbb{R} \leq \mathbb{C},$$

where $\leq$ denotes the subgroup relation.

A group homomorphism from an abelian group $A$ to another abelian group $B$ is a function $\varphi:A \to B$ such that

• $\varphi(0) = 0$ and
• $\varphi(a+b) = \varphi(a) + \varphi(b)$ for all $a,b \in A$.

A group homomorphism $\varphi:A \to B$ is a group isomorphism if $\varphi$ is an invertible function. In this case, $\varphi^{-1}$ is also a group isomorphism, and we say that $A$ and $B$ are isomorphic as Abelian groups.

### 1.6. Quotient groups.

The kernel of a group homomorphism $\varphi:A \to B$ is the subgroup

$$\ker \varphi = \{a \in A : \varphi(a) = 0\}.$$

The kernel $\ker \varphi$ measures how close $\varphi$ is to an isomorphism. Indeed, $\varphi$ is one-to-one if and only if $\ker \varphi = \{0\}$. In general, we have that

$$\varphi(a) = \varphi(b) \quad \text{if and only if} \quad a - b \in \ker \varphi.$$

In other words, if $a$ and $b$ differ by an element of the kernel, then $a$ and $b$ are essentially the same as far as $\varphi$ is concerned.

In light of this observation, we define the quotient group $A/S$ of an Abelian group $A$ by its subgroup $S$ to be the set of cosets

$$a + S = \{a + s : s \in S\}, \qquad a \in A,$$

with the addition operation defined by the formula

$$(a+S) + (b+S) = (a+b) + S.$$

We note that the cosets of $S$ partition $A$ into equal-sized, non-intersecting pieces. Indeed, the size of each coset is precisely the size of $S$. Moreover, if there exists $p \in (a+S) \cap (b+S)$, then

$$p = a + x_1 = b + x_2$$

for some $x_1,x_2 \in S$. It follows that $a = b+(x_2 - x_1)$ and $b = a+(x_1 - x_2)$, whence $a+S = b + S$. We conclude that distinct cosets do not intersect.

Cosets allow us to establish a combinatorial restriction on the structure of Abelian groups. To this end, we define the order of an element $x$ of an Abelian group $A$ to be the smallest positive integer $n$ such that $n \cdot x = 0$, where $n \cdot x$ denotes the $n$-fold sum of $x$. We often write $\vert x \vert$ to denote the order of $x$.

We now fix $x \in A$ and define the subgroup of $A$ generated by $x$:

$$\langle x \rangle = \{n \cdot x : n \in \mathbb{Z}\}.$$

Here, $n \cdot x$ with $n < 0$ denotes the $\vert n \vert$-fold sum of $(-x)$, and $0 \cdot x = 0$. Evidently, $\vert \langle x \rangle \vert$ is the order of $x$.

Since the cosets of $\langle x \rangle$ partition $A$ into equal-sized, non-intersecting pieces, we see that

$$\vert A \vert = \vert A / \langle x \rangle \vert \cdot \vert \langle x \rangle \vert.$$

In particular, the order of $x$ must divide the cardinality of $A$. This fact is commonly known as Lagrange’s theorem.
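Lagrange’s theorem is easy to check computationally in a small example. The following Python sketch verifies that the order of every element of the additive group $\mathbb{Z}/12\mathbb{Z}$ divides 12:

```python
# Checking Lagrange's theorem in the additive group Z/12Z.
n = 12

def order(x):
    """Smallest k >= 1 with k*x == 0 in Z/nZ."""
    k, total = 1, x % n
    while total != 0:
        k, total = k + 1, (total + x) % n
    return k

orders = {x: order(x) for x in range(1, n)}
assert all(n % d == 0 for d in orders.values())  # every order divides |A| = 12

# For example, 8 generates the subgroup {8, 4, 0} of size 3, and 3 divides 12.
assert orders[8] == 3
```

Note that the orders realized here are exactly the divisors of 12, though Lagrange’s theorem itself only guarantees the divisibility, not the existence of elements of every divisor order.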

Since quotient groups collapse a subgroup into one element by declaring equivalence, we can do away with the kernel of a group homomorphism $\varphi:A \to B$ by considering its induced homomorphism $\bar{\varphi}:A/\ker \varphi \to B$, defined by the formula

$$\bar{\varphi}(a + \ker \varphi) = \varphi(a).$$

Indeed, observe that

$$\bar{\varphi}(a + \ker \varphi) = 0 \quad \text{if and only if} \quad a \in \ker \varphi, \quad \text{i.e.,} \quad a + \ker \varphi = \bar{0},$$

whence $\ker \bar{\varphi} = \{\bar{0}\}.$ It now follows that $A/\ker \varphi$ is isomorphic to

$$\operatorname{im} \varphi = \{\varphi(a) : a \in A\},$$

a subgroup of $B$. This fact is referred to as the first isomorphism theorem.

### 1.7. Cyclic groups

The first isomorphism theorem can be used to furnish a complete classification of a special type of groups, called cyclic groups. An abelian group $C$ is said to be cyclic if there exists an element $g$, called a generator of $C$, such that every element of $C$ can be written as an $n$-fold sum of $g$ for some $n \geq 1$. The standard example of a cyclic group is $\mathbb{Z}$, the group of integers.

Given a cyclic group $C$, we can construct a group homomorphism $\varphi:\mathbb{Z} \to C$ by setting

$$\varphi(n) = n \cdot g,$$

where $n \cdot g$ is the $n$-fold sum $g + g + \cdots + g$ of a generator $g$ of $C$. This mapping is necessarily surjective, as all elements of $C$ are of the form $n \cdot g$.

If $\varphi$ is one-to-one, then $\varphi$ is an isomorphism, and so

$$C \cong \mathbb{Z}.$$

If $\varphi$ fails to be injective, then the first isomorphism theorem implies that

$$C \cong \mathbb{Z}/\ker \varphi.$$

Now, every subgroup of $\mathbb{Z}$ must be of the form

$$n\mathbb{Z} = \{nk : k \in \mathbb{Z}\},$$

and so

$$C \cong \mathbb{Z}/n\mathbb{Z}$$

for some $n \geq 1$. We remark that $\mathbb{Z}/n\mathbb{Z}$ has precisely $n$ elements:

$$\bar{0},\ \bar{1},\ \ldots,\ \overline{n-1}.$$

It follows that every cyclic group is isomorphic to either $\mathbb{Z}$ or $\mathbb{Z}/n\mathbb{Z}$ for some $n \geq 1$.

We now establish a condition under which an Abelian group must be cyclic: if the size of an Abelian group $A$ is prime, then $A$ must be cyclic. To see this, we recall Lagrange’s theorem from Section 1.6, which states that the order of each element must divide the size of the group. If $\vert A \vert = p$ is prime, then every element of $A$ must be of order either $1$ or $p$. Since only the additive identity $0$ is of order 1, every other element of $A$ is of order $p$ and is therefore a generator of $A$.
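This criterion can be verified directly in a small case. The sketch below confirms that in $\mathbb{Z}/7\mathbb{Z}$, a group of prime order, every nonzero element generates the whole group:

```python
# In Z/pZ with p prime, every nonzero element is a generator.
p = 7
for g in range(1, p):
    generated = {k * g % p for k in range(p)}  # the subgroup <g>
    assert generated == set(range(p))          # <g> is all of Z/7Z
```

By contrast, running the same loop with a composite modulus such as 12 would find proper subgroups, e.g. $\langle 8 \rangle = \{0, 4, 8\}$.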

### 1.8. Cauchy’s Theorem on Abelian groups

Using the theory of cyclic groups, we can show that every finite Abelian group whose order is divisible by a prime number $p$ contains an element of order $p$. This result, known as Cauchy’s theorem, is to be understood as a partial converse to Lagrange’s theorem.

We proceed by induction on the order of the group. Let $A$ be an Abelian group of order $pm$. If $m=1$, then $A$ is a group of prime order, hence cyclic (see Section 1.7). It follows that $A$ contains an element of order $p$.

Fix $m > 1$ and suppose that every Abelian group of order $pn$ for each $n < m$ contains an element of order $p$. Let $a$ be a non-identity element of $A$. If the order of $a$ is divisible by $p$, then $k \cdot a$ with $k = \vert a \vert / p$ is of order $p$.

If not, then $A / \langle a \rangle$ is an Abelian group of order $pn$ for some $n < m$, whence it contains an element $h + \langle a \rangle$ of order $p$. We observe that

$$\vert h \vert \cdot (h + \langle a \rangle) = (\vert h \vert \cdot h) + \langle a \rangle = \langle a \rangle,$$

the identity of $A / \langle a \rangle$. Therefore, the order $p$ of $h + \langle a \rangle$ must divide $\vert h \vert$. The primality of $p$ then implies that $\vert h \vert = lp$ for some $l \geq 1$, whence $l \cdot h$ is an element of $A$ of order $p$. By induction, every finite Abelian group whose order is divisible by $p$ contains an element of order $p$.

This completes the proof of Cauchy’s theorem.
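As a computational sanity check of Cauchy’s theorem, the following sketch confirms that $\mathbb{Z}/4\mathbb{Z} \oplus \mathbb{Z}/3\mathbb{Z}$, a group of order 12, contains elements of order 2 and of order 3, the primes dividing 12:

```python
# Cauchy's theorem in A = Z/4Z (+) Z/3Z, a group of order 12.
from math import gcd

A = [(a, b) for a in range(4) for b in range(3)]

def order(x):
    """Order of x = (a, b): the lcm of the coordinatewise orders."""
    a, b = x
    oa = 4 // gcd(a, 4)          # order of a in Z/4Z
    ob = 3 // gcd(b, 3)          # order of b in Z/3Z
    return oa * ob // gcd(oa, ob)

for p in (2, 3):
    assert any(order(x) == p for x in A)  # an element of order p exists
```

For instance, $(2, 0)$ has order 2 and $(0, 1)$ has order 3, exactly as the theorem guarantees.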

### 1.9. Direct sums of Abelian groups

Complex groups can be built from simpler groups by piecing them together. This is the main idea of the direct sum of Abelian groups.

Given Abelian groups $A$ and $B$, their direct sum is the cartesian product

$$A \oplus B = \{(a,b) : a \in A,\ b \in B\}$$

with the addition operation defined coordinatewise:

$$(a_1,b_1) + (a_2,b_2) = (a_1+a_2,\ b_1+b_2).$$

We note that if $\vert A \vert = n$ and $\vert B \vert = m$, then $\vert A \oplus B\vert = nm$.

The main theorem about direct sums of Abelian groups is that every finite Abelian group is a direct sum of cyclic groups. Specifically, if $A$ is an Abelian group with $n$ elements, then

$$A \cong \mathbb{Z}/p_1\mathbb{Z} \oplus \cdots \oplus \mathbb{Z}/p_k\mathbb{Z},$$

where $p_1,\ldots,p_k$ are powers of prime numbers such that $p_1 \cdots p_k = n$. The proof of this fact is long and technical, and so we omit it.

We now establish a condition under which the direct sum of two cyclic groups is cyclic. The result is commonly referred to as the Chinese remainder theorem and states that

$$\mathbb{Z}/m\mathbb{Z} \oplus \mathbb{Z}/n\mathbb{Z} \cong \mathbb{Z}/mn\mathbb{Z}$$

if $m$ and $n$ are relatively prime. Indeed, if $g$ and $h$ are generators of $\mathbb{Z}/m\mathbb{Z}$ and $\mathbb{Z}/n\mathbb{Z}$, respectively, then $(g,h)$ must be of order $mn$ in $\mathbb{Z}/m\mathbb{Z} \oplus \mathbb{Z}/n\mathbb{Z}$, as $m$ and $n$ are relatively prime. This implies that $(g,h)$ is a generator of $\mathbb{Z}/m\mathbb{Z} \oplus \mathbb{Z}/n\mathbb{Z}$, which is then a cyclic group of order $mn$. Since we have seen in Section 1.7 that there is, up to an isomorphism, only one cyclic group of each order, the isomorphism result follows.
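The Chinese remainder theorem can be seen in miniature by iterating the element $(1,1)$. The sketch below checks that $(1,1)$ generates all of $\mathbb{Z}/3\mathbb{Z} \oplus \mathbb{Z}/4\mathbb{Z}$, since $\gcd(3,4) = 1$:

```python
# Z/3Z (+) Z/4Z is cyclic of order 12, generated by (1, 1).
elem, seen = (0, 0), set()
for _ in range(12):
    seen.add(elem)
    elem = ((elem[0] + 1) % 3, (elem[1] + 1) % 4)  # repeatedly add (1, 1)
assert len(seen) == 12  # (1, 1) visits all 12 elements before returning to (0, 0)
```

By contrast, $(1,1)$ in $\mathbb{Z}/2\mathbb{Z} \oplus \mathbb{Z}/4\mathbb{Z}$ has order $\operatorname{lcm}(2,4) = 4$, not $8$, so coprimality of the two orders is essential.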

### 1.10. Quotients of polynomial rings

The quotient structure on Abelian groups we have studied in Sections 1.6 through 1.9 can be adapted to study rings of polynomials, introduced in Section 1.3.

Let $k$ be a field. We note that $k[X]$, equipped just with polynomial addition, is an Abelian group. Which subgroups of the additive Abelian group $k[X]$ play nicely with the multiplicative structure of the polynomial ring $k[X]$?

We define the ideal of the polynomial ring $k[X]$ generated by a polynomial $P(X) \in k[X]$ to be the set

$$(P) = \{Q(X)P(X) : Q(X) \in k[X]\}.$$

We observe that $(P)$ is a subgroup of the additive Abelian group $k[X]$. Moreover, if $Q(X) \in (P)$ and $S(X) \in k[X]$, then the products $QS$ and $SQ$ also belong to $(P)$.

Analogous to the Abelian-group case, we define a coset of $(P)$ to be a set of the form

$$Q + (P) = \{Q(X) + R(X)P(X) : R(X) \in k[X]\}$$

for some $Q(X) \in k[X]$. We define addition of cosets by

$$(Q_1 + (P)) + (Q_2 + (P)) = (Q_1 + Q_2) + (P)$$

and multiplication of cosets by

$$(Q_1 + (P))(Q_2 + (P)) = Q_1 Q_2 + (P).$$

The quotient ring $k[X]/(P)$ is defined to be the set of all cosets of $(P)$, equipped with coset addition and coset multiplication.

Since $k[X]$ is an Abelian group and $(P)$ is a subgroup, we see that $k[X]/(P)$ is also an Abelian group. When is $k[X]/(P)$ a field? Answering this question boils down to checking when coset multiplication is invertible.

We recall from Section 1.3 that a polynomial $P(X) \in k[X]$ is irreducible if there are no non-constant polynomials $R(X), S(X) \in k[X]$ such that $P(X) = R(X) S(X)$. We shall show that $k[X] / (P)$ is a field if $P$ is irreducible.

Let $Q \in k[X] \smallsetminus (P)$; in other words, let $Q + (P)$ be a nonzero element of $k[X]/(P)$. The polynomial Euclidean algorithm, introduced in Section 1.3, furnishes polynomials $S,T \in k[X]$ such that

$$S(X)P(X) + T(X)Q(X) = \operatorname{gcd}(P,Q)(X).$$

Since $P$ is irreducible and $Q$ is not a multiple of $P$, we have $\operatorname{gcd}(P,Q) = 1$. It follows that

$$(T + (P))(Q + (P)) = TQ + (P) = 1 - SP + (P) = 1 + (P),$$

the multiplicative identity in $k[X]/(P)$. It follows that every nonzero element in $k[X]/(P)$ is invertible, whence $k[X]/(P)$ is a field.

### 1.11. Classification of finite fields

We are finally equipped to exhibit all finite fields, up to isomorphisms.

To start off, we show that every finite field must be of size $p^n$, where $p$ is a prime and $n \geq 1$.

We first remark that every finite field must have positive characteristic. Indeed, if a field $k$ is of characteristic 0, then $\{m \cdot 1: m \geq 1\}$ is an infinite subset of $k$, whence $k$ must be infinite.

We have established in Section 1.2 that every field of positive characteristic must be of prime characteristic. We therefore let $k$ be a finite field of characteristic $p$, a prime. Then

$$\langle 1 \rangle = \{0,\ 1,\ 2 \cdot 1,\ \ldots,\ (p-1) \cdot 1\}$$

is of size $p$. Now, if we consider $k$ as an Abelian group with respect to its addition operation, then $\langle 1 \rangle$ is a subgroup of size $p$. It follows from Lagrange’s theorem that the size of $k$ must be divisible by $p$.

We now suppose for a contradiction that another prime $q$ also divides the cardinality of $k$. Considering, once again, $k$ as an Abelian group with respect to its addition operation, we invoke Cauchy’s theorem to find an element $x \in k$ of order $q$. This means that $x \neq 0$, and that $q \cdot x = 0$. Since $k$ is of characteristic $p$, we also have that $p \cdot x = 0$.

The Euclidean algorithm furnishes two integers $m_1$ and $m_2$ such that

$$m_1 p + m_2 q = \operatorname{gcd}(p,q),$$

the greatest common divisor of $p$ and $q$. Since $p$ and $q$ are distinct primes, $\operatorname{gcd}(p,q) = 1$.

It now follows that

$$x = 1 \cdot x = (m_1 p + m_2 q) \cdot x = m_1 (p \cdot x) + m_2 (q \cdot x) = 0,$$

which contradicts the fact that $x \neq 0$. We conclude that no other prime divides the cardinality of $k$; in other words, $\vert k \vert = p^n$ for some $n \geq 1$.

We now show that, up to an isomorphism, there is precisely one field of each size. For this reason, we write $\mathbb{F}_{p^n}$ to denote the field of size $p^n$.

To this end, we define the multiplicative group of a field $k$ to be the set

$$k^* = k \smallsetminus \{0\},$$

equipped with the multiplication operation. Every element of $k^*$ is invertible with respect to the multiplicative identity $1$. And, of course, it follows from the properties of the multiplication operation on $k$ that

• $ab = ba$ for all $a,b \in k^*$, and
• $a(bc) = (ab)c$ for all $a,b,c \in k^*$.

Therefore, $k^*$ is an Abelian group, despite the apparent lack of an addition operation.

We recall that every finite Abelian group is a direct sum of cyclic groups. Specifically, if $k$ is a field of size $p^n$, we have that

$$k^* \cong \mathbb{Z}/q_1\mathbb{Z} \oplus \cdots \oplus \mathbb{Z}/q_m\mathbb{Z},$$

where $q_1,\ldots,q_m$ are powers of prime numbers such that $q_1 \cdots q_m = p^n-1$. This decomposition implies that, if $q$ is the least common multiple of $q_1,\ldots,q_m$, then $x^q = 1$ for all $x \in k^*$.

By the factor theorem, the polynomial $X^q-1$ cannot have more than $q$ roots, and since every element of $k^*$ is a root, we have $q \geq p^n-1$. Now $q = \operatorname{lcm}(q_1,\ldots,q_m) \leq q_1 \cdots q_m = p^n-1$, and so $q \leq p^n-1$. It follows that $q = p^n-1$.

Now, for each $1 \leq i \leq m$, we let $x_i$ be a generator of $\mathbb{Z}/q_i\mathbb{Z}$, so that $\vert x_i \vert = q_i$. The $m$-tuple $(x_1,x_2,\ldots,x_m)$ is an element of $\mathbb{Z}/q_1\mathbb{Z} \oplus \cdots \oplus \mathbb{Z}/q_m\mathbb{Z}$ whose order is $q$. Observe that $q = p^n-1$ implies that $\mathbb{Z}/q_1\mathbb{Z} \oplus \cdots \oplus \mathbb{Z}/q_m\mathbb{Z}$ is a cyclic group. It follows that $k^*$ is a cyclic group of order $p^n-1$, whence

$$k^* \cong \mathbb{Z}/(p^n-1)\mathbb{Z}.$$
We have shown that every field of order $p^n$ admits only one possible multiplicative structure, from which it follows that any two fields of size $p^n$ are isomorphic.

It remains to construct $\mathbb{F}_{p^n}$ for each $p$ and every $n$. We have established in Section 1.2 that

$$\mathbb{F}_p = \mathbb{Z}/p\mathbb{Z}$$

is a field of size $p$.
We shall construct $\mathbb{F}_{p^n}$ from $\mathbb{F}_p$ by taking a quotient of the polynomial ring $\mathbb{F}_p[X]$, a method introduced in Section 1.10. We take for granted the existence of an irreducible polynomial $P$ of degree $n$. We omit the long, technical proof. Interested readers can consult, for example, a blog post over at everything2.com.

We claim that $\mathbb{F}_p[X]/(P)$ is a field of size $p^n$. We have shown in Section 1.10 that the irreducibility of $P$ implies that $\mathbb{F}_p[X]/(P)$ is a field. It suffices to show that $\mathbb{F}_p[X]/(P)$ has $p^n$ elements.

Let $S \in \mathbb{F}_p[X]$. If $\deg S \geq n$, then polynomial long division yields

$$S(X) = Q(X)P(X) + R(X),$$

where $\deg R < n$. Therefore,

$$S + (P) = R + (P),$$

whence it suffices to consider polynomials of degree less than $n$. Since there are $p$ possible field elements for each coefficient, we conclude that there are $p^n$ unique polynomials of degree less than $n$. It now follows that the size of $\mathbb{F}_p[X]/(P)$ is $p^n$, as was to be shown.
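The construction can be carried out by hand in a small case. The sketch below builds $\mathbb{F}_9 = \mathbb{F}_3[X]/(X^2+1)$; the polynomial $X^2+1$ is irreducible over $\mathbb{F}_3$ because it has no root there ($0^2+1=1$, $1^2+1=2$, $2^2+1=2$), and for degree 2 the absence of a root suffices:

```python
# Building F_9 = F_3[X]/(X^2 + 1): residues a + b*X with a, b in F_3,
# multiplied subject to the reduction X^2 = -1.
p = 3
elements = [(a, b) for a in range(p) for b in range(p)]  # (a, b) means a + b*X

def mul(u, v):
    # (a1 + b1 X)(a2 + b2 X) = a1a2 + (a1b2 + a2b1) X + b1b2 X^2;
    # replace X^2 by -1 and reduce coefficients modulo 3.
    a1, b1 = u
    a2, b2 = v
    return ((a1 * a2 - b1 * b2) % p, (a1 * b2 + a2 * b1) % p)

# Every nonzero residue has a multiplicative inverse, so this is a field
# of size 9, in accordance with the classification.
for u in elements:
    if u != (0, 0):
        assert any(mul(u, v) == (1, 0) for v in elements)
```

The arithmetic is exactly that of the Gaussian integers reduced modulo 3, which is why the multiplication rule looks like complex multiplication.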

## 2. Classical Cryptography

### 2.1. Perfect secrecy and the one-time pad

Cryptography is the science of designing systems that can withstand malicious attempts to abuse them. Every cryptographic scenario can be illustrated by the story of security’s inseparable couple, Alice and Bob: Alice and Bob want to send messages to each other, while deterring various unwanted interlocutors, eavesdroppers, and tamperers from participating.

In the simplest model, Alice sends Bob a secret message, and Eve the eavesdropper attempts to decode Alice’s message. Alice’s goal is to encrypt the message in such a way that Bob can decrypt but Eve cannot. The formal study of such communication protocols must begin with the construction of a cryptographic model, which we can then analyze to determine its functionality and security.

The formalization of the Alice–Eve–Bob scenario consists of the following:

• A set $\mathcal{M}$ of messages, and another set $\mathcal{C}$ of ciphertexts;
• Alice has an encryption algorithm $\operatorname{Enc}(m)$ that takes $m \in \mathcal{M}$ and outputs $c \in \mathcal{C}$;
• Bob has a decryption algorithm $\operatorname{Dec}(c)$ that takes $c \in \mathcal{C}$ and outputs $m \in \mathcal{M}$;
• Eve has her own decryption algorithm $\mathcal{A}(c)$ that takes $c \in \mathcal{C}$ and outputs $m \in \mathcal{M}$.

In order for the above model to be functional, we expect to have $\operatorname{Dec}(\operatorname{Enc}(m)) = m$ for all $m \in \mathcal{M}$. Security would mean that $\mathcal{A}(\operatorname{Enc}(m)) \neq m$ for all $m \in \mathcal{M}$, or, at least, for a large portion of $m \in \mathcal{M}$.

This naïve definition of functionality and security quickly turns out to be hopeless. History has shown that one party can often find out which encryption method the other party is using through espionage and other tactics. For Alice to guarantee the security of her communication with Bob, it is imperative for her to use an encryption method that produces messages that cannot be decrypted even when the encryption algorithm is leaked to Eve.

To this end, we construct an encryption method with a secret key, without which encrypted messages cannot be decrypted with ease. We update the above cryptographic model accordingly:

• We now have a set $\mathcal{M}$ of messages, a set $\mathcal{C}$ of ciphertexts, and a set $\mathcal{K}$ of keys.
• There is a key-generation algorithm $\operatorname{Gen}$ that outputs an element of $\mathcal{K}$.
• Alice is equipped with an encryption algorithm $\operatorname{Enc}(k,m) = \operatorname{Enc}_k(m)$ that takes $(k,m) \in \mathcal{K} \times \mathcal{M}$ and outputs $c \in \mathcal{C}$.
• Bob is equipped with a decryption algorithm $\operatorname{Dec}(k,c) = \operatorname{Dec}_k(c) = m$ that takes $(k,c) \in \mathcal{K} \times \mathcal{C}$ and outputs $m \in \mathcal{M}$.
• Eve is equipped with another decryption algorithm $\mathcal{A}(k,c)$ that takes $(k,c) \in \mathcal{K} \times \mathcal{C}$ and outputs $m \in \mathcal{M}$.

In this context, it is reasonable to say that the model is functional if

$$\operatorname{Dec}_k(\operatorname{Enc}_k(m)) = m$$

for all $(k,m) \in \mathcal{K} \times \mathcal{M}$. In other words, Bob should be able to decrypt any message that Alice encrypted, so long as both of them use the same key.

Is this new model secure? We should hope that Eve cannot, in general, guess the key that Alice and Bob choose to use. Therefore, it is reasonable to assume that $\operatorname{Gen}$ is a randomized algorithm.

Moreover, if Eve fails to obtain the correct key, then she should not be able to recover the original message. For this, a typical ciphertext must not carry any inherent meaning on its own, thereby deterring Eve from deciphering it without a key.

We formalize these observations as follows:

Definition 2.1.1 (Shannon, 1949). A cryptographic system $(\operatorname{Gen}, \operatorname{Enc}, \operatorname{Dec})$ is perfectly secret if, for each probability distribution $\mathcal{D}$ over $\mathcal{M}$, we have the identity

$$\Pr[m = \bar{m} \mid \operatorname{Enc}_k(m) = \bar{c}] = \Pr[m = \bar{m}]$$

for all choices of $(\bar{m},\bar{c}) \in \mathcal{M} \times \mathcal{C}$, where $m$ is drawn from $\mathcal{D}$ and $k$ from $\operatorname{Gen}$. In other words, the probability of recovering a fixed message $\bar{m}$ does not depend on the choice of the ciphertext $\bar{c}$.

An implementation of Shannon’s perfect secrecy model is the one-time pad algorithm. To state the algorithm, we recall from Section 1.11 that $\mathbb{F}_2 = \{0,1\}$ denotes the finite field of size 2.

Theorem 2.1.2 (Shannon one-time pad, 1949). Fix $n \in \mathbb{N}$ and let $\mathcal{M} = \mathcal{C} = \mathcal{K} = \mathbb{F}_2^n$, the $n$-fold cartesian product of $\mathbb{F}_2$. We define $\operatorname{Gen}$ to be the algorithm that chooses $k$ uniformly from $\mathcal{K}$. Let $\oplus$ denote the coordinatewise addition on $\mathbb{F}_2^n$,

$$(x_1,\ldots,x_n) \oplus (y_1,\ldots,y_n) = (x_1+y_1,\ldots,x_n+y_n),$$

and define

$$\operatorname{Enc}_k(m) = m \oplus k \quad \text{and} \quad \operatorname{Dec}_k(c) = c \oplus k$$

for all $(m,c,k) \in \mathcal{M} \times \mathcal{C} \times \mathcal{K}$. The resulting system $(\operatorname{Gen}, \operatorname{Enc}, \operatorname{Dec})$, called the one-time pad, is a perfectly secret cryptographic system.
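The theorem translates almost line by line into code. The following Python sketch realizes $\operatorname{Gen}$, $\operatorname{Enc}$, and $\operatorname{Dec}$ over $\mathbb{F}_2^n$; addition in $\mathbb{F}_2$ is addition modulo 2, i.e., XOR:

```python
# A minimal one-time pad over F_2^n: Gen draws a uniform key, and
# Enc/Dec are both coordinatewise addition modulo 2 (XOR).
import secrets

def gen(n):
    """Gen: a uniformly random key in F_2^n."""
    return [secrets.randbelow(2) for _ in range(n)]

def xor(u, v):
    """Coordinatewise addition in F_2^n."""
    return [(a + b) % 2 for a, b in zip(u, v)]

enc = dec = xor  # Enc_k(m) = m (+) k and Dec_k(c) = c (+) k

m = [1, 0, 1, 1, 0, 0, 1, 0]
k = gen(len(m))
c = enc(k, m)
assert dec(k, c) == m  # Dec_k(Enc_k(m)) = m, so the scheme is functional
```

Note that `secrets` is used rather than `random` because the security of the pad rests entirely on the key being drawn uniformly from a cryptographically strong source.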

The one-time pad algorithm (OTP) is one of the earliest known examples of encryption methods with a secret key. Miller was the first person to describe OTP formally (Miller 1882). Shannon, on the other hand, was the first to prove formally the security of OTP (Shannon 1949).

OTP is an example of a symmetric-key encryption scheme, as the same key is used both for the encryption process and the decryption process. The above theorem shows that OTP is essentially unbreakable, but OTP is not without problems. We first note that the key $k$ must be as large as the message $m$ it encrypts. In fact, this requirement cannot be relaxed:

Theorem 2.1.3 (Shannon, 1949). In every perfectly secret cryptographic system, $\vert \mathcal{K} \vert \geq \vert \mathcal{M} \vert$.

Worse, a key can never be recycled. Indeed, if $c_1 = m_1 \oplus k$ and $c_2 = m_2 \oplus k$, then

$$c_1 \oplus c_2 = (m_1 \oplus k) \oplus (m_2 \oplus k) = m_1 \oplus m_2.$$

From $m_1 \oplus m_2$, the individual messages can be obtained with reasonable certainty. A historical example is the Venona project:

One-time pads used properly only once are unbreakable; however, the KGB’s cryptographic material manufacturing center in the Soviet Union apparently reused some of the pages from one-time pads. This provided Arlington Hall with an opening. Very few of the 1942 KGB messages could be solved because there was very little duplication of one-time pad pages in those messages. The situation was more favorable in 1943, even more so in 1944, and the success rate improved accordingly. (Benson 2011)
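The key-reuse failure is easy to reproduce. In the following Python sketch (all values illustrative), the shared key cancels, leaving the XOR of the two plaintexts:

```python
def xor(a, b):
    """Coordinatewise addition on F_2^n (bitwise XOR)."""
    return [x ^ y for x, y in zip(a, b)]

m1 = [0, 1, 1, 0]
m2 = [1, 1, 0, 0]
k  = [1, 0, 1, 1]  # the same key, reused for both messages

c1, c2 = xor(m1, k), xor(m2, k)

# The key cancels: c1 XOR c2 = (m1 XOR k) XOR (m2 XOR k) = m1 XOR m2.
assert xor(c1, c2) == xor(m1, m2)
```

An eavesdropper holding $c_1$ and $c_2$ thus learns $m_1 \oplus m_2$ without ever touching the key.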

In short, it is hopeless to aim for perfect secrecy. We shall have to settle for something less while maintaining a reasonable level of security, which is the underlying theme of cryptography.

We conclude the section with proofs of the above theorems. To this end, we shall make use of a technical lemma.

Lemma 2.1.4. The perfect secrecy condition holds if and only if

$$\Pr[\operatorname{Enc}_k(m_0) = \bar{c}] = \Pr[\operatorname{Enc}_k(m_1) = \bar{c}]$$

for all choices of $m_0,m_1 \in \mathcal{M}$ and $\bar{c} \in \mathcal{C}$. In other words, the probabilities of obtaining the same ciphertext $\bar{c}$ from two different messages $m_0$ and $m_1$ are the same.

We defer the proof of the lemma and establish the theorems.

For OTP,

$$\operatorname{Dec}_k(\operatorname{Enc}_k(m)) = (m \oplus k) \oplus k = m \oplus (k \oplus k) = m$$

for all choices of $(m,k) \in \mathcal{M} \times \mathcal{K}$, whence OTP is functional.

To check that OTP is secure, we observe that, for an arbitrary choice of a probability distribution $\mathcal{D}$ over $\mathcal{M}$,

$$\Pr[\operatorname{Enc}_k(\bar{m}) = \bar{c}] = \Pr[\bar{m} \oplus k = \bar{c}] = \Pr[k = \bar{m} \oplus \bar{c}] = \frac{1}{2^n}$$

regardless of the choice of $(\bar{m},\bar{c}) \in \mathcal{M} \times \mathcal{C}$. Therefore,

$$\Pr[\operatorname{Enc}_k(m_0) = \bar{c}] = \Pr[\operatorname{Enc}_k(m_1) = \bar{c}]$$

for all choices of $m_0,m_1 \in \mathcal{M}$ and $\bar{c} \in \mathcal{C}$. Perfect secrecy of OTP now follows from the lemma.

To show that any perfectly secret cryptographic system must have keys at least as large as the message, we suppose that $\vert \mathcal{K} \vert < \vert \mathcal{M} \vert$. We fix $(m_0,k_0) \in \mathcal{M} \times \mathcal{K}$, set $c_0 = \operatorname{Enc}_{k_0}(m_0)$ and consider the set

$$\mathcal{N} = \{\operatorname{Dec}_k(c_0) : k \in \mathcal{K}\}.$$

Since

$$\vert \mathcal{N} \vert \leq \vert \mathcal{K} \vert < \vert \mathcal{M} \vert,$$

we can find $m_1 \in \mathcal{M} \smallsetminus \mathcal{N}$. Now,

$$\Pr[\operatorname{Enc}_k(m_0) = c_0] > 0,$$

but

$$\Pr[\operatorname{Enc}_k(m_1) = c_0] = 0,$$

and so it follows from the lemma that the cryptographic system in question is not perfectly secret.

Let us return to the proof of the technical lemma, which we restate below for convenience.

Lemma 2.1.4. The perfect secrecy condition holds if and only if

$$\Pr[\operatorname{Enc}_k(m_0) = \bar{c}] = \Pr[\operatorname{Enc}_k(m_1) = \bar{c}]$$

for all choices of $m_0,m_1 \in \mathcal{M}$ and $\bar{c} \in \mathcal{C}$. In other words, the probabilities of obtaining the same ciphertext $\bar{c}$ from two different messages $m_0$ and $m_1$ are the same.

For any choice of $(\bar{m},\bar{c}) \in \mathcal{M} \times \mathcal{C}$, we have that

$$\Pr[m = \bar{m} \mid c = \bar{c}] = \frac{\Pr[c = \bar{c} \mid m = \bar{m}] \, \Pr[m = \bar{m}]}{\Pr[c = \bar{c}]}.$$

Therefore,

$$\Pr[m = \bar{m} \mid c = \bar{c}] = \Pr[m = \bar{m}]$$

if and only if

$$\Pr[c = \bar{c} \mid m = \bar{m}] = \Pr[c = \bar{c}].$$

To this end, we observe that

$$\Pr[c = \bar{c} \mid m = \bar{m}] = \Pr[\operatorname{Enc}_k(\bar{m}) = \bar{c}]$$

and that

$$\Pr[c = \bar{c}] = \sum_{m_0 \in \mathcal{M}} \Pr[\operatorname{Enc}_k(m_0) = \bar{c}] \, \Pr[m = m_0].$$

Since probabilities are always nonnegative,

$$\Pr[c = \bar{c} \mid m = \bar{m}] = \Pr[c = \bar{c}]$$

if and only if

$$\Pr[\operatorname{Enc}_k(\bar{m}) = \bar{c}] = \Pr[\operatorname{Enc}_k(m_0) = \bar{c}]$$

regardless of the choice of $m_0 \in \mathcal{M}$. This holds if and only if

$$\Pr[\operatorname{Enc}_k(m_0) = \bar{c}] = \Pr[\operatorname{Enc}_k(m_1) = \bar{c}]$$

for all choices of $m_0, m_1 \in \mathcal{M}$ and $\bar{c} \in \mathcal{C}$, and the proof of the lemma is now complete.

### 2.2. Computational complexity

We have seen in Section 2.1 that perfect secrecy is exceedingly difficult to achieve. Nevertheless, for practical purposes it suffices to prevent Eve the eavesdropper from efficiently decrypting the message. To formalize this notion, we must make sense of what it means for Eve to be computationally bounded.

Computational boundedness is, in essence, a restriction on the ability to carry out computations that take too long to run. What, then, is computation? We turn to Alan Turing, the father of theoretical computer science (Turing 1950):

The idea behind digital computers may be explained by saying that these machines are intended to carry out any operations which could be done by a human computer. The human computer is supposed to be following fixed rules; he has no authority to deviate from them in any detail. We may suppose that these rules are supplied in a book, which is altered whenever he is put on to a new job. He has also an unlimited supply of paper on which he does his calculations. He may also do his multiplications and additions on a “desk machine,” but this is not important.

If we use the above explanation as a definition we shall be in danger of circularity of argument. We avoid this by giving an outline of the means by which the desired effect is achieved. A digital computer can usually be regarded as consisting of three parts:

• Store
• Executive unit
• Control

The store is a store of information, and corresponds to the human computer’s paper, whether this is the paper on which he does his calculations or that on which his book of rules is printed. In so far as the human computer does calculations in his head a part of the store will correspond to his memory.

The executive unit is the part which carries out the various individual operations involved in a calculation. What these individual operations are will vary from machine to machine. Usually fairly lengthy operations can be done such as “Multiply 3540675445 by 7076345687” but in some machines only very simple ones such as “Write down 0” are possible.

We have mentioned that the “book of rules” supplied to the computer is replaced in the machine by a part of the store. It is then called the “table of instructions.” It is the duty of the control to see that these instructions are obeyed correctly and in the right order. The control is so constructed that this necessarily happens.

In short, computation can be modeled with a theoretical machine consisting of three parts:

• infinitely-long tapes with discrete cells, one of which contains the input values and the rest provide read-write workspace areas,
• a state register that specifies the state of the machine at each step, and
• heads that can, in accordance with the state of the machine and the recorded symbols on the tapes, read and write symbols, as well as move the tapes left or right one cell at a time.

On such a model, computational boundedness is merely a restriction on how many times the tapes can be moved before a computational task is considered too difficult. It is, then, worthwhile to formalize the model into a precise mathematical construct:

Definition 2.2.1 (Turing, 1950). For a fixed $k \in \mathbb{N}$, a $k$-tape Turing machine is defined to be an ordered triple $M = (\Gamma, Q, \delta)$ consisting of the following:

• A set $\Gamma$ of symbols, called the alphabet of $M$. Typically, we assume that $\Gamma$ contains, at least, 0, 1, blank, and start.
• A set $Q$ of states for $M$. Typically, we assume that $Q$ contains, at least, start and halt.
• A transition function

$$\delta : Q \times \Gamma^k \to Q \times \Gamma^{k-1} \times \{\mathsf{L},\mathsf{S},\mathsf{R}\}^k$$

that, based on the current state of $M$ and the $k$ symbols at the current locations of the heads, produces output to be recorded on the $k-1$ workspace tapes and moving instructions for the $k$ heads.

We are, of course, interested in computational processes that terminate in a finite number of steps.

Definition 2.2.2. A $k$-tape Turing machine $M = (\Gamma,Q,\delta)$ is said to be a deterministic algorithm, or simply an algorithm, if, for each input $x = (\texttt{start},\gamma_1,\ldots,\gamma_k) \in Q \times \Gamma^k$ that starts off $M$ at the start state, there exists a positive integer $n$ such that

$$(\underbrace{\delta \circ \delta \circ \cdots \circ \delta}_{n \text{ times}})(x)$$
produces the halt state. In other words, a deterministic algorithm always halts after following the instructions provided by the transition function finitely many times, regardless of the initial configuration of the input tapes.

The image to keep in mind is as follows: we put $k$ tapes into the Turing machine $M$ as the input, $M$ modifies the tapes until it hits the halt state, and the final configuration of the tapes is printed as the output.
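This picture can be made concrete with a small simulator. The Python sketch below is a one-tape simplification of the $k$-tape definition above; the dictionary encoding of the transition function and all names (`run_tm`, `flip`, the blank symbol `"_"`) are illustrative assumptions, not notation from the text:

```python
def run_tm(delta, tape, state="start", pos=0, max_steps=10_000):
    """Simulate a one-tape Turing machine.  delta maps
    (state, symbol) -> (new_state, written_symbol, move in {-1, 0, +1})."""
    tape = dict(enumerate(tape))  # sparse tape; absent cells read as blank "_"
    for _ in range(max_steps):
        if state == "halt":
            return [tape[i] for i in sorted(tape)]  # final tape configuration
        state, tape[pos], move = delta[(state, tape.get(pos, "_"))]
        pos += move
    raise RuntimeError("machine did not halt within max_steps")

# A machine that flips every bit, then halts at the first blank cell.
flip = {
    ("start", "0"): ("start", "1", +1),
    ("start", "1"): ("start", "0", +1),
    ("start", "_"): ("halt",  "_",  0),
}
out = run_tm(flip, list("0110"))
```

The `max_steps` guard reflects the discussion that follows: a machine need not halt at all, and bounding the number of steps is exactly the notion of computational boundedness being developed.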

We recall, however, that we are interested in computational efficiency, not mere computability. In order to formalize the notion of computational boundedness, we must work out what it means for an algorithm to have a certain running time.

It is typical to consider bit strings such as

$$0100111010$$
as representations of data. Therefore, it makes sense to be able to refer to the collection of all finite bit strings:

Definition 2.2.3. Let $\{0,1\}^n$ denote the $n$-fold cartesian product of the bit set $\{0,1\}$. We define

$$\{0,1\}^* = \bigcup_{n \in \mathbb{N}} \{0,1\}^n,$$

the set of all finite bit strings. Given a bit string $x \in \{0,1\}^*$, we define its length $\vert x \vert$ to be the unique $n \in \mathbb{N}$ such that

$$x \in \{0,1\}^n.$$
With bit strings as our representation of data, it makes sense to think of a computational task as a function on $\{0,1\}^*$, i.e., a process that outputs a unique bit string for each input bit string. This turns out to be a sufficient abstraction for defining the notion of computational efficiency.

Definition 2.2.4. Let $f:\{0,1\}^* \to \{0,1\}^*$ and $T:\mathbb{N} \to \mathbb{N}$. We say that a Turing machine $M$ computes $f$ in $T$-time if, for every $x \in \{0,1\}^*$, the Turing machine $M$ initialized to the start configuration on input $x$ halts with $f(x)$ as its output within $T(\vert x \vert)$ steps.

In other words, we can use a function $T$ to provide an upper bound on the running time of a Turing machine $M$ computing $f$. Computational tasks in real life often take longer on larger input data, and so it makes sense to have the $T$-time bound depend on the size $\vert x \vert$ of the input bit string.

In fact, it would make sense to have $T$ grow with its input size, at a rate sufficiently fast that the Turing machine $M$ is always given the time to read the input. Moreover, we would want $T$ itself to be efficiently computable as well, for otherwise we cannot make use of the information on computational boundedness with ease. We collect these desirable properties into the following definition:

Definition 2.2.5. A function $T:\mathbb{N} \to \mathbb{N}$ is time constructible if $T(n) \geq n$ for all $n \in \mathbb{N}$ and if there exists a Turing machine that computes the function $x \mapsto \operatorname{bin}(T(x))$, the binary representation of $T(x)$, in $T$-time.

Examples of time constructible functions include $n \log n$, $n^3$, and $2^n$.

Let us now define what it means for a function to be efficiently computable.

Definition 2.2.6. We define $\mathsf{poly}$ to be the set of all time-constructible functions $T:\mathbb{N} \to \mathbb{N}$ such that $T(n) = O(n^c)$ for some $c > 0$. A function $f:\{0,1\}^* \to \{0,1\}^*$ is said to be computable in polynomial time, or efficiently computable, if there exists a Turing machine $M$ and a function $T \in \mathsf{poly}$ such that $M$ computes $f$ in $T$-time.

We often write $\mathsf{poly}(n)$ to denote a fixed, unspecified element of $\mathsf{poly}$.

Examples of efficiently computable functions include the usual sorting algorithms, primality testing, Fast Fourier transform, and so on.

It is often useful to allow our model for computation to make use of randomness. For example, quicksort with a randomized pivot often performs a lot better than quicksort with a median-of-medians pivot, even though the latter has better worst-case runtime than the former. Some widely-used computational methods, such as the Monte Carlo methods, are always probabilistic in nature and do not have non-probabilistic analogues.
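The randomized-quicksort example mentioned above can be sketched in a few lines of Python (illustrative code, not from the text): the pivot is drawn uniformly at random, giving expected $O(n \log n)$ comparisons on every input.

```python
import random

def quicksort(xs):
    """Quicksort with a uniformly random pivot.  The random choice makes
    the expected number of comparisons O(n log n) on every input, whereas
    a fixed pivot rule has adversarial worst-case inputs."""
    if len(xs) <= 1:
        return xs
    pivot = random.choice(xs)
    less    = [x for x in xs if x < pivot]
    equal   = [x for x in xs if x == pivot]
    greater = [x for x in xs if x > pivot]
    return quicksort(less) + equal + quicksort(greater)

assert quicksort([3, 1, 4, 1, 5, 9, 2, 6]) == [1, 1, 2, 3, 4, 5, 6, 9]
```

Note that the output is the same regardless of the random choices made; only the running time is random, which is the situation Definition 2.2.8 below formalizes.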

In light of this, we define a probabilistic analogue of the Turing machine.

Definition 2.2.7 (de Leeuw–Moore–Shannon–Shapiro, 1970). A probabilistic Turing machine is an ordered quadruple $M = (\Gamma,Q,\delta_1,\delta_2)$ consisting of a set $\Gamma$ of symbols, a set $Q$ of states, and two transition functions $\delta_1$ and $\delta_2$: see Definition 2.2.1. Given an input, a probabilistic Turing machine $M$ is executed by applying, at each step, one of $\delta_1$ and $\delta_2$ chosen at random with equal probability.

A probabilistic Turing machine $M$ is said to be a probabilistic algorithm if, for each input

$$x = (\texttt{start},\gamma_1,\ldots,\gamma_k) \in Q \times \Gamma^k,$$

there exists a positive integer $n$ such that

$$(\delta_{i_n} \circ \delta_{i_{n-1}} \circ \cdots \circ \delta_{i_1})(x)$$

produces the halt state. Here, each $i_k$ is randomly chosen to be either 1 or 2.

The concept of computational efficiency for Turing machines can be carried over to the context of probabilistic Turing machines with minor modifications.

Definition 2.2.8. A probabilistic Turing machine $M$ computes $f:\{0,1\}^* \to \{0,1\}^*$ in $T$-time for some $T:\mathbb{N} \to \mathbb{N}$ if, for every choice of bit string $x \in \{0,1\}^*$, the probabilistic Turing machine $M$ initialized to the start state on input $x$ halts after at most $T(\vert x \vert)$ steps with $f(x)$ as the output, regardless of the random choices made within $M$.

Definition 2.2.9. A function $f:\{0,1\}^* \to \{0,1\}^*$ is said to be computable in probabilistic polynomial time if there exists a probabilistic Turing machine $M$ and a function $T \in \mathsf{poly}$ such that $M$ computes $f$ in $T$-time.

Regular Turing machines are sometimes called deterministic Turing machines to emphasize their difference with probabilistic Turing machines. Similarly, computability in polynomial time is often referred to as deterministic polynomial time.

### 2.3. One-way functions

With the language of computational complexity theory at hand, we now turn to a central question of cryptography: how do we create an encryption scheme that is difficult to revert without authorized access?

We begin tackling this question by first formalizing the notion of a process that is easy to carry out but difficult to revert. To this end, we introduce two preliminary definitions.

Definition 2.3.1. $U_n$ denotes a random variable distributed uniformly over $\{0,1\}^n$, i.e.,

$$\Pr[U_n = \alpha] = \frac{1}{2^n}$$

whenever $\alpha \in \{0,1\}^n$, and the probability equals zero otherwise.

Definition 2.3.2. $0^n$ refers to a bit string of length $n$, consisting entirely of 0. Similarly, $1^n$ refers to a bit string of length $n$, consisting entirely of 1.

We are now ready to give the definition of a one-way function.

Definition 2.3.3 (Diffie–Hellman, 1976). A function $f:\{0,1\}^* \to \{0,1\}^*$ is said to be one-way if the following conditions hold:

• $f$ is easy to compute, i.e., $f$ is computable in deterministic polynomial time.
• $f$ is difficult to invert, i.e., for each probabilistic polynomial-time algorithm $\mathcal{A}$ and every polynomial $p$,

$$\Pr\left[\mathcal{A}(f(U_n), 1^n) \in f^{-1}(f(U_n))\right] < \frac{1}{p(n)}$$

for all sufficiently large $n$.

Why is the auxiliary input $1^n$ needed? Without it, a function can be considered one-way by merely shrinking its input: if the image is very small, an inverting algorithm simply would not have enough time with respect to the size of its input—i.e., the shrunk output of the original function—to have good computational complexity.

The existence of a one-way function has not been proven. In fact, an existence proof would imply $\mathsf{P} \neq \mathsf{NP}$, settling the famous P versus NP problem. There are, however, plausible candidates for one-way functions, having withstood many attempts at producing efficient inverting algorithms.

The most famous example is the integer factorization problem, which is widely believed to be difficult. State-of-the-art factoring algorithms such as the general number field sieve run in subexponential time. In the language of one-way functions, the multiplication function

$$f(x,y) = xy$$

is conjectured to be a one-way function.
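The asymmetry can be illustrated with a Python sketch; the naive `trial_division_factor` below is illustrative only and stands in for serious factoring algorithms. Multiplying is a single fast operation, while inverting by trial division takes time exponential in the bit length of the product:

```python
def mult(x, y):
    """Easy direction: multiply two integers (polynomial time in bit length)."""
    return x * y

def trial_division_factor(n):
    """Naive inversion by trial division.  Its running time is roughly
    sqrt(n), i.e., exponential in the bit length of n -- a crude hint at
    why mult is conjectured to be one-way for large random primes."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d, n // d
        d += 1
    return n, 1  # n is prime (or 1)

n = mult(101, 103)  # the easy direction: instant
assert trial_division_factor(n) == (101, 103)
```

For 1024-bit factors, the easy direction still takes microseconds, while no known algorithm inverts it in polynomial time.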
