# Group Actions

## 1. Groups of Transformations and Permutations

$GL(2,\mathbb{R})$, the set of invertible 2-by-2 matrices with real entries, can be made into a group by considering the matrix multiplication operation as the binary operation on $GL(2,\mathbb{R})$. Matrix multiplication is associative; the identity matrix $% $ serves as the group identity; every invertible matrix is, well, invertible. In this manner, $GL(2,\mathbb{R})$ is a group of matrices.

Recall, however, that every matrix defines a linear transformation. In the case of $GL(2,\mathbb{R})$, every $M \in GL(2,\mathbb{R})$ defines a linear transformation $L_M:\mathbb{R}^2 \to \mathbb{R}^2$ given by the formula $L_M(v) = M \cdot v$, where $\cdot$ represents the matrix-vector multiplication. In other words, every element of the group $GL(2,\mathbb{R})$ transforms 2-vectors to some other 2-vectors. From this angle, $GL(2,\mathbb{R})$ is a group of transformations on $\mathbb{R}^2$.

Similarly, then, we can define the group $GL(n,\mathbb{F})$ of invertible $n$-by-$n$ matrices with entries in a field $\mathbb{F}$ and consider it a group of transformations on $\mathbb{F}^n$, the space of $n$-vectors with entries in $\mathbb{F}$. Generalizing even further, we define the general linear group on a vector space $V$ over a field $\mathbb{F}$, denoted by $GL(V)$, to be the group of all invertible linear transformations $T:V \to V$ with function composition as the group operation. Each element of $GL(V)$ takes vectors in $V$ and transforms them into other vectors in $V$. In other words, $GL(V)$ is a group of transformations on $V$.

But what is a “group of transformations”, really? Every group we have discussed above has one thing in common: each group $G$ comes with some set $X$ on which each element of $G$ can be thought of as a function $f:X \to X$. The most general group of this kind is the symmetric group on a nonempty set $X$, which is the collection $S_X$ of all bijections $f:X \to X$. We require the functions to be bijections, so that $S_X$ with respect to the function composition operation is a bona fide group. Note that the general linear group $GL(V)$ is a subgroup of $S_V$, the symmetric group on the vector space $V$.

Being a bijection, an arbitrary element $f$ of the symmetric group $S_X$ transforms each element of $X$ in a way that preserves $X$. Indeed, if we let $f$ act on the elements of $X$, then we end up with $X$, relabeled. In this sense, the elements of $S_X$ are called permutations on $X$, and subgroups of $S_X$ permutation groups on $X$.

## 2. Cayley’s Theorem; Basic Definitions and Examples

Let us now consider an abstract group $G$. Unlike $GL(V)$ or $S_X$, the group $G$ does not come equipped with an obvious set on which the elements of $G$ can act as permutations. What we do have is a binary operation on $G$, which takes two elements of $G$ and outputs one element of $G$. This is to say that, by fixing one of the two input parameters, the group operation can be thought of as a function on $G$. We formalize this observation as follows:

Theorem 2.1 (Cayley). Every group is isomorphic to a permutation group on $G$.

Proof. Given a $g \in G$, we define the left regular representation $L_g:G \to G$ by setting $L_g(x) = gx$ for each $x \in G$. We note that $L_g$ is a bijection, as $L_g^{-1}$ is the inverse of $L_g$.

Let us now define a left-multiplication map $L:G \to S_G$ by declaring $L(g) = L_g$ for each $g \in G$. We fix $g,h \in G$ and observe that

for all $x \in G$. From this, it follows that $L(gh) = L(g)L(h)$. Since $g$ and $h$ were arbitrary, we conclude that $L$ is a group homomorphism.

It remains to show that $L$ is injective, whence $G$ is isomorphic to $\operatorname{im} L$, a subgroup of $S_G$. To this end, we fix $g,h \in G$ and suppose that $L(g) = L(h)$. This, in particular, implies that

Since the inverse element of a fixed group element is unique, we conclude that $g = h$. This completes the proof of Cayley’s theorem. $\square$

We glean from Cayley’s theorem an important idea: we can make an abstract group $G$ act on a set $X$ by identifying $G$ with a permutation group on $X$.

Definition 2.2. Let $G$ be a group. A permutation representation of $G$ is a group homomorphism $\varphi:G \to S_X$ into the symmetric group $S_X$ on some set $X$. Given a nonempty set $X$, we say that $G$ acts on $X$ if there exists a permutation representation $\varphi:G \to S_X$; in this case, $X$ is referred to as a $G$-set. If, in addition, $\varphi:G \to S_X$ is injective, then we say that $G$ acts faithfully on $X$.

Given a permutation representation $\varphi:G \to S_X$, we often write

to denote $\varphi(g)(x)$. This is the left action notation, which we shall use throughout this post. Also prevalent in the literature is the right action notation, which employs

to denote $\varphi(g)(x)$. We will not use the right action notation in this post.

Here is a simple exercise to get you used to the left action notation.

Exercise 2.3. If $G$ acts on $X$, then the following holds:
(1) $1_G \cdot x = x$ for all $x \in X$.
(2) $g \cdot (h \cdot x) = (gh) \cdot x$ for all $g,h \in G$ and $x \in X$.
Conversely, if (1) and (2) holds, then the mapping $\varphi:G \to S_X$ given by the formula $\varphi(g)(x) = g \cdot x$ is a permutation representation.

We shall therefore use the left action notation and the permutation representation notation interchangeably. In construcitn a group action, we often define a map and check that it is a permutation representation, or define a dot operation and check that properties (1) and (2) hold. If a group action arises from restricting an already-extant functions $f$ on a smaller domain $D$, then we must also check that the range of $f|_D$ is a subset of $D$: see Example 2.10.

Let us now study several examples of group actions. There is no need to peruse all of these on the first read: take a look at a few, move on to Section 3, come back whenever certain examples are cited.

Example 2.4. Let $X$ be a nonempty set. The symmetric group $S_X$ acts faithfully on $X$ via the identity representation $\operatorname{id}:S_X \to S_X$ that sends each $\sigma \in \S_X$ to itself. If $G$ is a subgroup of $S_X$, then the canonical injection map $\iota_G:G \to S_X$ that sends each $g \in G$ to itself is an injective permutation representation of $G$ on $X$.  Therefore, $G$ acts faithfully on $X$.

Now, if $Y$ is a set that contains $X$ as a subset, then we can define an injective group homomorphism $\varphi:S_X \to S_Y$ by setting

In other words, we identify $S_X$ with the subgroup of permutations on $Y$ that permutes $X$ and fixes $Y \smallsetminus X$. It now follows that $S_X$ acts faithfully on $Y$. Moreover, $\varphi \circ \iota_G:G \to S_Y$ is an injective group homomorphism, whence $G$ acts faithfully on $Y$. We summarize the chain of permutation representations as follows:

Example 2.5. In general, if $G$ acts on $X$, then every subgroup $H$ of $G$ acts on $X$. To see this, we take a permutation representaiton $\varphi:G \to X$ and precompose it with the canonical injection map $\iota_H:H \to G$, obtaining another permutation representation $\varphi \circ \iota_H:H \to S_X$. Note that if the action of $G$ on $X$ is faithful, then so is induced action of $H$ on $X$.

Even more generally, if $G$ acts on $X$, and if $\varphi:H \to G$ is a group homomorphism from another group $H$ into $G$, then $H$ acts on $X$. $\square$

Example 2.6. The general linear group $GL(V)$ is a subgroup of the symmetric group $V$, and so $GL(V)$ acts faithfully on $V$. $\square$

Example 2.7 (Linear groups). We introduce a few important subgroups of $GL(n,\mathbb{F})$. Since $GL(n,\mathbb{F})$ is a subgroup of the symmetric group $S_{\mathbb{F}^n}$, Example 2.4 implies that all such subgroups act faithfully on $\mathbb{F}^n$. The special linear group is the subgroup $SL(n,\mathbb{F})$ consisting of matrices of determinant 1. The orthogonal group is the subgroup $O(n,\mathbb{F})$ consisting of matrices $A \in GL(n,\mathbb{F})$ such that $AA^t = A^t A = I_n$, where $t$ denotes the matrix transpose operation and $I_n$ the $n$-by-$n$ identity matrix. The special orthogonal group is the subgroup $SO(n,\mathbb{F}) = O(n,\mathbb{F}) \cap SL(n,\mathbb{F})$.

If $\mathbb{F} = \mathbb{R}$, then we typically write $O(n)$ and $SO(n)$ to denote $O(n,\mathbb{R})$ and $SO(n,\mathbb{R})$, respectively. Both $O(n)$ and $SO(n)$ act faithfully on $\mathbb{R}^n$.

If $\mathbb{F} = \mathbb{C}$, then we define the unitary group $U(n)$ to be the group of matrices $A \in GL(n,\mathbb{C})$ such that $AA^* = A^*A = I_n$, where $*$ denotes the conjugate transpose operation. The special unitary group is the subgroup $SU(n) = U(n) \cap SL(n,\mathbb{C})$. Both $U(n)$ and $SU(n)$ act faithfully on $\mathbb{C}^n$. $\square$

Example 2.8 (Isometry groups, orthogonal groups, and symmetry groups). Let $\|x\|$ denote the usual Euclidean norm of $x \in \mathbb{R}^n$. An isometry on $\mathbb{R}^n$ is a function $f:\mathbb{R}^n \to \mathbb{R}^n$ such that $\|f(x) - f(y)\| = \|x-y\|$ for all $x,y \in \mathbb{R}^n$. We note that isometries are always injective. Indeed, if $f(x) = f(y)$ for some $x,y \in \mathbb{R}^n$, then $0 = \|f(x) - f(y)\| = \|x-y\|$, whence $x=y$.

The key fact is as follows; see, for example, Theorem 6.2.3 in M. Artin’s Algebra for a proof.

Theorem 2.8.1. If $f:\mathbb{R}^n \to \mathbb{R}^n$ is an isometry, then there exists an $M \in O(n)$ such that $f(x) = f(0) + Mx$ for all $x \in \mathbb{R}^n$.

Since the mapping $x \mapsto Mx$ is a bijection, we see that every isometry on $\mathbb{R}^n$ is an element of $S_{\mathbb{R}^n}$, the symmetric group on $\mathbb{R}^n$. It follows from Example 2.4 that any group of isometries on $\mathbb{R}^n$ acts faithfully on $\mathbb{R}^n$. Note also that Theorem 2.8.1 implies that $O(n)$ is, in fact, the group of all isometries $f$ on $\mathbb{R}^n$ such that $f(0) = 0$.

Let us now fix a subset $E$ of $\mathbb{R}^n$. An isometry $f:\mathbb{R}^n \to \mathbb{R}^n$ is said to be a symmetry of $E$ if $f(E) = E$. Since $f|_E:E \to E$ is a restriction of an injective function, it must be injective. The surjectivity is guaranteed by defintion, and so we conclude that a symmetry of $E$ is a permutation on $E$. Therefore, any group of symmetries of $E$ is a subgroup of $S_E$, whence by Example 2.4 it acts faithfully on $E$. In particular, the symmetry group of $E$, which we define below, acts faithfully on $E$:

Definition 2.8.2. Let $E \subseteq \mathbb{R}^n$. The symmetry group of $E$ is the group $G$ of all symmetries of $E$.

For example, $O(n)$ is the symmetry group of the $(n-1)$-sphere

$\mathbb{S}^{n-1} = \{x \in \mathbb{R}^n : \|x\| = 1\}$.

To see this, we first fix $M \in O(n)$. $M$ is invertible, and so the mapping $x \mapsto Mx$ is injective on $\mathbb{S}^{n-1}$. Moreover, for each $y \in \mathbb{S}^{n-1}$, we see that

whence $x=M^{-1}y$ is an element of $\mathbb{S}^{n-1}$ such that $Mx = y$. It follows that the mapping $x \mapsto Mx$ is a symmetry of $\mathbb{S}^{n-1}$. We conclude that the symmetry group of $\mathbb{S}^{n-1}$ includes $O(n)$.

To show the reverse inclusion, we fix a symmetry $f$ of $\mathbb{S}^{n-1}$ and show that $f(x) = Mx$ for some $M \in O(n)$.  It suffices to show that $f(0) = 0$, for then the desired result follows from Theorem 2.8.1.

To this end, we suppose for a contradiction that $f(0) \neq 0$ and let $x = \|f(0)\|^{-1} M^{-1} f(0)$. Since

we see that $x \in \mathbb{S}^{n-1}$. Now, Theorem 2.8.1 implies that

which contradicts the assumption that $f$ is a symmetry of $\mathbb{S}^{n-1}$.  It now follows that $O(n)$ is the symmetry group of $\mathbb{S}^{n-1}$. $\square$

Example 2.9 (Dihedral groups). We have already discussed dihedral groups as abstract groups in my blog post on generators and relations. Here we give an alternate definition that brings forth the geometric nature of the groups. Consider $O(2)$, the orthogonal group of degree 2. We have seen in Example 2.8 that $O(2)$ is the symmetry group of the unit circle $\mathbb{S}^1$. The typical elements of $O(2)$ are the rotation matrices

and the reflection matrices

Note that $T_\theta = R_\theta T_0$. Since

$T_0$ represents reflection with respect to the $x$ axis. Consequently, $T_\theta$ can be thought of as reflection with respect to an appropriate axis of reflection determined by the rotation matrix $R_\theta$.

An important fact about rotation matrices and reflection matrices in $O(2)$ is as follows; see, for example, Section 2.1 of Grove & Benson, Finite Reflection Groups for a proof.

Theorem 2.9.1. $O(2)$ is generated by $\{R_\theta : \theta \in \mathbb{R}\} \cup \{T_0\}$.

We now define the dihedral group of degree $n$, denoted by $D_n$, to be the subgroup of $O(2)$ generated by $\rho = R_{2\pi/n}$ and $\tau = T_0$. Every element of $D_n$ is a symmetry of the regular $n$-gon $G^n$ whose vertices are precisely the elements of the following set:

By Theorem 2.9.1, every symmetry of $G^n$ is a finite product of rotation matrices and $\tau$. Since the symmetry group of $G^n$ cannot contain rotation matrices other than $\rho,\ldots,\rho^{n-1}$, we see that every symmetry of $G^n$ must be a finite product of $\rho$ and $\tau$. The relation $\rho\tau = \tau\rho^{-1} = \tau \rho^{n-1}$ now implies that all such products can be written in the form $\rho^i \tau^j$ for some $i,j \in \mathbb{N}$. All such isometries are already included in the dihedral group, and so we conclude that $D_n$ is the symmetry group of $G^n$. In particular, it follows from Example 2.8 that $D_n$ acts faithfully on $G^n$. $\square$

Example 2.10 (Actions of dihedral groups on other sets). Recall $D_n$, the dihedral group of degree $n$, from Example 2.9. We think of $D_n$ as a subgroup of $O(2)$, so that each element of $D_n$ is a 2-by-2 matrix. For all $k,n \in \mathbb{N}$, we define $v_k^n = ( \cos \frac{2 k \pi}{n}, \sin \frac{2 k \pi}{n} )$. The set

is a subset of $\mathbb{S}^1 = \{x \in \mathbb{R}^2 : \|x\| = 1\}$, the unit circle in $\mathbb{R}^2$.  Let us define a $D_n$-action on $V^n$ by setting $M \cdot v = Mv$ for each $M \in D_n$ and every $v \in V^n$. To check that this is indeed a group action, we must check that the mapping $v \mapsto Mv$ is a bijection on $V^n$. This can be checked explicitly, using rotation matrices and reflection matrices introduced in Example 2.9. $D_n$ can now be thought of as a subgroup of $S_{V^n}$, whence Example 2.4 implies that $D_n$ acts faithfully on $V^n$.

Let us now assume that $n$ is even, and consider the following collection of two-element sets:

The $D_n$-action on $P^n$ given by setting $M \cdot \{v,w\} = \{Mv,Mw\}$ is likewise a group action. Note, however, that rotation by $\pi$ fixes all elements of $X$. It follows that this $D_n$-action is not faithful. $\square$

Example 2.11 (Regular actions). Recall from the proof of Cayley’s theorem that the left-multiplication map $L_g:G \to G$ with respect to $g \in G$  is the map $L_g(h) = gh$. The mapping $L:G \to S_G$ that sends each $g \in G$ to the corresponding left-multiplication map $L_g$ is a permutation representation, called the left regular representation.

Analogously, we define the right-multiplication map $R_g:G \to G$ with respect to $g \in G$ to be the map $R_g(h) = hg^{-1}$. The map $R:G \to S_G$ that sends each $g \in G$ to the corresponding right-multiplication map $R_g$ is a permutation representation, called the right regular representation.

We note that $G$ acts faithfully on itself in both cases. $\square$

Example 2.12 (Regular action on cosets). Let $G$ be a group and $H$ be a subgroup of $G$. We define $G/H = \{xH : x \in G\}$, the collection of all left cosets. We define a $G$-action on $G/H$ by setting $g \cdot xH = (gx)H$. This is indeed a group action, as $1_G \cdot xH = xH$ and

for all $g_1,g_2,x \in G$. $\square$

Example 2.13 (Conjugation on elements). Combining the left-multiplication map $L_g$ and the right-multiplication maps $R_g$ from Example 2.10, we obtain a very important permutation representation. We define the conjugation representation of a group $G$ to be the map $\varphi:G \to S_G$ defined by the formula $\varphi(g)(x) = gxg^{-1}$. Since $gxg^{-1} = (L_g \circ R_g)(x)$, we see at once that each $\varphi(g)$ is a permutation on $G$. Observe that $1_G \cdot x = 1_Gx = x$, and that

It follows that $\varphi$ is a permutation representation. The resulting action is referred to as the conjugation action. In light of this, we call $gxg^{-1}$ the conjugation of $x$ by $g$.

The conjugation action is not necessarily faithful. For example, if $G$ is abelian, then $gxg^{-1} = x$ regardless of the choice of $g \in G$$\square$

Example 2.14 (Conjugation on subgroups). Let $G$ be a group, and let $\mathcal{S}(G)$ be the collection of all subgroups of $G$. We claim that the conjugate subgroup $gHg^{-1} = \{ghg^{-1} : h \in H\}$ of an arbitrary subgroup $H$ of $G$ is also a subgroup of $G$, regardless of our choice of $g \in G$. To see this, we fix $g \in G$. Since $1_G = g1_Gg^{-1}$, the identity element $1_G$ is in $gHg^{-1}$. If $gh_1g^{-1},gh_2g^{-1} \in gHg^{-1}$, then

as $h_1h_2 \in H$. Therefore, $gHg^{-1}$ is closed under group operation. Finally, if $ghg^{-1} \in gHg^{-1}$, then

as $h^{-1} \in H$. We conclude that $gHg^{-1} \leq G$. Let us now define a mapping $\phi:G \to S_{\mathcal{S}(G)}$ by setting $\phi(g)(H) = gHg^{-1}$ for each $g \in G$ and every $H \in \mathcal{S}(G)$. Observe that $\phi(g)$ is a permutation on $\mathcal{S}(G)$, as $\phi(g^{-1})$ is the inverse of $\phi(g)$. Moreover,

and so $\varphi$ is a permutation representation. Similarly as in Example 2.13, the action induced by $\phi$ is not necessarily faithful. For example, if $G$, then $gHg^{-1} = H$ for all $g \in G$ and $H \in \mathcal{S}(G)$. $\square$

## 3. Counting Principles

Let us now turn to the abstract theory of group actions; some applications of the theory presented in this section can be found in Section 4. The two fundamental objects associated with a group action are as follows:

Definition 3.1. Let $G$ act on $X$, and fix $x \in X$. The $G$-orbit of $x$ is

The stabilizer, or theisotropy subgroup, of $x$ is

$G_x = \{g \in G : gx = x\}$.

We remark that $\mathcal{O}(x)$ is a subset of $X$, and that $G_x$ is a subgroup of $G$. Example 3.2. Pick a group $G$ and let $G$ act on itself by conjugation (see Example 2.13). The orbit of $x \in G$ is the conjugacy class of $x$

and the stabilizer of $x$ is the centralizer

Example 3.3. Pick a group $G$ and let $G$ act on the collection of all subgroups $\mathcal{S}(G)$ by conjugation (see Example 2.14). The orbit of $H \in \mathcal{S}(G)$ is the collection of all conjugates

and the stabilizer of $H$ is the normalizer

Exercise 3.4. Write down orbits and stabilizers for actions introduced in Example 2.9, Example 2.10, Example 2.11, and Example 2.12.

How are the orbit and the stabilizer of a point $x$ related? Observe that the coset $h G_x$ can be rewriten as follows, by making the substitution $z = hg$:

Therefore, there is a one-to-one correspondence between the cosets $h G_x$ and the points of the form $hx$. Now, the collection of all points of the form $hx$ is precisely the stabilizer $\mathcal{O}(x)$. We therefore conjecture that $|\mathcal{O}(x)|$ equals the number of cosets $hG_x$. This turns out to be true:

Theorem 3.5 (Orbit-stabilizer theorem). Let a group $G$ act on a set $X$. For each fixed $x \in X$, we have the formula

Proof. Let $G/G_x$ denote the left coset space of $G_x$. We define a function $f:\mathcal{O}(x) \to G/G_x$ by setting $f(gx) = gG_x$. Let us first check that $f$ is well-defined. Indeed, if $gx = hx$, then $x = g^{-1}hx$, and so $g^{-1}h \in G_x$. Therefore, $g^{-1}hG_x = G_x$, whence $gG_x = hG_x$. It follows that $f$ is well-defined.

We claim that $f$ is bijective. If $gG_x = hG_x$, then $G_x = g^{-1}hG_x$, and so $g^{-1}h \in G_x$. This implies that $x = g^{-1}hx$, whence $gx = hx$. It follows that $f$ is injective. Moreover, $f(gx) = gG_x$ for all $g \in G$, and so $f$ is surjective. We conclude that $|\mathcal{O}(x)| = [G:G_x]$. $\square$

The orbit-stabilizer theorem is a counting principle, in the sense that many combinatorial results about finite groups can be derived as simple corollaries. Below, we establish a few.

Corollary 3.6. Let a finite group $G$ act on $X$. For each $x \in X$, we have the identity $|\mathcal{O}(x)| = \frac{|G|}{|G_x|}$. In particular, $|\mathcal{O}(x)|$ divides $|G|$.

Proof. We apply Lagrange’s theorem and the orbit-stabilizer theorem to obtain the following:

Diving through by $|G_x|$, we obtain the desired result. $\square$

Corollary 3.7. Let $G$ be a finite group. For each $x \in G$, the number of conjugates of $x$ equals $[G:C_G(x)]$.

Proof. Apply the orbit-stabilizer theorem to Example 3.2. $\square$

Corollary 3.8. Let $G$ be a finite group. For each $H \leq G$, the number of conjugate subgroups of $H$ equals $[G:N_G(H)]$.

Proof. Apply the orbit-stabilizer theorem to Example 3.3. $\square$

As alluded to above, each of the three corollaries give rise to useful combinatorial results in group theory. Let us first establish a combinatorial restriction on the size of $X$. For this, we need the following result:

Theorem 3.9 (Orbit decomposition theorem). Let a group $G$ act on a set $X$. The collection $\{\mathcal{O}(x) : x \in X\}$ forms a partition of $X$.

Proof. Suppose that $\mathcal{O}(x) \cap \mathcal{O}(y) \neq \varnothing$ and fix $z \in \mathcal{O}(x) \cap \mathcal{O}(y)$. Find $g_0,h_0 \in G$ such that $g_0 \cdot x = z = h_0 \cdot y$. It now follows that $g \cdot x = (gg_0^{-1} h_0) \cdot y$ for all $g \in G$, whence $\mathcal{O}(x) \subseteq \mathcal{O}(y)$. Similarly, we see that $h \cdot y = (hh_0^{-1} g_0) \cdot x$ for all $h \in H$, and so $\mathcal{O}(y) \subseteq \mathcal{O}(x)$. We conclude that $\mathcal{O}(x) = \mathcal{O}(y)$. $\square$

The following combinatorial principle now is an immediate corollary of what we have established thus far.

Corollary 3.10. Let a finite group $G$ act on a finite set $X$. Let $\Gamma$ be a subset of $X$ consisting of one representative from each set in the partition formed by the orbits (see Theorem 3.9). Then

Proof. The first equality is an immediate consequence of Theorem 3.9. The second equality follows from the orbit-stabilizer theorem (Theorem 3.5). The third equality follows from Corollary 3.6. $\square$ When a group $G$ acts on itself, then Corollary 3.10 becomes a statement about the cardinality of the group itself. Let us carefully translate Corollary 3.10 in the context of conjugation action. For this, we need the following result:

Corollary 3.11. The conjugacy classes of a group $G$ form a partition of $G$.

Proof. From Example 3.2, we know that the orbits of the conjugation action of $G$ on itself are precisely the conjugacy classes. The orbit decomposition theorem (Theorem 3.9) now furnishes the desired result. $\square$

We are now ready to adapt Corollary 3.9 to describe groups acting on themselves by conjugation.

Theorem 3.12 (Conjugacy class equation). Let $G$ be a finite group. Let $\Gamma$ be a subset of $G$ consisting of one representative from each set in the partition formed by the conjugacy classes (see Corollary 3.11). Then

where $Z(G) = \{x \in G : gxg^{-1} = x \mbox{ for all } g \in G\}$ is the center of $G$.

Proof. The first equality follows from Corollary 3.10 and Corollary 3.7. The second equality follows from the fact that the conjugacy class of each element in the center is a singleton set. $\square$

The class equation, combined with Corollary 3.8, allows us to establish powerful converses to Lagrange’s theorem, which we establish in the next section. We conclude by establishing a combinatorial restriction on the size of $\Gamma$.

Theorem 3.12 (Burnside’s lemma). Let a finite group $G$ act on a finite set $X$. Let $\Gamma$ be a subset of $X$ consisting of one representative from each set in the partition formed by the orbits (see Theorem 3.9). Then

where $X^g$ denotes the number of elements of $X$ fixed by the action of $g$.

Proof. We first observe that

as there are $|\mathcal{O}(x)|$-many orbits in the equivalence class of $\mathcal{O}(x)$ furnished by the orbit decomposition theorem (Theorem 3.9). For each $g \in G$ and every $x \in X$, we define

We now invoke the orbit-stabilizer theorem (Corollary 3.6) to deduce that

as was to be shown. $\square$

## 4. Sylow’s theorems

While Lagrange’s theorem imposes restrictions on the orders of subgroups and elements of a group, it does not guarantee the existence of a subgroup or an element of a certain order.

Example 4.1. Consider the alternating group $A_4$, which is of order 12. We show that $A_4$ has no subgroup of order 6. To this end, we shall make use of the following preliminary result:

Lemma 4.2. Every subgroup of a group $G$ of index 2 is normal in $G$.

We assume for now that the lemma holds. Suppose for a contradiction that $H$ is a subgroup of $A_4$ of order 6, which must be a normal subgroup of $A_4$ by the lemma. The conjugacy classes of $A_4$ are as follows:

Since $H$ is normal, $H$ must be a union of conjugacy classes. Since $\{e\}$ must be a subset of $H$, this implies that we ought to be able to write $|H|=6$ as a sum of 1 and the cardinalities of conjugacy classes. This is impossible, and so we conclude that $A_4$ has no subgroup of order 6.

It now suffices to establish the lemma. Let $G$ be a group and let $H$ be a subgroup of index 2. Note that the two left cosets of $H$ are $H$ itself and $g_0H$ for an arbitrary $g_0 \in G \smallsetminus H$. Fix $h \in H$ and $g \in G$. If $g \in H$, then $ghg^{-1} \in H$. We therefore suppose that $g \in G \smallsetminus H$. If $g(hg^{-1}) \in gH$, then $hg^{-1}) \in H$, and so $g^{-1} = h^{-1}(hg^{-1})$. This, in particular, implies that $g \in H$, whence $gH \subseteq H$, a contradiction. It follows that $ghg^{-1} \in H$, as was to be shown.

See Brennan and MacHale, “Variations on a Theme: $A_4$ Definitely Has no Subgroup of Order Six!” for alternate proofs. $\square$

When, then, could we establish a converse to Lagrange’s theorem? The solution is intimately tied to the study of $p$-groups, which we now define.

Definition 4.3. Let $p$ be a prime number. A group $G$ is said to be a $p$-group if each element of $G$ is of order $p^n$ for some integer $n$ that depends on the element.

The problem of finding a nontrivial $p$-subgroup of a group $G$ is equivalent to finding an element of $G$ of order $p$. Indeed, if $g \in G$ is an element of order $p$, then the cyclic subgroup $\langle g \rangle$ is a nontrivial $p$-subgroup of $G$. Conversely, if $H$ is a nontrivial $p$-subgroup of $H$, then we can find a non-identity element $g \in H$ of order $p^n$. In this case, $g^{p^{n-1}}$ is an element of order $p$.

Now, in order for a finite group $G$ to have a $p$-subgroup, the order of $G$ must be divisible by $p$. The following result shows that the converse holds.

Theorem 4.4 (Cauchy, 1845). Every finite group $G$ whose order is divisible by a prime number $p$ contains an element of order $p$.

Proof. We first prove the theorem for abelian groups. We shall proceed by induction on $m = |G| / p$. If $m = 1$, then $G$ is a group of prime order, hence cyclic; it follows that $G$ contains an element of order $p$.

Fix $m > 1$ and suppose that every abelian group of order $pn$ for each $% $ contains an element of order $p$. Let $g$ be a non-identity element of $G$. If $|g|$ is divisible by $p$, then some power of $g$ is of order $p$. If not, then $G / \langle g \rangle$ is an abelian group of order $pn$ for some $% $, whence it contains an element $h \langle g \rangle$ of order $p$. Now, the canonical projection $\pi:G \to G/\angle x \angle$ is a homomorphism, and so the order of $h$ must be divisible by $p$, order of $\pi(h) = h\langle x \rangle$. It follows that some power of $h$ must be $p$. By induction, every Aelian group whose order is divisible by $p$ contains an element of order $p$.

We now lift the assumption that $G$ is abelian. Once again, we proceed by induction on $m = |G| / p$. The $m=1$ case is established above. Fix $m > 1$ and suppose that every group of order $pn$ for some $1 \leq n > m$ contains an element of order $p$. Let $g \in G$. If $g \notin Z(G)$, then $[G:C_G(g)]$, the number of conjugates of $g$, is larger than $1$, and so $% $. By induction, $C_G(g)$ must contain an element of order $p$, whence so must $G$. We may therefore assume that $p \nmid |C_G(g)|$ whenever $g \notin Z(G)$. Lagrange’s theorem now implies that $p \mid [G:C_G(g)]$ whenever $g \notin Z(G)$. We now invoke the conjugacy class equation (Theorem 3.12):

Since $|G|$ and all $[G:C_G(g)]$ are divisible by $g$, it follows that $Z(G)$ is divisible by $p$. But $Z(G)$ is abelian, and so $Z(G)$ must contain an element of order $p$. It follows that $G$ contains an element of order $p$. $\square$

Cauchy’s theorem provides a convenient equivalence characterization of finite $p$-groups:

Corollary 4.5. A finite group $G$ is a $p$-group if and only if the order of $G$ is a power of $p$.

Proof. If $|G| = p^n$, then Lagrange’s theorem implies that $G$ is a $p$-group. If $|G|$ is divisible by some other prime $q$, then Cauchy’s theorem implies that $G$ contains an element of order $q$, and so $G$ is not a $p$-group. $\square$

The main result of this section is a substantial strengthening of Cauchy’s theorem, due to Sylow.

Definition 4.6. A Sylow $p$-subgroup of a group $G$ is a maximal $p$-subgroup. In other words, if $P$ is a Sylow $p$-subgroup of $G$, then no other $p$-subgroup of $G$ properly contains $P$.

Sylow’s theorems concern the existence and various other properties of Sylow $p$-subgroups of finite groups whose orders are divisible by $p$.

Theorem 4.7 (Sylow, 1872). Let $G$ be a finite group of order $p^nm$, where $p$ is a prime number and $m$ is an integer relatively prime to $p$. The following holds:

First Sylow theorem. $G$ contains a Sylow $p$-subgroup of order $p^n$.

Second Sylow theorem. All Sylow $p$-subgroups are conjugates of one another. In other words, if $P$ and $Q$ are Sylow $p$-subgroups of $G$, then there exists a $g \in G$ such that $gPg^{-1} = Q$. This, in particular, implies that all Sylow $p$-subgroups are isomorphic to one another.

Third Sylow theorem. Let $n_p$ denote the number of Sylow $p$-subgroups of $G$. We have the following combinatorial restrictions on $n_p$:

• $n_p$ divides $\vert G \vert$;
• $n_p \equiv 1 \mbox{ mod } p$.

Before we prove the Sylow theorems, we make a few observations. Firstly, the second Sylow theorem, in conjunction with the third Sylow theorem and Corollary 4.5, implies that $G$ has a normal subgroup of order $p^n$ if and only if $n_p = 1$. The third Sylow theorem provides two combinatorial restrictions on $n_p$, which can be exploited in various clever ways to determine $n_p$. Secondly, the second Sylow theorem, in conjunction with the first Sylow theorem, implies that every Sylow $p$-subgroup of $G$ is of order $p^n$. In other words, maximality in subgroup inclusion automatically implies maximality in order. Thirdly, the maximality observation, combined with the second Sylow theorem, implies the following structure theorem:

Corollary 4.8 (Frattini’s argument). Let $G$ be a finite group and $K$ a normal subgroup of $G$. If $P$ is a Sylow $p$-subgroup of $K$ for some prime $p$, then

Proof of Corollary. Fix $g \in G$. Since $K$ is normal in $G$, we see that $K = gKg^{-1}$. Now, $gPg^{-1} \leq gKg^{-1} = K$, and so $gPg^{-1}$ is a Sylow $p$-subgroup of $K$ by maximality. The second Sylow theorem therefore furnishes $k \in K$ such that $kPk^{-1} = gPg^{-1}$, so that $P = (k^{-1}g)P(k^{-1})g)^{-1}$. It follows that $k^{-1}g \in N_G(P)$. Since $g \in G$ was arbitrary, the factorization $g = k(k^{-1}g)$ establishes the desired identity. $\square$

Let us now turn to the proof of the Sylow theorems, which are taken from Rotman, An Introduction to the Theory of Groups. As in the statement of the theorem, we assume that $|G| = p^n m$. We shall make use of the following lemma:

Lemma 4.9. Let $P$ be a Sylow $p$-subgroup of a finite group $G$. The following holds: (1) $|N_G(P)/P|$ is relatively prime to $p$; (2) If $g \in G$ is of order $p^k$ for some $k$ and $gPg^{-1} = P$, then $g \in P$.

Proof of lemma. (1) If $|N_G(P)/P|$ is divisible by $p$, then Cauchy’s theorem (Theorem 4.4) furnishes an element $gP$ of order $p$. Therefore, $H = \langle gP \rangle$ is a subgroup of $N_G(P)/P$ of order $p$.

Let $\pi:N_G(P) \to N_G(P)/P$ be the canonical projection map and set $H^* = \pi^{-1}(H)$. If $g,h \in H^*$, then $\pi(gh^{-1}) = \pi(g)\pi(h)^{-1} \in H$, and so $gh^{-1} \in H^*$. It follows that $H^* \leq N_G(P)$. Moreover, the first isomorphism theorem, in conjunction with Lagrange’s theorem, implies that

whence by Corollary 4.5 $H^*$ is a $p$-subgroup of $G$ containing $P$. This contradicts the maximality of $P$, and so we conclude that $|N_G(P)/P|$ fails to be divisible by $p$.

(2) Note that $g' = g^{p^{k-1}}$ is of order $p$.  Since $N_G(P) \leq G$, we see that $g' \in N_G(P)$, and so $g'P \in N_G(P)/P$. If $g' \notin P$, then $|g'| = p$, and so $|gP| = p'$. By Lagrange’s theorem, $|N_G(P)/P|$ must be divisible by $p$, and this contradicts (1). $\square$

Proof of the Second and Third Sylow theorems. Let $X = \{P_1,\ldots,P_k\}$ denote the collection of all conjugates of $P$ in $G$, where $P_1 = P$. Taking a cue from Example 2.14, we define the conjugation action of $G$ on $X$ as follows: $\varphi(g) = (gPg^{-1}$ for each $g \in G$. It is routine to check that $\varphi$ is a group homomorphism from $G$ to $S_X$: a simple modification of the argument in Example 2.14 suffices.

Let $Q$ be an arbitrary Sylow $p$-subgroup of $G$. $Q$ is a subgroup of $G$, and so $Q$ acts on $X$ by conjugation (Example 2.5). By the orbit-stabilizer theorem (Corollary 3.6), the size of each orbit divides $|Q|$, whence by Corollary 4.5 each $|\mathcal{O}(P_i)|$ is a power of $p$.

If $|\mathcal{O}(P_i)| = 1$ for some $1 \leq i \leq k$, then $gP_ig^{-1} = P_i$ for all $g \in Q$. In this case, we must have $Q \leq P_i$ by Lemma 4.9(2). Since $P_i$ is a $p$-group by Corollary 4.5, the maximality of $Q$ implies that $P_i \leq Q$, and so $Q = P_i$. Now, if we take $Q$ to be $P$, then we must have $P \neq P_i$ for all $2 \leq i \leq k$, and so $|\mathcal{O}(P_i)| = p^{t_i}$ for some $t_i \geq 1$ whenever $2 \leq i \leq k$. By the orbit decomposition theorem (Theorem 3.9), $X$ is the disjoint union of distinct orbits, whence $|X|$ equals 1 plus a sum of powers of $p$. In other words, $|X| \equiv 1 \mbox{ mod } p$.

Now, we assume for a contradiction that $Q$ is a Sylow $p$-subgroup of $G$ not included in $X$. We have shown above that $P_i = Q$ whenever $|\mathcal{O}(P_i)| = 1$. Therefore, we must have $|\mathcal{O}(P_i)| = p^{t_i}$ for some $t_i \geq 1$ for all $1 \leq i \leq k$. This, in particular, implies that $|X| \equiv 0 \mbox{ mod } p$, contradicting the 1-mod-p relation. Therefore, all Sylow $p$-subgroups are conjugates of one another. This, in particular, implies that $|X| = n_p$, whence $n_p \equiv 1 \mbox{ mod } p$.

We now invoke the orbit-stabilizer theorem to the conjugation action of $G$ on the set of all subgroups of $G$ to deduce that $[G:N_G(P)] = |X| = n_p$ (Corollary 3.8). Lagrange’s theorem implies that $n_p$ divides $|G|$. This completes the proof of the second and third Sylow theorems. $\square$

Proof of the first Sylow theorem. Cauchy’s theorem (Theorem 4.4) implies that $G$ has a subgroup of order $p$, which is a $p$-subgroup of $G$. This, in particular, shows that $G$ has a Sylow $p$-subgroup $P$. Indeed, if $Q$ is a non-Sylow $p$-subgroup of $G$, then, by definition, there exists a $p$-subgroup $Q'$ of $G$ that properly contains $Q$. Since $|G|$ is finite, this process must terminate in finitely many steps, and we obtain a Sylow $p$-subgroup $P$.

We claim that $[G:P]$ is relatively prime to $p$. Since Corollary 4.5 implies that $|P|$ must be a power of $p$, the claim, in conjunction with Lagrange’s theorem, implies that $[G:P] = m$ and $|P| = p^n$, as was to be shown. To prove the claim, we shall make use of the following lemma:

Lemma 4.9. If $C$ is a finite group and $A \leq B \leq C$, then $[C:A] = [C:B][B:A]$.

Proof of lemma. This is an immediate consequence of the fact that cosets of a subgroup partition a group into equal-sized equivalence classes. $\square$ We let $C = G$, $B = N_G(P)$, and $A = P$ and invoke the lemma to show that both $[G:N_G(P)]$ and $[N_G(P):P]$ are relatively prime to $p$. We apply the orbit-stabilizer theorem to the conjugation action of $G$ on the set of all subgroups of $G$ to deduce that $[G:N_G(P)] = n_p$ (Corollary 3.8). The third Sylow theorem implies that that $n_p \equiv 1 \mbox{ mod } p$, whence $[G:N_G(P)]$ is relatively prime to $p$.

As for $[N_G(P):P]$, we first note that $P \triangleleft N_G(P)$, so that $[N_G(P):P] = |N_G(P)/P|.$ It follows from Lemma 4.9(1) that $|N_G(P)/P|$ is relatively prime to $p$. This concludes the proof of the first Sylow theorem. $\square$

The Sylow theorems play an important role in the classification of finite groups. We give examples of classification results in the next blog post, wherein we study semidirect products and the extension problem.

Tags:

Categories:

Updated: